History log of /netbsd-current/sys/dev/pci/if_vioif.c
Revision Date Author Comments
# 1.111 21-Mar-2024 isaki

Ensure that the number of bus_dma segments doesn't exceed VirtIO queue size.
This fixes reproducible panics when the host's VirtIO queue size is too small,
less than or equal to VIRTIO_NET_TX_MAXNSEGS(=16).
PR kern/58049.
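
As a rough illustration of the constraint described above, the sketch below clamps a per-packet segment limit to the virtqueue size. The helper name tx_max_segs and the choice to reserve one descriptor for the packet header are assumptions made for this example (inferred from the "less than or equal to" wording); this is not the code from the commit.

```c
#include <assert.h>

#define VIRTIO_NET_TX_MAXNSEGS 16	/* value quoted in the log message */

/*
 * Hypothetical helper: largest number of payload segments that fits in a
 * virtqueue of vq_size descriptors, assuming one descriptor is reserved
 * for the packet header.
 */
static unsigned int
tx_max_segs(unsigned int vq_size)
{
	unsigned int avail;

	assert(vq_size >= 2);	/* room for the header plus one segment */
	avail = vq_size - 1;
	return avail < VIRTIO_NET_TX_MAXNSEGS ? avail : VIRTIO_NET_TX_MAXNSEGS;
}
```

Under these assumptions a queue of 16 entries allows at most 15 segments, while a queue of 256 entries keeps the usual limit of 16.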


# 1.110 09-Feb-2024 andvar

fix spelling mistakes, mainly in comments and log messages.


Revision tags: thorpej-ifq-base thorpej-altq-separation-base
# 1.109 13-May-2023 andvar

fix typos in comments.


# 1.108 11-May-2023 yamaguchi

Fix missing check for netq->netq_stopping in vioif_rx_intr()

Reported-by: syzbot+5120b7a1f97a3f5ca052@syzkaller.appspotmail.com
https://syzkaller.appspot.com/bug?id=243cf4115808e49774a49294f63200770399660b


# 1.107 27-Mar-2023 nakayama

Use PRIuBUSSIZE to print bus_size_t variables.


# 1.106 24-Mar-2023 yamaguchi

vioif(4): fix wrong memory allocation size


# 1.105 23-Mar-2023 yamaguchi

vioif(4): clear flags when configuration fails


# 1.104 23-Mar-2023 yamaguchi

Added functions to set interrupt handler and index into virtqueue


# 1.103 23-Mar-2023 yamaguchi

Set virtqueues in virtio_child_attach_finish

The number of virtqueues may change in some VirtIO devices
(e.g. vioif(4)), and it is fixed only after feature negotiation,
so the configuration is moved into that function.


# 1.102 23-Mar-2023 yamaguchi

vioif(4): divide IFF_OACTIVE into per-queue


# 1.101 23-Mar-2023 yamaguchi

vioif(4): reorganize functions

This change only moves and renames functions;
there is no functional change.


# 1.100 23-Mar-2023 yamaguchi

vioif(4): rename sc_hdr_segs to sc_segs


# 1.99 23-Mar-2023 yamaguchi

vioif(4): added functions to manipulate network queues


# 1.98 23-Mar-2023 yamaguchi

vioif(4): added new data structure for network queues

and moved the same parameters in vioif_txqueue and
vioif_rxqueue into the new structure


# 1.97 23-Mar-2023 yamaguchi

vioif(4): added __predict_false to error check


# 1.96 23-Mar-2023 yamaguchi

vioif(4): prepare slot before dequeuing


# 1.95 23-Mar-2023 yamaguchi

vioif(4): added a structure to manage variables for packet processings


# 1.94 23-Mar-2023 yamaguchi

vioif(4): increase output error counter


# 1.93 23-Mar-2023 yamaguchi

vioif(4): merge drain into clear of queue


# 1.92 23-Mar-2023 yamaguchi

vioif(4): divide interrupt handler for receiving
into dequeuing and preparing of buffers


# 1.91 23-Mar-2023 yamaguchi

vioif(4): drain receive buffer on stopping the device
to remove branch in vioif_populate_rx_mbufs_locked()


# 1.90 23-Mar-2023 yamaguchi

vioif(4): fix missing virtio_enqueue_abort for error handling


# 1.89 23-Mar-2023 yamaguchi

vioif(4): added event counters related to receive processing


# 1.88 23-Mar-2023 yamaguchi

vioif(4): adjust receive buffer to ETHER_ALIGN


# 1.87 23-Mar-2023 yamaguchi

vioif(4): stop interrupt before schedule handler


# 1.86 23-Mar-2023 yamaguchi

vioif(4): rename {txq,rxq}_active to {txq,rxq}_running_handle


# 1.85 23-Mar-2023 yamaguchi

vioif(4): use device reset to stop interrupt completely


# 1.84 23-Mar-2023 yamaguchi

vioif(4): access to txq_active and rxq_active with lock held


# 1.83 23-Mar-2023 yamaguchi

vioif(4): remove unnecessary lock release

if_percpuq_enqueue() can be called with rxq->rxq_lock held because the queue is per-CPU.


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.82 12-Sep-2022 knakahara

branches: 1.82.4;
Unify vioif's link status with if_link_state. Implemented by yamaguchi@n.o.

Let vioif(4) know LINK_STATE_UNKNOWN.


# 1.81 04-May-2022 simonb

White space KNF nits.


# 1.80 16-Apr-2022 andvar

fix various typos in comments and log messages.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers
which is rather inconvenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquisition

The lock was held to wait for completion of interrupt handlers,
but they are already stopped by the rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device reset fails


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.
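
The indexing issue can be sketched as follows. The structure layout and the names full_hdr and hdr_array_size are hypothetical stand-ins for this example: array elements are addressed with the full structure's stride, so the allocation has to use that stride even when a device only uses a shorter prefix of each header.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical full-sized header. A device may use only a prefix of it
 * (for example, everything except the last field).
 */
struct full_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t num_buffers;	/* not used by every device */
};

/*
 * Bytes needed for n headers that are addressed as hdrs[i].
 * used_len (the prefix the device actually uses) is deliberately ignored:
 * sizing by it would leave hdrs[n - 1] partly outside the allocation.
 */
static size_t
hdr_array_size(size_t n, size_t used_len)
{
	(void)used_len;
	return n * sizeof(struct full_hdr);
}
```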


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted where needed and
tested with both legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments were also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachments work on aarch64-eb.

* virtio on sparc64 attaches but is not functioning, though this is not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!
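
A minimal userland sketch of the lifetime problem, assuming a counter record that stores the name pointer it is given rather than copying the string. The types and functions here (struct counter, counter_attach, queue_counter_init) and the name format are hypothetical and only mimic the behavior described above; they are not the evcnt(9) API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter record that keeps the caller's name pointer. */
struct counter {
	const char *group;	/* stored by reference, not copied */
	unsigned long count;
};

/* Stand-in for an attach routine: remembers the pointer it is given. */
static void
counter_attach(struct counter *c, const char *group)
{
	c->group = group;
	c->count = 0;
}

/*
 * Build the group name in heap storage so that it outlives this function.
 * A local char array here would leave c->group dangling after return.
 * Returns 0 on success, -1 on failure; the caller frees *namep when the
 * counter is no longer used.
 */
static int
queue_counter_init(struct counter *c, char **namep, const char *dev, int qid)
{
	int len = snprintf(NULL, 0, "%s-RX%d", dev, qid);
	char *name;

	if (len < 0 || (name = malloc((size_t)len + 1)) == NULL)
		return -1;
	snprintf(name, (size_t)len + 1, "%s-RX%d", dev, qid);
	counter_attach(c, name);
	*namep = name;
	return 0;
}
```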


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to implement softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Factor the duplicated code related to ctrl vq into functions in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misinterpreted in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to only some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers so that they
do not access the data in interrupt context.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/when if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
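
The "double free" pattern can be illustrated with a small userland model. Everything in it (struct pool, slot_get, slot_reserve, slot_abort, enqueue) is a hypothetical stand-in for the vq_entry free list and the virtio calls named above; it only shows why the caller must not release a slot that the failing reserve step has already released.

```c
#include <stdbool.h>

#define NSLOTS 4

/* Hypothetical free-slot pool standing in for the vq_entry free list. */
struct pool {
	bool in_use[NSLOTS];
	int nfree;
};

static void
pool_init(struct pool *p)
{
	for (int i = 0; i < NSLOTS; i++)
		p->in_use[i] = false;
	p->nfree = NSLOTS;
}

/* Take a slot; returns its index, or -1 if none is free. */
static int
slot_get(struct pool *p)
{
	for (int i = 0; i < NSLOTS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = true;
			p->nfree--;
			return i;
		}
	}
	return -1;
}

/* Return a slot to the pool. Calling this twice for one slot corrupts nfree. */
static void
slot_abort(struct pool *p, int slot)
{
	p->in_use[slot] = false;
	p->nfree++;
}

/*
 * Reserve step that cleans up after itself: on failure it has already
 * released the slot, so the caller must not release it again.
 */
static int
slot_reserve(struct pool *p, int slot, int nsegs, int max_segs)
{
	if (nsegs > max_segs) {
		slot_abort(p, slot);
		return -1;
	}
	return 0;
}

/* Caller following the corrected pattern: no abort after a failed reserve. */
static int
enqueue(struct pool *p, int nsegs, int max_segs)
{
	int slot = slot_get(p);

	if (slot < 0)
		return -1;
	if (slot_reserve(p, slot, nsegs, max_segs) != 0)
		return -1;	/* slot already released by slot_reserve() */
	return slot;
}
```

If enqueue() also called slot_abort() on the failure path, nfree would exceed NSLOTS after the first failed reserve, which is the kind of free-list corruption the log message describes.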


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API sets (or resets) the receiving interface of an mbuf.
These functions are the counterpart of m_get_rcvif, which will come in
another commit; they hide the internals of rcvif handling and reduce the
diff of the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.110 09-Feb-2024 andvar

fix spelling mistakes, mainly in comments and log messages.


Revision tags: thorpej-ifq-base thorpej-altq-separation-base
# 1.109 13-May-2023 andvar

fix typos in comments.


# 1.108 11-May-2023 yamaguchi

Fix missing check for netq->netq_stopping in vioif_rx_intr()

Reported-by: syzbot+5120b7a1f97a3f5ca052@syzkaller.appspotmail.com
https://syzkaller.appspot.com/bug?id=243cf4115808e49774a49294f63200770399660b


# 1.107 27-Mar-2023 nakayama

Use PRIuBUSSIZE to print bus_size_t variables.


# 1.106 24-Mar-2023 yamaguchi

vioif(4): fix wrong memory allocation size


# 1.105 23-Mar-2023 yamaguchi

vioif(4): clear flags when configure is failed


# 1.104 23-Mar-2023 yamaguchi

Added functions to set interrupt handler and index into virtqueue


# 1.103 23-Mar-2023 yamaguchi

Set virtqueues in virtio_child_attach_finish

The number of virtqueue maybe change in a part of VirtIO devices
(e.g. vioif(4)). And it is fixed after negotiation of features.
So the configuration is moved into the function.


# 1.102 23-Mar-2023 yamaguchi

vioif(4): divide IFF_OACTIVE into per-queue


# 1.101 23-Mar-2023 yamaguchi

vioif(4): reorganize functions

iThis change is move of function and rename,
and this is no functional change.


# 1.100 23-Mar-2023 yamaguchi

vioif(4): rename sc_hdr_segs to sc_segs


# 1.99 23-Mar-2023 yamaguchi

vioif(4): added functions to manipulate network queues


# 1.98 23-Mar-2023 yamaguchi

vioif(4): added new data structure for network queues

and moved the same parameters in vioif_txqueue and
vioif_rxqueue into the new structure


# 1.97 23-Mar-2023 yamaguchi

vioif(4): added __predct_false to error check


# 1.96 23-Mar-2023 yamaguchi

vioif(4): prepare slot before dequeuing


# 1.95 23-Mar-2023 yamaguchi

vioif(4): added a structure to manage variables for packet processings


# 1.94 23-Mar-2023 yamaguchi

vioif(4): increase output error counter


# 1.93 23-Mar-2023 yamaguchi

vioif(4): merge drain into clear of queue


# 1.92 23-Mar-2023 yamaguchi

vioif(4): divide interrupt handler for receiving
into dequeuing and preparing of buffers


# 1.91 23-Mar-2023 yamaguchi

vioif(4): drain receive buffer on stopping the device
to remove branch in vioif_populate_rx_mbufs_locked()


# 1.90 23-Mar-2023 yamaguchi

vioif(4): fix missing virtio_enqueue_abort for error handling


# 1.89 23-Mar-2023 yamaguchi

vioif(4): added event counters related to receive processing


# 1.88 23-Mar-2023 yamaguchi

vioif(4): adjust receive buffer to ETHER_ALIGN


# 1.87 23-Mar-2023 yamaguchi

vioif(4): stop interrupt before schedule handler


# 1.86 23-Mar-2023 yamaguchi

vioif(4): rename {txq,rxq}_active to {txq,rxq}_running_handle


# 1.85 23-Mar-2023 yamaguchi

vioif(4): use device reset to stop interrupt completely


# 1.84 23-Mar-2023 yamaguchi

vioif(4): access to txq_active and rxq_active with lock held


# 1.83 23-Mar-2023 yamaguchi

vioif(4): remove unnecessary lock release

if_percpuq_enqueue() can call with rxq->rxq_lock held because of per-cpu.


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.82 12-Sep-2022 knakahara

branches: 1.82.4;
Uniform vioif's link status to if_link_state. Implemented by yamaguchi@n.o.

Let vioif(4) know LINK_STATE_UNKNOWN.


# 1.81 04-May-2022 simonb

White space KNF nits.


# 1.80 16-Apr-2022 andvar

fix various typos in comments and log messages.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers
which is rather incovenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was hold to wait for completion of interrupt handlers.
But, they are already stopped by rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device resetting is failed


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments trough PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is it not functioning though not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that enqueued before another transmission
releases the lock after finish of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() can not
sends them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option is introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

-No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
setting it causes the interface to ignore any further TX requests if the
failure happens when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data outside interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- Moved bpf_mtap run in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
The functions are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grained locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.109 13-May-2023 andvar

fix typos in comments.


# 1.108 11-May-2023 yamaguchi

Fix missing check for netq->netq_stopping in vioif_rx_intr()

Reported-by: syzbot+5120b7a1f97a3f5ca052@syzkaller.appspotmail.com
https://syzkaller.appspot.com/bug?id=243cf4115808e49774a49294f63200770399660b


# 1.107 27-Mar-2023 nakayama

Use PRIuBUSSIZE to print bus_size_t variables.


# 1.106 24-Mar-2023 yamaguchi

vioif(4): fix wrong memory allocation size


# 1.105 23-Mar-2023 yamaguchi

vioif(4): clear flags when configure is failed


# 1.104 23-Mar-2023 yamaguchi

Added functions to set interrupt handler and index into virtqueue


# 1.103 23-Mar-2023 yamaguchi

Set virtqueues in virtio_child_attach_finish

The number of virtqueue maybe change in a part of VirtIO devices
(e.g. vioif(4)). And it is fixed after negotiation of features.
So the configuration is moved into the function.


# 1.102 23-Mar-2023 yamaguchi

vioif(4): divide IFF_OACTIVE into per-queue


# 1.101 23-Mar-2023 yamaguchi

vioif(4): reorganize functions

iThis change is move of function and rename,
and this is no functional change.


# 1.100 23-Mar-2023 yamaguchi

vioif(4): rename sc_hdr_segs to sc_segs


# 1.99 23-Mar-2023 yamaguchi

vioif(4): added functions to manipulate network queues


# 1.98 23-Mar-2023 yamaguchi

vioif(4): added new data structure for network queues

and moved the same parameters in vioif_txqueue and
vioif_rxqueue into the new structure


# 1.97 23-Mar-2023 yamaguchi

vioif(4): added __predct_false to error check


# 1.96 23-Mar-2023 yamaguchi

vioif(4): prepare slot before dequeuing


# 1.95 23-Mar-2023 yamaguchi

vioif(4): added a structure to manage variables for packet processings


# 1.94 23-Mar-2023 yamaguchi

vioif(4): increase output error counter


# 1.93 23-Mar-2023 yamaguchi

vioif(4): merge drain into clear of queue


# 1.92 23-Mar-2023 yamaguchi

vioif(4): divide interrupt handler for receiving
into dequeuing and preparing of buffers


# 1.91 23-Mar-2023 yamaguchi

vioif(4): drain receive buffer on stopping the device
to remove branch in vioif_populate_rx_mbufs_locked()


# 1.90 23-Mar-2023 yamaguchi

vioif(4): fix missing virtio_enqueue_abort for error handling


# 1.89 23-Mar-2023 yamaguchi

vioif(4): added event counters related to receive processing


# 1.88 23-Mar-2023 yamaguchi

vioif(4): adjust receive buffer to ETHER_ALIGN


# 1.87 23-Mar-2023 yamaguchi

vioif(4): stop interrupt before schedule handler


# 1.86 23-Mar-2023 yamaguchi

vioif(4): rename {txq,rxq}_active to {txq,rxq}_running_handle


# 1.85 23-Mar-2023 yamaguchi

vioif(4): use device reset to stop interrupt completely


# 1.84 23-Mar-2023 yamaguchi

vioif(4): access to txq_active and rxq_active with lock held


# 1.83 23-Mar-2023 yamaguchi

vioif(4): remove unnecessary lock release

if_percpuq_enqueue() can call with rxq->rxq_lock held because of per-cpu.


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.82 12-Sep-2022 knakahara

branches: 1.82.4;
Uniform vioif's link status to if_link_state. Implemented by yamaguchi@n.o.

Let vioif(4) know LINK_STATE_UNKNOWN.


# 1.81 04-May-2022 simonb

White space KNF nits.


# 1.80 16-Apr-2022 andvar

fix various typos in comments and log messages.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers
which is rather incovenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was hold to wait for completion of interrupt handlers.
But, they are already stopped by rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device resetting is failed


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments trough PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is it not functioning though not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that enqueued before another transmission
releases the lock after finish of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() can not
sends them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option is introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

-No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is adaptive one for ease of implementation but some
drivers access the data in interrupt context so we cannot apply the mutex
to every drivers as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional works (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up out free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.82 12-Sep-2022 knakahara

Uniform vioif's link status to if_link_state. Implemented by yamaguchi@n.o.

Let vioif(4) know LINK_STATE_UNKNOWN.


# 1.81 04-May-2022 simonb

White space KNF nits.


# 1.80 16-Apr-2022 andvar

fix various typos in comments and log messages.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers
which is rather incovenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was hold to wait for completion of interrupt handlers.
But, they are already stopped by rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device resetting is failed


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments trough PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is it not functioning though not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that enqueued before another transmission
releases the lock after finish of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() can not
sends them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option is introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
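
The "double free" in 1.27 can be illustrated with a small userland model.
Everything here is a hypothetical sketch: struct vq, vq_prep(), vq_reserve(),
vq_abort(), send_buggy() and send_fixed() are invented names that only mimic
the contract described above, where a failed reserve has already given the
slot back.

```c
#include <stdbool.h>

#define NSLOTS 4

/* Hypothetical model of a virtqueue free list; not the real virtio(4) code. */
struct vq {
	int freelist[2 * NSLOTS];	/* oversized so the buggy path cannot overflow */
	int nfree;			/* number of entries on the free list */
	int ndesc;			/* descriptors still available */
};

static void
vq_init(struct vq *vq, int ndesc)
{
	for (int i = 0; i < NSLOTS; i++)
		vq->freelist[i] = i;
	vq->nfree = NSLOTS;
	vq->ndesc = ndesc;
}

/* Take a slot off the free list, or return -1 if none is left. */
static int
vq_prep(struct vq *vq)
{
	if (vq->nfree == 0)
		return -1;
	return vq->freelist[--vq->nfree];
}

/* Put a slot back on the free list. */
static void
vq_abort(struct vq *vq, int slot)
{
	vq->freelist[vq->nfree++] = slot;
}

/* On failure the slot is released here, so the caller must not do it again. */
static int
vq_reserve(struct vq *vq, int slot, int nsegs)
{
	if (nsegs > vq->ndesc) {
		vq_abort(vq, slot);
		return -1;
	}
	vq->ndesc -= nsegs;
	return 0;
}

/* Buggy caller: releases the slot a second time after a failed reserve. */
static bool
send_buggy(struct vq *vq, int nsegs)
{
	int slot = vq_prep(vq);

	if (slot < 0)
		return false;
	if (vq_reserve(vq, slot, nsegs) != 0) {
		vq_abort(vq, slot);	/* same slot is now on the list twice */
		return false;
	}
	return true;
}

/* Fixed caller: relies on vq_reserve() having cleaned up already. */
static bool
send_fixed(struct vq *vq, int nsegs)
{
	int slot = vq_prep(vq);

	if (slot < 0)
		return false;
	return vq_reserve(vq, slot, nsegs) == 0;
}
```

After a failed send_buggy() the model's free list holds one more entry than
there are slots, so two later requests can be handed the same slot.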


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.81 04-May-2022 simonb

White space KNF nits.


# 1.80 16-Apr-2022 andvar

fix various typos in comments and log messages.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers
which is rather inconvenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was held to wait for completion of interrupt handlers.
But, they are already stopped by rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device reset fails


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but it is not functioning, though this is not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies while configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.80 16-Apr-2022 andvar

fix various typos in comments and log messages.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers,
which is rather inconvenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was held to wait for completion of interrupt handlers,
but they are already stopped by the rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device reset fails


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted where needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is not functioning, though this is not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o
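
The difference between the two locking strategies can be sketched in userland
C with POSIX mutexes standing in for the kernel mutex. This is only a model
under that assumption, not the driver's code: struct txq and the functions
deferred_transmit_try and deferred_transmit_wait are hypothetical names. The
try-lock variant gives up when another sender holds the lock, so packets that
were queued in the meantime stay queued; the blocking variant always drains
them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct txq {
	pthread_mutex_t lock;
	int pending;	/* packets waiting in the software queue */
	int sent;	/* packets handed to the (modelled) device */
};

static void
txq_init(struct txq *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->pending = 0;
	q->sent = 0;
}

/* Send everything that is queued; the caller must hold q->lock. */
static void
txq_drain_locked(struct txq *q)
{
	assert(q->pending >= 0);
	q->sent += q->pending;
	q->pending = 0;
}

/* Try-lock variant: returns false and sends nothing if the lock is busy. */
static bool
deferred_transmit_try(struct txq *q)
{
	if (pthread_mutex_trylock(&q->lock) != 0)
		return false;
	txq_drain_locked(q);
	pthread_mutex_unlock(&q->lock);
	return true;
}

/* Blocking variant: waits for the lock, so queued packets are always sent. */
static void
deferred_transmit_wait(struct txq *q)
{
	pthread_mutex_lock(&q->lock);
	txq_drain_locked(q);
	pthread_mutex_unlock(&q->lock);
}
```

With the try-lock variant, a packet enqueued while another transmission holds
the lock is only sent if something later triggers a new transmit attempt.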


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to only some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
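
The ownership rule behind this fix can be modelled with a tiny free-list
allocator. The sketch below is not the virtio(4) API: struct ring, slot_get,
slot_put, enqueue_reserve and send_one are hypothetical names, and the
"budget" field is an assumed stand-in for available descriptors. It shows a
reserve step that releases the slot itself on failure, so the caller must not
release it a second time.

```c
#include <assert.h>

#define NSLOTS 4

struct ring {
	int free_list[NSLOTS];	/* stack of free slot indices */
	int nfree;		/* number of valid entries in free_list */
	int budget;		/* descriptors still available */
};

static void
ring_init(struct ring *r, int budget)
{
	for (int i = 0; i < NSLOTS; i++)
		r->free_list[i] = i;
	r->nfree = NSLOTS;
	r->budget = budget;
}

/* Take a slot off the free list, or return -1 if none is left. */
static int
slot_get(struct ring *r)
{
	if (r->nfree == 0)
		return -1;
	return r->free_list[--r->nfree];
}

/* Return a slot; releasing more slots than exist overflows the list. */
static void
slot_put(struct ring *r, int slot)
{
	assert(r->nfree < NSLOTS);
	r->free_list[r->nfree++] = slot;
}

/* On failure the callee releases the slot itself and returns nonzero. */
static int
enqueue_reserve(struct ring *r, int slot, int nsegs)
{
	if (nsegs > r->budget) {
		slot_put(r, slot);
		return -1;
	}
	r->budget -= nsegs;
	return 0;
}

/* Caller side: returns the slot used, or -1 if nothing was sent. */
static int
send_one(struct ring *r, int nsegs)
{
	int slot = slot_get(r);

	if (slot < 0)
		return -1;
	if (enqueue_reserve(r, slot, nsegs) != 0) {
		/* No slot_put() here: the reserve step already released it. */
		return -1;
	}
	return slot;
}
```

Adding a slot_put() call in the failure branch of send_one() would put the
same index on the free list twice, which is the kind of corruption the entry
describes.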


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receive interface of an mbuf.
These functions are the counterpart of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.79 13-Apr-2022 uwe

virtio: use the new syntax for snprintb(3) format strings.

The old syntax is limited to 32 bits only (and has 1-based bit numbers
which is rather incovenient too).


# 1.78 13-Apr-2022 yamaguchi

vioif(4): issue VIRTIO_NET_CTRL_MAC_ADDR_SET command only when
VIRTIO_NET_F_CTRL_MAC_ADDR is negotiated


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was hold to wait for completion of interrupt handlers.
But, they are already stopped by rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device resetting is failed


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments trough PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is it not functioning though not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that enqueued before another transmission
releases the lock after finish of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() can not
sends them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option is introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

-No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers so that
they do not access the data in interrupt context.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- Moved bpf_mtap run in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
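
The ownership rule behind this fix can be shown with a toy free list. This is
only a sketch: toy_vq, toy_prep, toy_reserve, toy_abort and toy_start are
hypothetical stand-ins, not the virtio(4) API. The one property taken from the
message above is that a failed reserve has already released the slot, so the
caller must not release it a second time.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NSLOTS 4

/* Hypothetical stand-in for a virtqueue with a free list of slots. */
struct toy_vq {
	bool in_use[TOY_NSLOTS];
	int nfree;		/* number of slots on the free list */
	int ndesc_avail;	/* descriptors left for reservations */
};

static void
toy_init(struct toy_vq *vq, int ndesc)
{
	for (int i = 0; i < TOY_NSLOTS; i++)
		vq->in_use[i] = false;
	vq->nfree = TOY_NSLOTS;
	vq->ndesc_avail = ndesc;
}

/* Take a slot off the free list; returns -1 if none is left. */
static int
toy_prep(struct toy_vq *vq)
{
	for (int i = 0; i < TOY_NSLOTS; i++) {
		if (!vq->in_use[i]) {
			vq->in_use[i] = true;
			vq->nfree--;
			return i;
		}
	}
	return -1;
}

/* Return a slot to the free list; a second call for the same slot asserts. */
static void
toy_abort(struct toy_vq *vq, int slot)
{
	assert(vq->in_use[slot]);	/* catches a double free */
	vq->in_use[slot] = false;
	vq->nfree++;
}

/* Reserve descriptors; on failure the slot is released here. */
static int
toy_reserve(struct toy_vq *vq, int slot, int nsegs)
{
	if (nsegs > vq->ndesc_avail) {
		toy_abort(vq, slot);
		return -1;
	}
	vq->ndesc_avail -= nsegs;
	return 0;
}

/* Caller pattern after the fix: no abort after a failed reserve. */
static int
toy_start(struct toy_vq *vq, int nsegs)
{
	int slot = toy_prep(vq);

	if (slot < 0)
		return -1;
	if (toy_reserve(vq, slot, nsegs) != 0)
		return -1;	/* slot already released by toy_reserve() */
	return slot;
}
```

Adding a toy_abort() call to the failure branch of toy_start() would trip the
assertion, which is the toy equivalent of the corrupted free list.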


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
The functions are the counterparts of m_get_rcvif, which will come in
another commit; they hide the internals of the rcvif operation and reduce
the diff of the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grained locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).
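
The split described above, in which the hardware interrupt handler only
schedules work and a softint does the processing, can be sketched with a toy
model. None of this is the softint(9) or virtio(4) interface: toy_softint,
toy_dev, toy_hardintr and toy_softint_dispatch are hypothetical names, and the
"dispatch" step stands in for the kernel running the softint later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deferred-work handle, standing in for a softint. */
struct toy_softint {
	void (*fn)(void *);
	void *arg;
	bool pending;
};

/* Hypothetical device: packets waiting in the ring and packets delivered. */
struct toy_dev {
	int rx_ring;
	int delivered;
	struct toy_softint si;
};

/* Deferred part: drains the ring outside the "hardware interrupt". */
static void
toy_rx_process(void *arg)
{
	struct toy_dev *d = arg;

	while (d->rx_ring > 0) {
		d->rx_ring--;
		d->delivered++;
	}
}

static void
toy_attach(struct toy_dev *d)
{
	d->rx_ring = 0;
	d->delivered = 0;
	d->si.fn = toy_rx_process;
	d->si.arg = d;
	d->si.pending = false;
}

/* "Hardware interrupt" handler: does no packet work, only schedules. */
static void
toy_hardintr(struct toy_dev *d)
{
	d->si.pending = true;
}

/* Dispatcher: runs the deferred handler once if it was scheduled. */
static void
toy_softint_dispatch(struct toy_softint *si)
{
	assert(si->fn != NULL);
	if (si->pending) {
		si->pending = false;
		si->fn(si->arg);
	}
}
```

After toy_hardintr() returns, no packet has been delivered yet; delivery
happens only when toy_softint_dispatch() runs the scheduled handler.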


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.77 31-Mar-2022 yamaguchi

vioif(4): remove unnecessary lock acquirement

The lock was held to wait for completion of interrupt handlers,
but they are already stopped by the rxq_stopping and txq_stopping
flags.

pointed out by riastradh@n.o, thanks.


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device resetting is failed


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is not functioning, though this is not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!
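
The lifetime problem can be sketched as follows. This is a toy model, not the
evcnt(9) interface: toy_counter, toy_softc, toy_setup and toy_find are
hypothetical names, and the "%s-RX%d" name format is made up for the example.
The only property taken from the message above is that the counter keeps the
name pointer rather than a copy, so the name must live as long as the counter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_NAMELEN 16
#define TOY_NQUEUES 2

/* Hypothetical counter that keeps the group name by pointer, not by copy. */
struct toy_counter {
	const char *group;
	unsigned long count;
};

/* Hypothetical per-device state: the names live as long as the counters. */
struct toy_softc {
	char names[TOY_NQUEUES][TOY_NAMELEN];
	struct toy_counter counters[TOY_NQUEUES];
};

static void
toy_counter_attach(struct toy_counter *c, const char *group)
{
	c->group = group;	/* only the pointer is remembered */
	c->count = 0;
}

static void
toy_setup(struct toy_softc *sc, const char *devname)
{
	for (int i = 0; i < TOY_NQUEUES; i++) {
		/* Format into storage owned by sc, not into a local array. */
		int n = snprintf(sc->names[i], sizeof(sc->names[i]),
		    "%s-RX%d", devname, i);
		assert(n > 0 && (size_t)n < sizeof(sc->names[i]));
		toy_counter_attach(&sc->counters[i], sc->names[i]);
	}
}

/* Look a counter up by name; valid long after toy_setup() has returned. */
static struct toy_counter *
toy_find(struct toy_softc *sc, const char *group)
{
	for (int i = 0; i < TOY_NQUEUES; i++) {
		if (strcmp(sc->counters[i].group, group) == 0)
			return &sc->counters[i];
	}
	return NULL;
}
```

Had toy_setup() formatted the name into a local char array instead, every
group pointer would dangle as soon as the function returned.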


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after it finishes.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

- No functional change:
  - Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
  - Simplify MII structure initialization.
  - u_int*_t -> uint*_t.
  - KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments, by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.76 29-Mar-2022 yamaguchi

vioif(4): Added a comment about stopping packet processing


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device resetting is failed


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments trough PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is it not functioning though not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that enqueued before another transmission
releases the lock after finish of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() can not
sends them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/when if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- Moved bpf_mtap to run in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
These functions are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.75 24-Mar-2022 yamaguchi

vioif(4): adopt ether_set_ifflags_cb


# 1.74 24-Mar-2022 yamaguchi

vioif(4): register MAC address to a device


# 1.73 24-Mar-2022 yamaguchi

vioif(4): fix missing error handling


# 1.72 24-Mar-2022 yamaguchi

vioif(4): do not schedule packet processing while stopping the device


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when a device reset fails


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachments work on aarch64-eb.

* virtio on sparc64 attaches but is not functioning, though this is not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/when if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- Moved bpf_mtap to run in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
These functions are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.71 28-Oct-2021 yamaguchi

virtio: stop reinit for safety when resetting a device fails


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.
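
A minimal sketch of the sizing issue, assuming two hypothetical header
layouts (model_hdr_short and model_hdr_full are illustrative and are not
the driver's real structures). Because slots are indexed with the stride
of the full header, the area has to be sized with the full header as
well, even when only the short prefix is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layouts: the short form is a prefix of the full one. */
struct model_hdr_short {
	uint8_t		flags;
	uint8_t		gso_type;
	uint16_t	hdr_len;
};

struct model_hdr_full {
	struct model_hdr_short	base;
	uint16_t		num_buffers;
};

/* Size the area by the full header, since indexing uses that stride. */
static size_t
model_hdr_area_size(size_t nslots)
{
	return nslots * sizeof(struct model_hdr_full);
}

static struct model_hdr_full *
model_hdr_alloc(size_t nslots)
{
	return calloc(1, model_hdr_area_size(nslots));
}

/* Slot lookup always advances by sizeof(struct model_hdr_full). */
static struct model_hdr_full *
model_hdr_slot(struct model_hdr_full *area, size_t i)
{
	return &area[i];
}
```

Sizing the same area as nslots * sizeof(struct model_hdr_short) would
leave the last slots outside the allocation, which matches the
out-of-bounds writes the message describes.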


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but it is not functioning, though this is not a
regression.


# 1.65 28-May-2020 riastradh

branches: 1.65.2;
Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!
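
The lifetime problem can be illustrated with a small model. The names
here (model_counter, model_queue, model_counter_attach,
model_queue_init) and the "-RX" name format are assumptions made for the
sketch; the only property taken from the message is that the counter
keeps the caller's pointer instead of copying the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NAMELEN	16

/* Hypothetical counter that, like evcnt, remembers the pointer it is given. */
struct model_counter {
	const char	*group;	/* not copied: must outlive the counter */
	unsigned long	count;
};

/* Hypothetical per-queue state: the name buffer lives with the queue. */
struct model_queue {
	char			name[MODEL_NAMELEN];
	struct model_counter	errors;
};

static void
model_counter_attach(struct model_counter *c, const char *group)
{
	c->group = group;	/* only the pointer is stored */
	c->count = 0;
}

static void
model_queue_init(struct model_queue *q, const char *devname, int idx)
{
	/* Format into storage owned by the queue, not into a local array. */
	snprintf(q->name, sizeof(q->name), "%s-RX%d", devname, idx);
	model_counter_attach(&q->errors, q->name);
}
```

Had model_queue_init formatted the name into a local char array, the
stored pointer would dangle as soon as the function returned; keeping
the buffer inside the long-lived structure avoids that.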


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock at the end of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misdetected in some
environments, by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code
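
The retry policy described above can be sketched as a userspace model.
All names and limits are hypothetical (model_pkt, model_dmamap_load,
model_defrag, model_start, MODEL_MAXSEGS, MODEL_CLSIZE); in particular,
model_defrag only imitates the effect of m_defrag() by repacking the
payload into as few fixed-size clusters as possible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAXSEGS	4	/* hypothetical DMA segment limit */
#define MODEL_CLSIZE	2048	/* hypothetical bytes per cluster */

struct model_pkt {
	int	len;	/* total payload bytes */
	int	nsegs;	/* current number of buffer segments */
};

/* Stand-in for a DMA map load: fails when the chain is too fragmented. */
static int
model_dmamap_load(const struct model_pkt *p)
{
	return (p->nsegs <= MODEL_MAXSEGS) ? 0 : EFBIG;
}

/* Stand-in for m_defrag(): repack into the fewest clusters possible. */
static void
model_defrag(struct model_pkt *p)
{
	p->nsegs = (p->len + MODEL_CLSIZE - 1) / MODEL_CLSIZE;
}

/* Returns the number of packets queued; *dropped counts the failures. */
static int
model_start(struct model_pkt *pkts, size_t n, int *dropped)
{
	int sent = 0;

	*dropped = 0;
	for (size_t i = 0; i < n; i++) {
		if (model_dmamap_load(&pkts[i]) != 0) {
			model_defrag(&pkts[i]);	/* one retry after defrag */
			if (model_dmamap_load(&pkts[i]) != 0) {
				(*dropped)++;
				continue;	/* go on with the next packet */
			}
		}
		sent++;
	}
	return sent;
}
```

The point of the loop shape is that a packet which still cannot be
loaded after the retry is dropped on its own, and the packets queued
behind it are still attempted.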


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers so that
they do not access the data in interrupt context.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
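
A simplified model of the ownership rule behind this fix. The functions
below (model_enqueue_prep, model_enqueue_reserve, model_enqueue_abort,
model_send) are hypothetical stand-ins with reduced signatures, not the
virtio(4) API; the free list is modelled as a plain counter so that a
double release shows up as a wrong count.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical virtqueue: the free list is reduced to a counter. */
struct model_vq {
	int	free_entries;
};

/* Takes one entry off the free list. */
static int
model_enqueue_prep(struct model_vq *vq)
{
	if (vq->free_entries < 1)
		return EAGAIN;
	vq->free_entries--;
	return 0;
}

/* Gives back the entry taken by model_enqueue_prep(). */
static void
model_enqueue_abort(struct model_vq *vq)
{
	vq->free_entries++;
}

/* Claims the extra entries; on failure it aborts on the caller's behalf. */
static int
model_enqueue_reserve(struct model_vq *vq, int nsegs)
{
	int extra = nsegs - 1;

	if (extra > vq->free_entries) {
		model_enqueue_abort(vq);
		return EAGAIN;
	}
	vq->free_entries -= extra;
	return 0;
}

static int
model_send(struct model_vq *vq, int nsegs)
{
	int error;

	error = model_enqueue_prep(vq);
	if (error != 0)
		return error;
	error = model_enqueue_reserve(vq, nsegs);
	if (error != 0) {
		/*
		 * No model_enqueue_abort() here: reserve already released
		 * the entry, and a second release would corrupt the count.
		 */
		return error;
	}
	return 0;
}
```

With three free entries, a failed five-segment send leaves the counter
at three; adding an abort call in the error path would raise it to four,
which is the model's equivalent of the corrupted free list.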


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The APIs are used to set (or reset) the receiving interface of an mbuf.
They are the counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grained locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.70 08-Feb-2021 skrll

Trailing whitespace


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but it is not functioning, though this is not a
regression.


Revision tags: thorpej-futex-base
# 1.65 28-May-2020 riastradh

Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock at the end of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misdetected in some
environments, by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers so that
they do not access the data in interrupt context.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The APIs are used to set (or reset) the receiving interface of an mbuf.
They are the counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grained locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.69 03-Feb-2021 reinoud

Oops, made a mistake in my last commit


# 1.68 03-Feb-2021 reinoud

Allocate enough space for the bus_dmamap_t arrays for rxq_hdr_dmamaps[] and
txq_hdr_maps[]


# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but it is not functioning, though this is not a
regression.


Revision tags: thorpej-futex-base
# 1.65 28-May-2020 riastradh

Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option is introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

- No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments, by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.



# 1.67 31-Jan-2021 reinoud

Although the header structure can be smaller, the headers *are* indexed as if
they are full sized so allocate enough memory so the indexing works as
expected and we are not scribbling outside bounds.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments trough PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but is it not functioning though not a
regression.


Revision tags: thorpej-futex-base
# 1.65 28-May-2020 riastradh

Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that enqueued before another transmission
releases the lock after finish of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() can not
sends them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option is introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

-No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to only some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- Moved bpf_mtap run in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
They are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operation and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.66 20-Jan-2021 reinoud

Add VirtIO PCI v1.0 attachments and fix the drivers affected.

The vioif, ld, scsi, viornd and viomb devices were adjusted when needed and
tested both in legacy 0.9 and v1.0 attachments through PCI on amd64, sparc64,
aarch64 and aarch64-eb. ACPI/FDT attachments also tested on
aarch64/aarch64-eb.

Known issues

* viomb on aarch64 works only with ACPI/FDT attachment but not with PCI
attachment. PCI and ACPI/FDT attachment works on aarch64-eb.

* virtio on sparc64 attaches but it is not functioning, though this is not a
regression.


Revision tags: thorpej-futex-base
# 1.65 28-May-2020 riastradh

Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


# 1.65 28-May-2020 riastradh

Allocate proper storage for the event counter group names.

Can't use a stack buffer for these because the evcnt remembers the
pointer!


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock after finishing its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code
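
The retry-and-continue behaviour described above can be sketched as a small
userland model. This is only an illustration, not the driver code: the toy_*
names, the TOY_MAXSEGS limit and the can_defrag flag are hypothetical stand-ins
for the DMA map load and m_defrag() steps.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAXSEGS 4	/* hypothetical per-packet segment limit */

/* Toy packet: a fragment count, and whether compaction can succeed. */
struct toy_pkt {
	int nfrags;
	int can_defrag;
};

/* Stand-in for the DMA map load: fails when there are too many fragments. */
static int
toy_dmamap_load(const struct toy_pkt *p)
{
	return p->nfrags <= TOY_MAXSEGS ? 0 : -1;
}

/* Stand-in for m_defrag(): collapse the chain into one fragment. */
static int
toy_defrag(struct toy_pkt *p)
{
	if (!p->can_defrag)
		return -1;
	p->nfrags = 1;
	return 0;
}

/*
 * Walk the queue once.  A failed load is retried after defragmenting;
 * if that also fails the packet is dropped and the loop moves on to
 * the next queue item instead of stopping.
 */
static size_t
toy_start(struct toy_pkt *q, size_t n, size_t *dropped)
{
	size_t sent = 0;

	*dropped = 0;
	for (size_t i = 0; i < n; i++) {
		if (toy_dmamap_load(&q[i]) != 0) {
			if (toy_defrag(&q[i]) != 0 ||
			    toy_dmamap_load(&q[i]) != 0) {
				(*dropped)++;
				continue;
			}
		}
		sent++;
	}
	return sent;
}
```

In this model a packet that cannot be loaded even after compaction costs only
itself; the packets queued behind it are still attempted.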


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
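
The ownership rule behind this fix can be illustrated with a toy free list.
This is a sketch under stated assumptions, not the virtio(4) implementation:
the toy_* functions are hypothetical and only mirror the described contract,
in which a failed reserve has already released the slot.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS 4	/* hypothetical pool size */

/* Toy free list standing in for the virtqueue's vq_entry pool. */
struct toy_vq {
	int free_slots;	/* entries currently on the free list */
	int max_segs;	/* largest request the reserve step accepts */
};

/* Step 1: take a slot off the free list. */
static int
toy_enqueue_prep(struct toy_vq *vq)
{
	if (vq->free_slots == 0)
		return -1;
	vq->free_slots--;
	return 0;
}

/* Give a slot back to the free list. */
static void
toy_enqueue_abort(struct toy_vq *vq)
{
	vq->free_slots++;
}

/* Step 2: reserve descriptors; on failure the slot is released here. */
static int
toy_enqueue_reserve(struct toy_vq *vq, int nsegs)
{
	if (nsegs > vq->max_segs) {
		toy_enqueue_abort(vq);	/* cleanup is done by the callee */
		return -1;
	}
	return 0;
}

/*
 * Caller.  abort_again selects the buggy pattern that releases the
 * slot a second time.  Returns the free count afterwards.
 */
static int
toy_start(struct toy_vq *vq, int nsegs, bool abort_again)
{
	if (toy_enqueue_prep(vq) != 0)
		return vq->free_slots;
	if (toy_enqueue_reserve(vq, nsegs) != 0) {
		if (abort_again)
			toy_enqueue_abort(vq);	/* bug: second release */
		return vq->free_slots;
	}
	return vq->free_slots;	/* slot stays in use */
}
```

With the extra abort, the model's free count ends up larger than the pool,
which is the toy equivalent of a corrupted free list.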


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

These APIs set (or reset) the receiving interface of an mbuf.
They are the counterpart of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif handling and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
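
The split between the interrupt side and the softint side can be sketched as
follows. This is a single-queue userland model with hypothetical toy_* names
and a hypothetical TOY_QLEN; the real percpuq keeps one queue per CPU and uses
softint(9) for scheduling.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_QLEN 8	/* hypothetical queue depth */

/* Toy stand-in for one per-CPU input queue. */
struct toy_percpuq {
	int pkts[TOY_QLEN];
	int head, tail, count;
	bool softint_pending;
};

static int toy_delivered;	/* packets handed to the Layer 2 input path */

/* Interrupt side: queue the packet and request a softint, nothing more. */
static bool
toy_rxintr(struct toy_percpuq *q, int pkt)
{
	if (q->count == TOY_QLEN)
		return false;	/* queue full: drop */
	q->pkts[q->tail] = pkt;
	q->tail = (q->tail + 1) % TOY_QLEN;
	q->count++;
	q->softint_pending = true;
	return true;
}

/* Softint side: drain the queue and do the actual input processing. */
static void
toy_softint(struct toy_percpuq *q)
{
	q->softint_pending = false;
	while (q->count > 0) {
		int pkt = q->pkts[q->head];

		q->head = (q->head + 1) % TOY_QLEN;
		q->count--;
		(void)pkt;	/* ether_input-style processing would go here */
		toy_delivered++;
	}
}
```

Nothing reaches the input path until toy_softint() runs, which mirrors the
goal of keeping Layer 2 processing out of hardware interrupt context.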


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies while configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.64 25-May-2020 yamaguchi

Stop all processing related to rx before that related to tx for safety


# 1.63 25-May-2020 yamaguchi

Use evcnt(9) to record error status in vioif(4)


# 1.62 25-May-2020 yamaguchi

Introduce the lock for vioif_softc to avoid a race condition
in vioif_update_link_status()

The function is called in both vioif_init() and softint.


# 1.61 25-May-2020 yamaguchi

Populate mbufs in the packet receiving process, not in a softint


# 1.60 25-May-2020 yamaguchi

Always hold tx lock in deferred transmit to send all packets

There may be packets that were enqueued before another transmission
releases the lock at the end of its transmission.
When using mutex_try_enter(), vioif_deferred_transmit() cannot
send them.

pointed out by knakahara@n.o


# 1.59 25-May-2020 yamaguchi

Remove redundant checks.
There is the same check in vioif_send_common_locked()


# 1.58 25-May-2020 yamaguchi

Replace macros with static functions for refactoring


# 1.57 25-May-2020 yamaguchi

Fix typo in comments


# 1.56 25-May-2020 yamaguchi

Fix the wrong segment size in vioif(4)


# 1.55 25-May-2020 yamaguchi

Introduce packet handling in softint or kthread for vioif(4)


# 1.54 25-May-2020 yamaguchi

Set handlers implemented in child device of virtio(4) to virtqueue
instead of the commonized function


# 1.53 25-May-2020 yamaguchi

Obsolete VIOIF_SOFTINT_INTR

The kernel option was introduced to realize softint-based if_input.
Since the same scheme has been implemented in if_percpuq_enqueue(),
the option is no longer needed.

pointed out by ozaki-r@n.o.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

branches: 1.51.2;
in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

These APIs set (or reset) the receiving interface of an mbuf.
They are the counterpart of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif handling and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies while configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.52 30-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.51 01-Oct-2019 chs

in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction was misinterpreted in some
environments; the direction is now passed to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
setting it causes the interface to ignore any further TX requests if the
failure happens when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data outside interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
They are the counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies while configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.51 01-Oct-2019 chs

in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Factor the duplicated ctrl vq code in vioif(4) into functions


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction was misinterpreted in some
environments; the direction is now passed to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
setting it causes the interface to ignore any further TX requests if the
failure happens when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data outside interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- bpf_mtap now runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
They are the counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies while configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.50 14-Sep-2019 christos

- KNF
- fix typo in error message
- use aprint* everywhere
- use loops to initialize mac
- remove unused variables


Revision tags: netbsd-9-base phil-wifi-20190609
# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Factor the duplicated ctrl vq code in vioif(4) into functions


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

branches: 1.41.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction was misinterpreted in some
environments; the direction is now passed to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is adaptive one for ease of implementation but some
drivers access the data in interrupt context so we cannot apply the mutex
to every drivers as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional works (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.49 23-May-2019 msaitoh

Whitespace fix (mainly tabify).


# 1.48 23-May-2019 msaitoh

- No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is adaptive one for ease of implementation but some
drivers access the data in interrupt context so we cannot apply the mutex
to every drivers as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional works (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done. Because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.48 23-May-2019 msaitoh

-No functional change:
- Simplify struct ethercom's pointer near ETHER_FIRST_MULTI().
- Simplify MII structure initialization.
- u_int*_t -> uint*_t.
- KNF


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is adaptive one for ease of implementation but some
drivers access the data in interrupt context so we cannot apply the mutex
to every drivers as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional works (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up out free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
They are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows the Layer 2 network
stack (e.g., bridge, vlan and even bpf) to run in softint context,
which makes it easy to implement fine-grained locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
- It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
- Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem where recent QEMU dies while configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


Revision tags: isaki-audio2-base
# 1.47 04-Feb-2019 yamaguchi

Do not call virtio_start_vq_intr() for ctrlq
unless the iface has a control queue


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118
# 1.46 14-Jan-2019 yamaguchi

Add multiqueue support, vioif(4)


# 1.45 14-Jan-2019 yamaguchi

Set IFEF_MPSAFE flag


# 1.44 14-Jan-2019 yamaguchi

Functionize the same code related to ctrl vq in vioif(4)


# 1.43 14-Jan-2019 yamaguchi

Divide some elements of vioif_softc into txq, rxq, and ctrlq


# 1.42 14-Jan-2019 yamaguchi

Make macros not depend on vioif_softc


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base
# 1.41 26-Jun-2018 msaitoh

Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misunderstood in some
environments, by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625
# 1.40 10-Jun-2018 jakllsch

remove irrelevant pci(9) #includes from virtio child drivers


Revision tags: pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.39 08-Feb-2018 dholland

branches: 1.39.2;
Typos.


Revision tags: netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
setting it causes the interface to ignore any further TX requests if this
happens when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


# 1.39 08-Feb-2018 dholland

Typos.


Revision tags: tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
this causes interface to ignore any further TX requests if this happens
when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is adaptive one for ease of implementation but some
drivers access the data in interrupt context so we cannot apply the mutex
to every drivers as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional works (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up out free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_viof.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).
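
The split described above (the hardware handler only schedules, the softint
does the work) can be sketched as a small userland model. This is only an
illustration: every model_* name below is hypothetical, and none of them is
the NetBSD softint(9) or virtio API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model; these are not NetBSD kernel interfaces. */
struct model_softint {
	bool pending;             /* set by the "hardware" handler */
	void (*handler)(void *);  /* deferred work to run later */
	void *arg;
};

struct model_dev {
	struct model_softint si;
	int rx_ready;      /* packets the device has made available */
	int rx_processed;  /* packets handled in deferred context */
};

/* Deferred handler: drains everything the device has made available. */
static void
model_rx_softint(void *arg)
{
	struct model_dev *dev = arg;

	while (dev->rx_ready > 0) {
		dev->rx_ready--;
		dev->rx_processed++;
	}
}

static void
model_dev_init(struct model_dev *dev)
{
	dev->si.pending = false;
	dev->si.handler = model_rx_softint;
	dev->si.arg = dev;
	dev->rx_ready = 0;
	dev->rx_processed = 0;
}

/* "Hardware interrupt" handler: does no packet work, only schedules. */
static void
model_hard_intr(struct model_dev *dev)
{
	dev->si.pending = true;
}

/* Stand-in for the point where the system runs pending softints. */
static void
model_run_softints(struct model_dev *dev)
{
	assert(dev->si.handler != NULL);
	if (!dev->si.pending)
		return;
	dev->si.pending = false;
	dev->si.handler(dev->si.arg);
}
```

In this model, calling model_hard_intr() leaves rx_processed untouched; the
packets are only consumed once model_run_softints() runs the deferred handler.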


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.


# 1.38 01-Jun-2017 chs

remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.


Revision tags: prg-localcount2-base3
# 1.37 17-May-2017 jdolecek

more precise m_freem() on error paths, and update m after the m_defrag() call


# 1.36 17-May-2017 jdolecek

simplify vioif_start() - remove the delivery attempts on failure and retries,
leave that for the dedicated thread

if dma map load fails, retry after m_defrag(), but continue processing
other queue items regardless

set interface queue length according to the length of virtio queue, so that
higher layer won't queue more than interface can manage to keep in flight

use the mutexes always, not just with NET_MPSAFE, so they continue
being exercised and hence working; they also enforce proper IPL level

inspired by discussion around PR kern/52211, thanks to Masanobu SAITOH
for the m_defrag() idea and code
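
The "retry after m_defrag(), but continue with the other queue items" flow
can be sketched as a userland model. The model_* names, the segment limit
and the packet representation are all assumptions made for illustration;
this is not the driver's code and does not call the real bus_dma(9) or
mbuf(9) interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAXSEGS 4  /* hypothetical per-packet segment limit */

struct model_pkt {
	int nfrags;       /* number of buffers in the chain */
	bool can_defrag;  /* whether compaction would succeed */
};

/* Stand-in for a DMA map load: fails when the chain has too many pieces. */
static int
model_dmamap_load(const struct model_pkt *p)
{
	return (p->nfrags <= MODEL_MAXSEGS) ? 0 : -1;
}

/* Stand-in for defragmentation: compacts the chain into one buffer. */
static bool
model_defrag(struct model_pkt *p)
{
	if (!p->can_defrag)
		return false;
	p->nfrags = 1;
	return true;
}

/*
 * Walk the queue once.  A packet whose load fails gets one retry after
 * defragmentation; if that also fails it is dropped, and processing
 * continues with the next packet instead of stopping the whole queue.
 * Returns the number of packets sent; *dropped receives the drop count.
 */
static int
model_start(struct model_pkt *q, size_t n, int *dropped)
{
	int sent = 0;

	assert(q != NULL || n == 0);
	*dropped = 0;
	for (size_t i = 0; i < n; i++) {
		struct model_pkt *p = &q[i];

		if (model_dmamap_load(p) != 0) {
			if (!model_defrag(p) || model_dmamap_load(p) != 0) {
				(*dropped)++;
				continue;  /* keep going with the rest */
			}
		}
		sent++;
	}
	return sent;
}
```

The point of the sketch is the continue: one packet that cannot be mapped
does not prevent later packets in the same pass from being sent.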


# 1.35 17-May-2017 jdolecek

do not set IFF_OACTIVE if dma map load or the virtio reserve fails;
setting it causes the interface to ignore any further TX requests if the
failure happens when there are no other TX requests in progress

fixes kern/52211 by Juergen Hannken-Illjes


Revision tags: prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

branches: 1.34.4;
Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context, so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- The moved bpf_mtap runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up our free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission
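
The ownership rule behind this fix (a failed reserve has already released
the entry, so the caller must not abort it again) can be sketched as a
userland model. The model_* names and the slot/descriptor accounting are
hypothetical and much simpler than the real virtio code; the sketch only
illustrates who releases the slot on each path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NSLOTS 4  /* hypothetical number of queue slots */

struct model_vq {
	bool in_use[MODEL_NSLOTS];
	int nfree_slots;
	int nfree_descs;
};

static void
model_vq_init(struct model_vq *vq, int ndescs)
{
	for (int i = 0; i < MODEL_NSLOTS; i++)
		vq->in_use[i] = false;
	vq->nfree_slots = MODEL_NSLOTS;
	vq->nfree_descs = ndescs;
}

/* Take a free slot, or fail with EAGAIN when none is left. */
static int
model_enqueue_prep(struct model_vq *vq, int *slotp)
{
	for (int i = 0; i < MODEL_NSLOTS; i++) {
		if (!vq->in_use[i]) {
			vq->in_use[i] = true;
			vq->nfree_slots--;
			*slotp = i;
			return 0;
		}
	}
	return EAGAIN;
}

/* Release a slot.  Releasing the same slot twice trips the assertion. */
static void
model_enqueue_abort(struct model_vq *vq, int slot)
{
	assert(slot >= 0 && slot < MODEL_NSLOTS);
	assert(vq->in_use[slot]);
	vq->in_use[slot] = false;
	vq->nfree_slots++;
}

/* Reserve descriptors; on failure this releases the slot itself. */
static int
model_enqueue_reserve(struct model_vq *vq, int slot, int nsegs)
{
	if (nsegs > vq->nfree_descs) {
		model_enqueue_abort(vq, slot);
		return EAGAIN;
	}
	vq->nfree_descs -= nsegs;
	return 0;
}

/* Caller side: no abort after a failed reserve. */
static int
model_start_one(struct model_vq *vq, int nsegs)
{
	int slot, error;

	error = model_enqueue_prep(vq, &slot);
	if (error != 0)
		return error;
	error = model_enqueue_reserve(vq, slot, nsegs);
	if (error != 0)
		return error;  /* slot already released by reserve */
	return 0;
}
```

If model_start_one() also called model_enqueue_abort() on the failure path,
the second release would hit the assertion, which is the model's analogue
of the corrupted free list described above.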


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
These functions are counterparts of m_get_rcvif, which will come in
another commit; they hide the internals of rcvif operations and reduce
the diff of the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.34 28-Mar-2017 ozaki-r

Handle config change interrupts to inhibit sending packets while link down

PR kern/52103 by s-yamaguchi@IIJ


# 1.33 28-Mar-2017 ozaki-r

Don't write to read-only VIRTIO_NET_S_LINK_UP bit

The bit is defined as read-only in the Virtio PCI Card Specification.
The fix is inspired by FreeBSD.

PR kern/52103 by s-yamaguchi@IIJ


# 1.32 25-Mar-2017 jdolecek

reorganize the attachment process for virtio child devices, so that
more common code is shared among the drivers, and it's possible for
the drivers to be correctly dynamically loaded; forbid direct access
to struct virtio_softc from the child driver code


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.31 17-Jan-2017 ozaki-r

Fix unlocking in vioif_rx_filter


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.30 28-Dec-2016 ozaki-r

branches: 1.30.2;
Protect ec_multi* with mutex

The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to some drivers, we need to apply it to all
drivers in the future.

Note that the mutex is adaptive one for ease of implementation but some
drivers access the data in interrupt context so we cannot apply the mutex
to every drivers as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional works (see
ether_multicast_sysctl), and the other is to modify the drivers to access
the data not in interrupt context somehow.


# 1.29 15-Dec-2016 ozaki-r

Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input

The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net


# 1.28 08-Dec-2016 ozaki-r

Apply deferred if_start framework

if_schedule_deferred_start checks if the if_snd queue contains packets,
so drivers don't need to check it by themselves.


Revision tags: nick-nhusb-base-20161204
# 1.27 29-Nov-2016 uwe

vioif_start() - do not call virtio_enqueue_abort() after error from
virtio_enqueue_reserve(), as it's already done by the latter, so we
ended up with a kind of "double free" that messed up out free list of
vq_entry's.

This is even documented in a "typical usage" comment in virtio.c (and
those quotes are not intended to be sarcastic).

PR 51132 - virtio net device stuck for UDP burst transmission


Revision tags: pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.26 27-Sep-2016 pgoyette

Modularize the ld driver and all of its attachments. Ensure that all
parents are capable of rescan (or otherwise provide a means of attaching
children post-initialization).


Revision tags: localcount-20160914
# 1.25 29-Aug-2016 ozaki-r

Fix initializing wrong queues

Pointed out by Mike Larkin.

PR kern/51448


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.24 10-Jun-2016 ozaki-r

branches: 1.24.2;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of an mbuf.
They are counterparts of m_get_rcvif, which will come in another
commit, hide the internals of rcvif operations, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.23 17-May-2016 pooka

Try to get more packets going if the transmit interrupt indicates
some were sent. Doing so avoids a situation where vioif_start never
gets called in case the sendqueue fills up and therefore the interface
perpetually drops all packets due to the queue being full.
(not sure why all drivers need to do this themselves; just keeping
up with the joneses)

Problem reported and patch tested by jmmlmendes and yasukata at
repo.rumpkernel.org/rumprun


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.22 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


# 1.21 10-Jan-2016 christos

PR/50636: Ryo ONODERA: Reduce memory use


Revision tags: nick-nhusb-base-20151226
# 1.20 29-Oct-2015 christos

simplify


# 1.19 29-Oct-2015 ozaki-r

Name virtqueue index


# 1.18 27-Oct-2015 christos

- Print the negotiated feature bits.
- Use aprint_error_dev on error, instead of printf
- Add missing abort call.


# 1.17 26-Oct-2015 ozaki-r

Support MSI-X in virtio

Currently only vioif(4) uses the feature.

knakahara@ helped to migrate to pci_intr_alloc(9). Thanks!


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606
# 1.16 05-May-2015 ozaki-r

Use NULL for initialization of sc_config_change


Revision tags: nick-nhusb-base-20150406
# 1.15 16-Jan-2015 ozaki-r

Introduce defflag for NET_MPSAFE


# 1.14 25-Dec-2014 ozaki-r

Reuse mbuf when retrying in vioif_start

Otherwise, the old mbuf will leak.


# 1.13 24-Dec-2014 ozaki-r

Take TX/RX locks when sc_stopping = true in if_stop

Taking the locks is needed to ensure ongoing TX/RX operations finish.
Otherwise, if_stop may run during TX/RX operations.


# 1.12 19-Dec-2014 ozaki-r

Implement softint-based interrupt handling in if_vioif

Softint-based interrupt handling is considered as a future direction
of the (network) device driver architecture in NetBSD. pq3etsec of
ppc is already implemented based on the architecture (unlike pq3etsec,
this change doesn't include softint-based if_start). In this
architecture, a hardware interrupt handler just schedules a softint
and the softint performs actual interrupt processing. It reduces
processing in hardware interrupt context and allows Layer 2 network
stack (e.g., bridge, vlan and even bpf) run in softint context,
which makes it easy to implement fine-grain locking in the layer.

This is an experimental implementation of the architecture in if_vioif.

virtio introduces a new flag VIRTIO_F_PCI_INTR_SOFTINT. If a driver
of virtio sets it to sc_flags of virtio_softc, virtio calls
softint_schedule in virtio_intr instead of directly calling the
interrupt handler of the driver.

When VIOIF_SOFTINT_INTR is on, vioif doesn't use the existing softint
(vioif_rx_softint) that is called from vioif_rx_vq_done, because
vioif_rx_softint already runs in softint context and another softint
isn't needed. This change actually improves performance in some cases.

The feature is disabled by default and enabled when SOFTINT_INTR is
set somewhere (normally in a kernel configuration).


Revision tags: nick-nhusb-base
# 1.11 09-Oct-2014 ozaki-r

branches: 1.11.2;
Add ETHERCAP_VLAN_MTU capability to vioif


# 1.10 08-Oct-2014 ozaki-r

Add missing semicolon


# 1.9 08-Oct-2014 ozaki-r

Don't turn promisc off in vioif_deferred_init if already configured as promisc


# 1.8 13-Aug-2014 pooka

Don't use config_deferred_interrupts() for vioif_deferred_init(),
just run it once as part of if_init(). The problem with the former
is that it will execute the deferred init routine in-place when !cold,
and since vioif_deferred_init() finishing depends on virtio interrupts
which are established only after config_deferred_interrupts() is called,
the vioif attach method would deadlock when !cold.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.7 22-Jul-2014 ozaki-r

branches: 1.7.2;
Make if_vioif MPSAFE

- Introduce VIOIF_MPSAFE
  - It's enabled only when NET_MPSAFE is defined in if.h or the kernel config
- Add tx and rx mutex locks
  - Locking them is performance sensitive, so it's not used when !VIOIF_MPSAFE
- Set SOFTINT_MPSAFE to vioif_rx_softint only when VIOIF_MPSAFE


# 1.6 22-Jul-2014 ozaki-r

Introduce VIRTIO_F_PCI_INTR_MPSAFE for virtio

It is set by a child driver, e.g., if_vioif. If set, virtio sets
PCI_INTR_MPSAFE for pci_intr_establish.


# 1.5 18-Jul-2014 ozaki-r

Don't set SOFTINT_MPSAFE to vioif_rx_softint

vioif_rx_softint calls vioif_populate_rx_mbufs that is not MPSAFE.
vioif_populate_rx_mbufs is also called via vioif_ioctl and so can
be called by two LWPs simultaneously, resulting in kernel panic.

PR kern/49007


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.4 09-May-2013 minoura

branches: 1.4.6;
Fix a typo, and remove an unused member.
This should fix the problem that recent Qemu dies during configuring a vioif.


# 1.3 30-Mar-2013 christos

remove trailing whitespace


Revision tags: netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base
# 1.2 19-Nov-2011 jmcneill

branches: 1.2.6; 1.2.8; 1.2.12; 1.2.14;
fix build when ALTQ is defined


Revision tags: yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.1 30-Oct-2011 hannken

branches: 1.1.2;
Import of the virtio driver written by MINOURA Makoto <minoura@netbsd.org>
with minor changes to make it compile and run on -current. This driver
speeds up disk and network access in virtual environments like KVM.

Enabled on i386 and amd64. Tested with a CentOS 5.7 x86_64 host.

See http://ozlabs.org/~rusty/virtio-spec/virtio.pdf for the specification.

