#
1.120 |
|
24-May-2024 |
jsg |
remove unneeded includes; ok miod@
|
Revision tags: OPENBSD_7_5_BASE
|
#
1.119 |
|
10-Nov-2023 |
bluhm |
Make ifq and ifiq interface MP safe.
Rename ifq_set_maxlen() to ifq_init_maxlen(). This function neither uses WRITE_ONCE() nor a mutex and is called before the ifq mutex is initialized. The new name expresses that it should be used only during interface attach when there is no concurrency.
Protect ifq_len(), ifq_empty(), ifiq_len(), and ifiq_empty() with READ_ONCE(). They can be used without lock as they only read a single integer.
OK dlg@
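The lockless readers described in this entry can be sketched in userland C. This is a minimal illustration, assuming a volatile-cast READ_ONCE() of the kind commonly used in kernels; the macro body, struct demo_ifq, and the demo_* helpers are hypothetical names and not the code from the tree.

```c
/*
 * Assumed definition: force exactly one load through a volatile lvalue.
 * The real kernel macro may be defined differently.
 */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for an interface queue; only the length matters. */
struct demo_ifq {
	unsigned int len;
};

/* Read the queue length without taking a lock: a single integer load. */
static unsigned int
demo_ifq_len(struct demo_ifq *q)
{
	return READ_ONCE(q->len);
}

/* Emptiness is derived from the same single load. */
static int
demo_ifq_empty(struct demo_ifq *q)
{
	return demo_ifq_len(q) == 0;
}
```

The value may be stale by the time the caller acts on it, which is why this pattern only suits reads of a single integer.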
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.118 |
|
14-Jul-2023 |
claudio |
struct sleep_state is no longer used, remove it. Also remove the priority argument to sleep_finish(); the code can use the p_flag P_SINTR flag to know if the signal check is needed or not. OK cheloha@ kettenis@ mpi@
|
#
1.117 |
|
28-Jun-2023 |
claudio |
First step at removing struct sleep_state.
Pass the timeout and sleep priority not only to sleep_setup() but also to sleep_finish(). With that sls_timeout and sls_catch can be removed from struct sleep_state.
The timeout is now setup first thing in sleep_finish() and no longer as last thing in sleep_setup(). This should not cause a noticeable difference since the code run between sleep_setup() and sleep_finish() is minimal.
OK kettenis@
|
Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE
|
#
1.116 |
|
11-Mar-2022 |
mpi |
Constify struct cfattach.
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.115 |
|
08-Feb-2021 |
mpi |
Simplify sleep_setup API to two operations in preparation for splitting the SCHED_LOCK().
Putting a thread on a sleep queue is reduced to the following:
    sleep_setup();
    /* check condition or release lock */
    sleep_finish();
Previous version ok cheloha@, jmatthew@, ok claudio@
|
#
1.114 |
|
17-Jan-2021 |
dlg |
this hardware is fine with BUS_DMA_64BIT mappings.
this raises performance of tcpbench on an m3000 from ~3kpps and ~8MB/s to ~70kpps and ~191MB/s when transmitting, and ~10kpps and ~15MB/s to ~120kpps and 174MB/s when receiving.
i also tested this on a v245 and an m4000 a while back.
|
#
1.113 |
|
12-Dec-2020 |
jan |
Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.
OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
|
#
1.112 |
|
27-Nov-2020 |
kevlo |
Add initialization of sc_sff_lock rwlock.
ok semarie@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
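The "some maths" in this entry can be sketched as follows. This is an assumption about the general technique, not the driver's code: demo_tx_free() and its parameters are hypothetical, and the counters are taken to be free-running and to advance once per descriptor (counting packets instead is the mistake that revision 1.87 corrects).

```c
#include <stdint.h>

/*
 * Free descriptor slots on a ring of `size` entries, given free-running
 * producer and consumer counters. Unsigned subtraction stays correct
 * when the counters wrap around.
 */
static uint32_t
demo_tx_free(uint32_t prod, uint32_t cons, uint32_t size)
{
	return size - (prod - cons);
}
```

Because the start routine only reads the consumer counter and only writes its own producer counter, no shared free count needs to be updated with interlocked operations.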
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
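A shadow ring of the kind this entry describes might look like the sketch below. It assumes a single filler and a single completion path, as the entry states; struct demo_ring, DEMO_RING_SIZE, and both functions are hypothetical and do not mirror the driver's structures.

```c
#include <stddef.h>

#define DEMO_RING_SIZE 4	/* hypothetical ring size, a power of two */

struct demo_ring {
	void *slots[DEMO_RING_SIZE];	/* buffers currently on the ring */
	unsigned int prod;		/* advanced only by the fill path */
	unsigned int cons;		/* advanced only by the completion path */
};

/* Fill path: post a buffer if there is room. Returns 1 on success. */
static int
demo_ring_fill(struct demo_ring *r, void *buf)
{
	if (r->prod - r->cons == DEMO_RING_SIZE)
		return 0;
	r->slots[r->prod % DEMO_RING_SIZE] = buf;
	r->prod++;
	return 1;
}

/* Completion path: take the oldest buffer, or NULL if the ring is empty. */
static void *
demo_ring_rxeof(struct demo_ring *r)
{
	void *buf;

	if (r->cons == r->prod)
		return NULL;
	buf = r->slots[r->cons % DEMO_RING_SIZE];
	r->cons++;
	return buf;
}
```

Each index has exactly one writer, which is what removes the need for a mutex or atomic op per operation.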
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
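One way to gate a refill path with atomic ops instead of a mutex is sketched below. It uses C11 <stdatomic.h> rather than the kernel's atomic.h and assumes that only one context may refill at a time; struct demo_refill_gate and the enter/leave helpers are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical gate: a clear flag means no context is refilling. */
struct demo_refill_gate {
	atomic_flag busy;
};

/* Returns 1 if the caller won the right to refill, 0 if another has it. */
static int
demo_refill_enter(struct demo_refill_gate *g)
{
	return !atomic_flag_test_and_set(&g->busy);
}

/* Release the gate so the next caller can refill. */
static void
demo_refill_leave(struct demo_refill_gate *g)
{
	atomic_flag_clear(&g->busy);
}
```

A caller that loses the race simply skips the refill, so nobody ever blocks on the gate.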
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
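The batching idea can be sketched with a mock lock that only counts acquisitions. Everything here is hypothetical scaffolding (demo_pkt, demo_pkt_list, demo_biglock) standing in for mbufs, mbuf_list, and the kernel lock; only the shape of the loop is taken from the entry.

```c
#include <stddef.h>

struct demo_pkt {
	struct demo_pkt *next;
};

/* Singly linked list with a tail pointer, built on the caller's stack. */
struct demo_pkt_list {
	struct demo_pkt *head;
	struct demo_pkt **tailp;
};

static unsigned int demo_biglock_taken;	/* mock lock acquisitions */
static unsigned int demo_delivered;	/* packets handed to the "stack" */

static void demo_biglock(void) { demo_biglock_taken++; }
static void demo_bigunlock(void) { }

static void
demo_list_init(struct demo_pkt_list *l)
{
	l->head = NULL;
	l->tailp = &l->head;
}

/* Called per packet inside the rx loop; takes no lock. */
static void
demo_list_enqueue(struct demo_pkt_list *l, struct demo_pkt *p)
{
	p->next = NULL;
	*l->tailp = p;
	l->tailp = &p->next;
}

/* Called once after the rx loop; takes the lock a single time. */
static void
demo_input_batch(struct demo_pkt_list *l)
{
	struct demo_pkt *p, *next;

	demo_biglock();
	for (p = l->head; p != NULL; p = next) {
		next = p->next;
		demo_delivered++;
	}
	demo_bigunlock();
	demo_list_init(l);
}
```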
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use it yet: amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
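The posting order described in this entry can be sketched in portable C. Plain memory stands in for the device ring and a C11 fence stands in for the bus_space barrier; demo_post(), DEMO_RING_SLOTS, and the 32-bit descriptor type are hypothetical and do not reflect the chip's descriptor layout.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SLOTS 8	/* hypothetical ring size */

/*
 * Copy n descriptors into the ring starting at slot `start`. Everything
 * except the first descriptor is written, then one fence is issued, and
 * the first descriptor is the last thing written.
 */
static void
demo_post(volatile uint32_t *ring, unsigned int start,
    const uint32_t *descs, unsigned int n)
{
	unsigned int i;

	if (n == 0)
		return;
	for (i = 1; i < n; i++)
		ring[(start + i) % DEMO_RING_SLOTS] = descs[i];
	/* stand-in for a single bus_space barrier */
	atomic_thread_fence(memory_order_release);
	ring[start % DEMO_RING_SLOTS] = descs[0];
}
```

The sketch issues one fence regardless of whether the chain wraps around the end of the ring, which matches the entry's point about letting a barrier cover more than was written.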
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
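The accounting rule behind this fix can be sketched as a single helper used by both the setup and the completion side. demo_tx_descs() and its parameters are hypothetical; only the 60 byte minimum and the extra pad descriptor come from the entry.

```c
#define DEMO_MIN_FRAME 60	/* minimum frame length named in the entry */

/*
 * Descriptors consumed by one packet: one per DMA segment, plus one
 * pointing at zeroed pad bytes when the frame is shorter than 60 bytes.
 */
static unsigned int
demo_tx_descs(unsigned int nsegs, unsigned int pktlen)
{
	return nsegs + (pktlen < DEMO_MIN_FRAME ? 1 : 0);
}
```

If the side that subtracts from the free count and the side that adds back to it both use the same calculation, the count cannot drift the way the entry describes.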
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
    RAMDISK  - 2176 bytes
    RAMDISKB - 1504 bytes
    RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
    RAMDISK  - 1024 bytes
    RAMDISKB - 1120 bytes
    RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
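The distinction in this entry comes down to which unit the length argument is in. The helper below is a hypothetical illustration of converting a buffer size in bytes to an element count for a given access width; it does not call or reproduce the bus_space functions themselves.

```c
#include <stddef.h>

/*
 * Convert a buffer size in bytes to a count of elements of `width`
 * bytes each. An interface that takes a count of elements wants this
 * value; one that takes a buffer size wants nbytes unchanged.
 */
static size_t
demo_region_count(size_t nbytes, size_t width)
{
	return nbytes / width;
}
```

Passing a byte size where an element count is expected would make a 4 byte wide write cover four times the intended region.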
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
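
The "some maths" can be illustrated with a small userland sketch. It assumes free-running unsigned producer and consumer indices that both count descriptors (not packets, which is the mistake the 1.87 entry describes). The struct and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical tx ring state; both indices count descriptors. */
struct sketch_tx_ring {
	unsigned int size;	/* total descriptors on the ring */
	unsigned int prod;	/* descriptors posted by the start routine */
	unsigned int cons;	/* descriptors the chip has completed */
};

/*
 * Free descriptors, computed from the two indices alone.
 * Unsigned subtraction stays correct when prod wraps past UINT_MAX.
 */
static unsigned int
sketch_tx_free(const struct sketch_tx_ring *r)
{
	unsigned int used = r->prod - r->cons;

	assert(used <= r->size);
	return r->size - used;
}
```

Because the start routine only reads the consumer index and never updates shared state, no interlocked operation is needed for this calculation.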
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
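
A minimal userland sketch of a shadow ring with one producer (fill) and one consumer (rxeof) is shown here. The slot count, struct layout, and function names are assumptions for illustration; the driver's own structures differ.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SLOTS 8	/* power of two so the modulo survives index wrap */

struct sketch_slot {
	void *buf;		/* stands in for the mbuf and dmamap */
};

struct sketch_rx_ring {
	struct sketch_slot slots[SKETCH_RING_SLOTS];
	unsigned int prod;	/* advanced only by the fill side */
	unsigned int cons;	/* advanced only by the rxeof side */
};

/* Fill side: put one buffer on the ring. Returns 0 if the ring is full. */
static int
sketch_rx_fill(struct sketch_rx_ring *r, void *buf)
{
	if (r->prod - r->cons == SKETCH_RING_SLOTS)
		return 0;
	r->slots[r->prod % SKETCH_RING_SLOTS].buf = buf;
	r->prod++;
	return 1;
}

/* rxeof side: take the oldest buffer off the ring, or NULL if it is empty. */
static void *
sketch_rx_complete(struct sketch_rx_ring *r)
{
	struct sketch_slot *s;
	void *buf;

	if (r->cons == r->prod)
		return NULL;
	s = &r->slots[r->cons % SKETCH_RING_SLOTS];
	buf = s->buf;
	s->buf = NULL;
	r->cons++;
	return buf;
}
```

Each index has exactly one writer, which is why the log entry can drop the mutexes without reaching for atomic operations.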
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
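
One way to build such a flag is an enter/leave pair around the refill work. The sketch below uses C11 atomics in userland as a stand-in for the kernel's atomic.h, and the function names are hypothetical; it is an illustration of the idea, not the driver's code.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Returns nonzero if the caller is the first context in and should do
 * the refill. Later callers only bump the counter and go away.
 */
static int
sketch_ring_enter(atomic_uint *running)
{
	return atomic_fetch_add(running, 1) == 0;
}

/*
 * Returns nonzero if nobody else tried to enter while we worked.
 * Otherwise the counter is reset to 1 and 0 is returned, telling the
 * caller to run another refill pass on behalf of the late arrivals.
 */
static int
sketch_ring_leave(atomic_uint *running)
{
	unsigned int expected = 1;

	if (atomic_compare_exchange_strong(running, &expected, 0))
		return 1;
	atomic_store(running, 1);
	return 0;
}
```

A caller would typically write `if (sketch_ring_enter(&flag)) { do { refill(); } while (!sketch_ring_leave(&flag)); }`, where `refill()` is whatever work must not run concurrently.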
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
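
The batching idea can be sketched in userland as follows. The list type, the lock stubs, and the acquisition counter are all hypothetical stand-ins; the driver uses the kernel's mbuf_list and the real kernel lock.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_pkt {
	struct sketch_pkt *next;
};

/* Simple tail queue kept on the caller's stack. */
struct sketch_list {
	struct sketch_pkt *head;
	struct sketch_pkt **tail;
};

static unsigned int sketch_lock_taken;	/* counts lock acquisitions */

static void sketch_biglock(void) { sketch_lock_taken++; }
static void sketch_bigunlock(void) { }

static void
sketch_list_init(struct sketch_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Called once per received packet, without the lock held. */
static void
sketch_list_enqueue(struct sketch_list *l, struct sketch_pkt *p)
{
	p->next = NULL;
	*l->tail = p;
	l->tail = &p->next;
}

/* Called once after the rx loop: one lock round trip for the whole batch. */
static unsigned int
sketch_input_batch(struct sketch_list *l)
{
	struct sketch_pkt *p;
	unsigned int n = 0;

	sketch_biglock();
	while ((p = l->head) != NULL) {
		l->head = p->next;
		n++;		/* the packet would be handed to the stack here */
	}
	sketch_bigunlock();
	sketch_list_init(l);
	return n;
}
```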
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout; ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8 on amd64: amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
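
The posting order described in this entry can be sketched in userland. Here a plain array stands in for the chip's ring, and a C11 release fence stands in for the bus_space barrier; both are assumptions for illustration, as are the descriptor layout and the names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_LEN 8	/* power of two, hypothetical ring size */

struct sketch_desc {
	uint32_t addr;
	uint32_t len_flags;
};

/*
 * Post n descriptors starting at ring index start: write everything
 * except the first, issue one barrier, then write the first descriptor
 * so the chip only sees a complete chain.
 */
static void
sketch_post(struct sketch_desc *ring, unsigned int start,
    const struct sketch_desc *descs, unsigned int n)
{
	unsigned int i;

	assert(n >= 1 && n <= SKETCH_RING_LEN);

	/* write, write, write: no barrier between individual descriptors */
	for (i = 1; i < n; i++)
		ring[(start + i) % SKETCH_RING_LEN] = descs[i];

	/* a single barrier orders the tail of the chain before its head */
	atomic_thread_fence(memory_order_release);

	/* the first descriptor goes out last and acts as the go signal */
	ring[start % SKETCH_RING_LEN] = descs[0];
	atomic_thread_fence(memory_order_release);
}
```

The sketch handles wrap-around with a modulo on each index; the log entry instead issues barriers over a wider address range than was written, which keeps the real code simple at the end of the ring.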
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
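
The accounting bug is easier to see with a sketch in which setup and completion share one descriptor count. The 60 byte minimum comes from the entry above; the struct and function names are hypothetical.

```c
#include <assert.h>

#define SKETCH_MIN_FRAME 60	/* short frames are padded up to this length */

struct sketch_txq {
	unsigned int free;	/* free tx descriptors */
};

/* Descriptors a packet consumes: its dma segments plus an optional pad. */
static unsigned int
sketch_tx_descs(unsigned int nsegs, unsigned int pktlen)
{
	assert(nsegs >= 1);
	return nsegs + (pktlen < SKETCH_MIN_FRAME ? 1 : 0);
}

/* Setup side: reserve descriptors. Returns 0 if the ring lacks space. */
static int
sketch_tx_setup(struct sketch_txq *q, unsigned int nsegs, unsigned int pktlen)
{
	unsigned int need = sketch_tx_descs(nsegs, pktlen);

	if (need > q->free)
		return 0;
	q->free -= need;
	return 1;
}

/* Completion side: give back exactly what setup took. */
static void
sketch_tx_complete(struct sketch_txq *q, unsigned int nsegs,
    unsigned int pktlen)
{
	q->free += sketch_tx_descs(nsegs, pktlen);
}
```

If the setup side forgot the pad descriptor while the completion side counted it, every short frame would add one to `free`, which is the drift the entry describes.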
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
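
The sizes in this entry can be checked with a little arithmetic. The sketch below assumes the usual 14 byte Ethernet header and 4 byte VLAN tag; the constant and function names are illustrative and are not the driver's macros.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes in bytes. */
enum {
	SKETCH_JUMBO_PAYLOAD = 9000,	/* payload size quoted in the log */
	SKETCH_ETHER_HDR_LEN = 14,	/* dst + src + ethertype (assumed) */
	SKETCH_VLAN_TAG_LEN = 4,	/* 802.1Q tag (assumed) */
	SKETCH_PAGE_4K = 4096
};

/* Largest frame accepted off the wire, per the log entry. */
static size_t
sketch_max_frame(void)
{
	return SKETCH_JUMBO_PAYLOAD + SKETCH_ETHER_HDR_LEN +
	    SKETCH_VLAN_TAG_LEN;
}

/* Smallest multiple of 4k that can hold len bytes. */
static size_t
sketch_round_4k(size_t len)
{
	return (len + SKETCH_PAGE_4K - 1) & ~(size_t)(SKETCH_PAGE_4K - 1);
}

/* Nonzero if addr starts on a 4k boundary. */
static int
sketch_is_4k_aligned(unsigned long long addr)
{
	return (addr & (SKETCH_PAGE_4K - 1)) == 0;
}
```

Rounding the 9018 byte frame up to a 4k multiple gives 12288 bytes, which is consistent with pulling buffers from the 12k pool.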
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
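
The distinction matters when the same buffer is passed to both families of functions. As a sketch, a hypothetical helper can convert a buffer size in bytes into the element count that the non-raw region functions expect; the helper name is an assumption and is not part of bus_space.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert a buffer size in bytes into a count of width-byte elements.
 * The raw region functions would be given nbytes directly; the plain
 * region functions would be given the value returned here.
 */
static size_t
sketch_region_count(size_t nbytes, size_t width)
{
	assert(width == 1 || width == 2 || width == 4 || width == 8);
	assert(nbytes % width == 0);
	return nbytes / width;
}
```

For a 64 byte buffer written 4 bytes at a time, the element count is 16 while the raw size stays 64; passing one where the other is expected writes a quarter of the buffer or four times too much.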
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.117 |
|
28-Jun-2023 |
claudio |
First step at removing struct sleep_state.
Pass the timeout and sleep priority not only to sleep_setup() but also to sleep_finish(). With that sls_timeout and sls_catch can be removed from struct sleep_state.
The timeout is now setup first thing in sleep_finish() and no longer as last thing in sleep_setup(). This should not cause a noticeable difference since the code run between sleep_setup() and sleep_finish() is minimal.
OK kettenis@
|
Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE
|
#
1.116 |
|
11-Mar-2022 |
mpi |
Constify struct cfattach.
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.115 |
|
08-Feb-2021 |
mpi |
Simplify sleep_setup API to two operations in preparation for splitting the SCHED_LOCK().
Putting a thread on a sleep queue is reduce to the following:
sleep_setup(); /* check condition or release lock */ sleep_finish();
Previous version ok cheloha@, jmatthew@, ok claudio@
|
#
1.114 |
|
17-Jan-2021 |
dlg |
this hardware is fine with BUS_DMA_64BIT mappings.
this raises performance of tcpbench on an m3000 from ~3kpps and ~8MB/s to ~70kpps and ~191MB/s when transmitting, and ~10kpps and ~15MB/s to ~120kpps and 174MB/s when receiving.
i also tested this on a v245 and an m4000 a while back.
|
#
1.113 |
|
12-Dec-2020 |
jan |
Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.
OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
|
#
1.112 |
|
27-Nov-2020 |
kevlo |
Add initialization of sc_sff_lock rwlock.
ok semarie@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index.
ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9), a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
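The "some maths" is not spelled out in the message. A minimal sketch of one common way to do it, assuming free-running unsigned producer and consumer counters and a hypothetical helper name `tx_free` (none of these names come from the driver):

```c
#include <assert.h>

/*
 * Free descriptor slots on a tx ring, derived from two free-running
 * counters instead of a shared "free" count updated with atomic ops.
 * Hypothetical helper; the real driver's arithmetic may differ.
 */
static unsigned int
tx_free(unsigned int ring_size, unsigned int prod, unsigned int cons)
{
	/* unsigned subtraction stays correct when the counters wrap */
	unsigned int used = prod - cons;

	assert(used <= ring_size);
	return ring_size - used;
}
```

Because only the start routine advances the producer and only the completion path advances the consumer, each side just reads the other's counter; no read-modify-write on a shared variable is needed in this sketch.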
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
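The shadow-ring idea can be sketched in a few lines. This is an illustration only, assuming a power-of-two ring size and hypothetical names (`rx_ring`, `rx_fill`, `rx_complete`); it is single-threaded and leaves out the mbuf and DMA map handling the driver does per slot:

```c
#include <assert.h>
#include <stddef.h>

#define RX_SLOTS 4u	/* assumed ring size, must be a power of two */

struct rx_ring {
	void		*slot[RX_SLOTS];	/* buffer attached to each slot */
	unsigned int	 prod;	/* next slot to fill; only fill writes it */
	unsigned int	 cons;	/* next slot to complete; only rxeof writes it */
};

/* fill side: attach buf to the next slot; returns 0 if the ring is full */
static int
rx_fill(struct rx_ring *r, void *buf)
{
	assert(r->prod - r->cons <= RX_SLOTS);
	if (r->prod - r->cons == RX_SLOTS)
		return 0;
	r->slot[r->prod % RX_SLOTS] = buf;
	r->prod++;
	return 1;
}

/* rxeof side: hand back the oldest filled buffer, or NULL if none */
static void *
rx_complete(struct rx_ring *r)
{
	void *buf;

	if (r->cons == r->prod)
		return NULL;
	buf = r->slot[r->cons % RX_SLOTS];
	r->cons++;
	return buf;
}
```

Each index has exactly one writer, which is what lets a design like this drop the per-operation mutex that the two lists needed.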
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
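A sketch of the alignment arithmetic, assuming "9k" means 9 * 1024 bytes and that items sit at successive aligned offsets; `item_stride` and `is_aligned` are hypothetical helpers, not pool(9) functions:

```c
#include <assert.h>
#include <stddef.h>

/* round size up to the next multiple of align (a power of two) */
static size_t
item_stride(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* true if addr sits on an align-byte boundary (align a power of two) */
static int
is_aligned(size_t addr, size_t align)
{
	return (addr & (align - 1)) == 0;
}
```

Under those assumptions a 9k item with 4k alignment occupies a 12k stride, so every item starts on a 4k boundary as long as nothing shifts the first item off one. Item colouring would introduce exactly that kind of shift, which is the risk the message describes.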
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
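One way such a flag can work, sketched with C11 atomics rather than the kernel's atomic.h. The names `refill_gate`, `gate_enter` and `gate_leave` are hypothetical, and this is an assumption about the general technique, not a copy of the driver's code:

```c
#include <assert.h>
#include <stdatomic.h>

struct refill_gate {
	atomic_uint running;	/* 0 = idle, 1 = one worker, >1 = more work requested */
};

/* returns 1 if the caller is the one context allowed to refill */
static int
gate_enter(struct refill_gate *g)
{
	return atomic_fetch_add(&g->running, 1) + 1 == 1;
}

/* returns 1 when the worker may stop, 0 if it must go around again */
static int
gate_leave(struct refill_gate *g)
{
	unsigned int expected = 1;

	/* leaving without a matching enter would be a bug */
	assert(atomic_load(&g->running) >= 1);

	if (atomic_compare_exchange_strong(&g->running, &expected, 0))
		return 1;

	/* someone else asked for a refill meanwhile; do their work too */
	atomic_store(&g->running, 1);
	return 0;
}
```

The intended use is `if (gate_enter(&g)) { do { refill(); } while (!gate_leave(&g)); }`, so a context that loses the race just records its request and returns instead of blocking on a mutex.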
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipackets and if_opackets increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes
ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout
ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@
ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8 yet. amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
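The posting order can be sketched in plain C. This assumes a hypothetical descriptor layout and ring size, and uses a C11 release fence as a stand-in for the bus_space barrier the driver would issue on device memory:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8u	/* assumed ring size */

struct desc {		/* hypothetical descriptor layout */
	uint32_t addr;
	uint32_t flags;
};

/*
 * Post n descriptors starting at slot first. Everything except the
 * first descriptor is written, then one barrier, then the first
 * descriptor goes out last to tell the consumer the chain is complete.
 */
static void
ring_post(struct desc *ring, unsigned int first, const struct desc *src,
    size_t n)
{
	size_t i;

	assert(n <= RING_SLOTS);
	if (n == 0)
		return;

	for (i = 1; i < n; i++)
		ring[(first + i) % RING_SLOTS] = src[i];

	/* stand-in for a single bus_space barrier */
	atomic_thread_fence(memory_order_release);

	ring[first % RING_SLOTS] = src[0];
}
```

A single barrier between the bulk writes and the final write is enough here, because the only ordering that matters is that the first descriptor becomes visible after all the others.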
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
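The accounting rule can be stated as a small function. A minimal sketch, with `tx_desc_count` and `MIN_FRAME_LEN` as hypothetical names; only the 60 byte figure comes from the message:

```c
#include <assert.h>
#include <stddef.h>

#define MIN_FRAME_LEN 60u	/* short frames are padded up to this length */

/*
 * Descriptors a packet consumes on the tx ring: one per DMA segment,
 * plus one pad descriptor when the frame is shorter than 60 bytes.
 */
static unsigned int
tx_desc_count(unsigned int nsegs, size_t pktlen)
{
	assert(nsegs > 0);
	return nsegs + (pktlen < MIN_FRAME_LEN ? 1u : 0u);
}
```

Using the same count when the chain is set up and when it is completed keeps the free count balanced; counting the pad on only one side is what let it drift.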
ok deraadt@
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use
ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers
ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order). it now works on sparc64, too
ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
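The 9018 figure is plain arithmetic. A sketch with locally defined constants (the macro and function names are chosen for this example):

```c
#include <assert.h>

#define ETHER_HDR_LEN	14u	/* dst mac + src mac + ethertype */
#define VLAN_TAG_LEN	4u	/* 802.1Q tag */

static_assert(ETHER_HDR_LEN + VLAN_TAG_LEN == 18u,
    "per-frame overhead on top of the payload");

/* largest frame the rx buffer has to hold for a given payload size */
static unsigned int
max_frame(unsigned int mtu)
{
	return mtu + ETHER_HDR_LEN + VLAN_TAG_LEN;
}
```

With a 9000 byte payload this gives 9018, which does not fit in an 8k buffer and so ends up in the next larger cluster size.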
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combined use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ a while ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
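Read this way, the plain region functions take a count of N-byte elements while the raw variants take the buffer size in bytes, so a byte length has to be converted before it is passed to the former. A sketch of that conversion, with `region_count` as a hypothetical helper that is not part of bus_space(9):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert a buffer length in bytes to the element count expected by a
 * bus_space_write_region_N style call, where width is N (1, 2, 4 or 8).
 */
static size_t
region_count(size_t nbytes, size_t width)
{
	assert(width == 1 || width == 2 || width == 4 || width == 8);
	assert(nbytes % width == 0);
	return nbytes / width;
}
```

Passing a byte length where an element count is expected would write N times too much data, which is the kind of mistake the message refers to.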
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.116 |
|
11-Mar-2022 |
mpi |
Constify struct cfattach.
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.115 |
|
08-Feb-2021 |
mpi |
Simplify sleep_setup API to two operations in preparation for splitting the SCHED_LOCK().
Putting a thread on a sleep queue is reduce to the following:
sleep_setup(); /* check condition or release lock */ sleep_finish();
Previous version ok cheloha@, jmatthew@, ok claudio@
|
#
1.114 |
|
17-Jan-2021 |
dlg |
this hardware is fine with BUS_DMA_64BIT mappings.
this raises performance of tcpbench on an m3000 from ~3kpps and ~8MB/s to ~70kpps and ~191MB/s when transmitting, and ~10kpps and ~15MB/s to ~120kpps and 174MB/s when receiving.
i also tested this on a v245 and an m4000 a while back.
|
#
1.113 |
|
12-Dec-2020 |
jan |
Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.
OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
|
#
1.112 |
|
27-Nov-2020 |
kevlo |
Add initialization of sc_sff_lock rwlock.
ok semarie@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatability wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
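The idea in this message can be sketched in a few lines of C. This is a minimal illustration, not the mbuf code itself: the names `ext_free_register`, `ext_free_call`, `ext_free_record` and the table size are assumptions made up for the example. The point is that the per-buffer data holds only a small integer, and that integer can only ever select a function that was registered beforehand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names and sizes: an illustration, not the mbuf code. */
#define EXT_FREE_MAX 8

typedef void (*ext_free_fn)(void *buf, size_t len);

static ext_free_fn ext_free_table[EXT_FREE_MAX];
static unsigned int ext_free_count;

/* Register a free function once (e.g. at attach time) and get its index. */
unsigned int
ext_free_register(ext_free_fn fn)
{
	assert(fn != NULL);
	assert(ext_free_count < EXT_FREE_MAX);
	ext_free_table[ext_free_count] = fn;
	return ext_free_count++;
}

/*
 * Dispatch by index. A corrupted index can only select a registered
 * function or be rejected; it cannot name an arbitrary address.
 * Returns 0 on success, -1 if the index is out of range.
 */
int
ext_free_call(unsigned int idx, void *buf, size_t len)
{
	if (idx >= ext_free_count)
		return -1;
	ext_free_table[idx](buf, len);
	return 0;
}

/* Example free function that just records the length it was given. */
static size_t last_freed_len;

void
ext_free_record(void *buf, size_t len)
{
	(void)buf;
	last_freed_len = len;
}
```

A driver with a custom free routine would call the register function once and store the returned index in each buffer it hands out.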
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
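As a rough sketch of why the mark gets its own word: once the flag no longer shares storage with unrelated flags, setting or clearing it cannot clobber anything else. The example below uses C11 atomics as a stand-in, and `struct txq` and the `txq_*` names are hypothetical; it is not how struct ifqueue is actually implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queue: the oactive mark lives in a word of its own. */
struct txq {
	atomic_uint oactive;
};

/* Mark the queue as unable to accept more work right now. */
void
txq_set_oactive(struct txq *q)
{
	assert(q != NULL);
	atomic_store(&q->oactive, 1);
}

/* Clear the mark, typically from the transmit completion path. */
void
txq_clr_oactive(struct txq *q)
{
	assert(q != NULL);
	atomic_store(&q->oactive, 0);
}

/* Non-zero if the mark is currently set. */
int
txq_is_oactive(struct txq *q)
{
	assert(q != NULL);
	return atomic_load(&q->oactive) != 0;
}
```

Each operation is a single store or load on a dedicated word, so there is no read-modify-write cycle shared with other flags.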
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
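The "maths" can be illustrated with a small helper. This is a sketch under assumptions, not the driver's exact arithmetic: `ring_free` is a made-up name, both indices are assumed to be kept in the range [0, count), and one slot is assumed to stay unused so that equal indices mean an empty ring.

```c
#include <assert.h>

/*
 * Free descriptors in a ring of `count` slots, computed from the
 * producer and consumer indices alone. No shared counter is updated,
 * so no atomic operation is needed to keep one in sync.
 */
unsigned int
ring_free(unsigned int prod, unsigned int cons, unsigned int count)
{
	unsigned int used;

	assert(count > 1 && prod < count && cons < count);

	/* Slots in flight, taking wrap-around of the producer into account. */
	used = (prod >= cons) ? prod - cons : count - cons + prod;

	/* One slot is kept unused so prod == cons always means "empty". */
	return count - 1 - used;
}
```

With an 8 slot ring, a producer at 5 and a consumer at 2 leaves 4 free descriptors; the same holds after wrapping, with the producer at 1 and the consumer at 6.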
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
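A shadow ring of this kind might look roughly like the following. It is a minimal sketch, not the myx code: `struct shadow_ring`, `ring_fill`, `ring_rxeof` and the slot count are hypothetical, and it relies on the same assumption the message states, namely that only one context fills and only one context completes at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Power of two, so the free-running counters wrap cleanly. */
#define RING_SLOTS 4

struct slot {
	void *buf;		/* stands in for the mbuf and its dmamap */
};

struct shadow_ring {
	struct slot slots[RING_SLOTS];
	unsigned int prod;	/* advanced only by the fill side */
	unsigned int cons;	/* advanced only by the rxeof side */
};

/* Fill side: push a buffer at the producer index. -1 if the ring is full. */
int
ring_fill(struct shadow_ring *r, void *buf)
{
	assert(buf != NULL);
	if (r->prod - r->cons == RING_SLOTS)
		return -1;
	r->slots[r->prod % RING_SLOTS].buf = buf;
	r->prod++;
	return 0;
}

/* Completion side: chase the producer. NULL if nothing is outstanding. */
void *
ring_rxeof(struct shadow_ring *r)
{
	struct slot *s;
	void *buf;

	if (r->cons == r->prod)
		return NULL;
	s = &r->slots[r->cons % RING_SLOTS];
	buf = s->buf;
	s->buf = NULL;
	r->cons++;
	return buf;
}
```

Buffers come back in the order they were posted, which matches hardware that fills the ring in order, and neither side takes a mutex.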
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes that include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
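The alignment requirement itself is easy to state in code. The helpers below are an illustrative sketch only; `RX_ALIGN`, `rx_buf_aligned` and `rx_align_up` are made-up names and are not part of the driver or the pool API.

```c
#include <assert.h>
#include <stdint.h>

/* The 4k boundary described above for rx buffers larger than 4k. */
#define RX_ALIGN 4096u

static_assert((RX_ALIGN & (RX_ALIGN - 1)) == 0,
    "alignment must be a power of two for the mask tricks below");

/* Non-zero if addr sits exactly on a 4k boundary. */
int
rx_buf_aligned(uintptr_t addr)
{
	return (addr & (RX_ALIGN - 1)) == 0;
}

/* Round addr up to the next 4k boundary (unchanged if already aligned). */
uintptr_t
rx_align_up(uintptr_t addr)
{
	return (addr + RX_ALIGN - 1) & ~(uintptr_t)(RX_ALIGN - 1);
}
```

Item colouring shifts items by a small offset within a page, so a buffer at, say, 0x3040 would fail the check even though its page starts at 0x3000. That is the breakage the private pool avoids.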
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipackets and if_opackets increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes
ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout
ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@
ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
amd64 is currently broken cos it has no bus_space_write_raw_region_8, so dont use it there. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
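The posting order described here can be sketched in userland C. This is only a model: a plain array stands in for the chip's ring, a C11 release fence stands in for the bus_space barrier, and `post_descs`, `ring_write` and `write_log` are hypothetical names used to make the order observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8

static uint32_t ring[RING_SIZE];		/* stand-in for device memory */
static unsigned int write_log[RING_SIZE];	/* order in which slots were written */
static unsigned int nwrites;

static void
ring_write(unsigned int idx, uint32_t v)
{
	ring[idx] = v;
	write_log[nwrites++] = idx;
}

/*
 * Post n descriptors starting at slot `first`: write everything except
 * the first descriptor, issue one barrier, then write the first
 * descriptor to signal that the whole chain is ready.
 */
void
post_descs(unsigned int first, const uint32_t *descs, unsigned int n)
{
	unsigned int i;

	assert(n >= 1 && n <= RING_SIZE && first < RING_SIZE);
	nwrites = 0;

	for (i = 1; i < n; i++)
		ring_write((first + i) % RING_SIZE, descs[i]);

	/* One barrier for the whole batch, not one per write. */
	atomic_thread_fence(memory_order_release);

	ring_write(first, descs[0]);
}
```

Posting three descriptors at slot 6 of the 8 slot ring writes slots 7 and 0 first and slot 6 last, so the chain wraps around the end of the ring without needing an extra barrier.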
bus_space use was discussed with krw@ kettenis@ deraadt@
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
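The accounting fix amounts to using the same descriptor count on both sides. A minimal sketch, assuming a hypothetical helper `tx_descs_needed` and a made-up constant name for the 60 byte minimum mentioned above:

```c
#include <assert.h>

/* Minimum frame length the driver must pad to, per the message above. */
#define TX_MIN_LEN 60

/*
 * Descriptors consumed by one packet: one per DMA segment, plus one
 * extra descriptor pointing at zeroed bytes when the frame is short.
 * Using the same function at chain setup and at completion keeps the
 * free count from drifting.
 */
unsigned int
tx_descs_needed(unsigned int nsegs, unsigned int pktlen)
{
	assert(nsegs >= 1);
	return nsegs + (pktlen < TX_MIN_LEN ? 1 : 0);
}
```

A 42 byte frame in a single segment therefore takes two descriptors, while a 1500 byte packet in three segments takes three.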
ok deraadt@
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers
ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order)
it now works on sparc64, too
ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatability wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@ expression pp; expression ipl; expression s, a, o, f, m, p; @@ -pool_init(pp, s, a, o, f, m, p); -pool_setipl(pp, ipl); +pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
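the index arithmetic this message describes can be sketched roughly as below. this is a minimal illustration and not the driver code: ring_free() and its parameters are hypothetical names, and it assumes the producer and consumer are free-running 32 bit counters.

```c
#include <stdint.h>

/*
 * hypothetical helper, not taken from myx(4): number of free
 * descriptors on a ring of `size` slots, given free-running
 * producer and consumer counters.
 */
static inline uint32_t
ring_free(uint32_t prod, uint32_t cons, uint32_t size)
{
	/* unsigned subtraction stays correct when the counters wrap */
	return (size - (prod - cons));
}
```

because the result is derived from two indexes that each have a single writer, no shared free counter has to be updated with interlocked operations.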
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
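a shadow ring of the kind described above might look like the following sketch. the names (shadow_ring, ring_fill, ring_complete, RING_SLOTS) are made up for illustration and are not from the driver, and it assumes the fill and completion paths are each serialised as the message says, so plain loads and stores are enough.

```c
#include <stddef.h>

#define RING_SLOTS	8	/* hypothetical ring size */

struct slot {
	void		*buf;	/* stands in for the mbuf/dmamap pair */
};

struct shadow_ring {
	struct slot	 slots[RING_SLOTS];
	unsigned int	 prod;	/* only advanced by the fill path */
	unsigned int	 cons;	/* only advanced by the completion path */
};

/* fill path: put a buffer in the next free slot, 0 on success */
static int
ring_fill(struct shadow_ring *r, void *buf)
{
	if (r->prod - r->cons == RING_SLOTS)
		return (-1);		/* ring is full */

	r->slots[r->prod % RING_SLOTS].buf = buf;
	r->prod++;
	return (0);
}

/* completion path: take the oldest filled buffer, NULL if empty */
static void *
ring_complete(struct shadow_ring *r)
{
	struct slot *s;
	void *buf;

	if (r->cons == r->prod)
		return (NULL);		/* nothing on the ring */

	s = &r->slots[r->cons % RING_SLOTS];
	buf = s->buf;
	s->buf = NULL;
	r->cons++;
	return (buf);
}
```

buffers come back in the order they were filled, which matches the hardware filling packets in order, and neither path takes a mutex.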
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8: amd64 is currently broken cos it doesnt have it. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
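the ordering described in this message (write everything but the first descriptor, one barrier, then the first descriptor) can be sketched in userland C as below. this is an illustration under assumptions, not the driver code: the descriptor layout and the post_chain() and desc_write() names are hypothetical, and a C11 release fence stands in for the single bus_space barrier.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical descriptor layout, not the real myx(4) one */
struct desc {
	uint32_t	addr;
	uint32_t	flags;
};

static void
desc_write(volatile struct desc *dst, const struct desc *src)
{
	dst->addr = src->addr;
	dst->flags = src->flags;
}

/* post n descriptors starting at ring slot `first` */
static void
post_chain(volatile struct desc *ring, size_t nslots, size_t first,
    const struct desc *chain, size_t n)
{
	size_t i;

	if (n == 0)
		return;

	/* write everything except the first descriptor, no barriers yet */
	for (i = 1; i < n; i++)
		desc_write(&ring[(first + i) % nslots], &chain[i]);

	/* one barrier for the whole batch (stand-in for bus_space_barrier) */
	atomic_thread_fence(memory_order_release);

	/* the first descriptor goes out last and tells the chip to proceed */
	desc_write(&ring[first % nslots], &chain[0]);
}
```

the loop wraps around the end of the ring without needing an extra barrier, which is the simplification the message attributes to covering more address space than was written.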
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.114 |
|
17-Jan-2021 |
dlg |
this hardware is fine with BUS_DMA_64BIT mappings.
this raises performance of tcpbench on an m3000 from ~3kpps and ~8MB/s to ~70kpps and ~191MB/s when transmitting, and ~10kpps and ~15MB/s to ~120kpps and 174MB/s when receiving.
i also tested this on a v245 and an m4000 a while back.
|
#
1.113 |
|
12-Dec-2020 |
jan |
Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.
OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
|
#
1.112 |
|
27-Nov-2020 |
kevlo |
Add initialization of sc_sff_lock rwlock.
ok semarie@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
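a minimal sketch of a shadow ring with a producer index pushed by the fill side and a consumer index chasing it from rxeof. the names and the ring size are assumptions, and the sketch is single threaded: it shows the index handling only, not the memory ordering a real driver has to care about.

```c
#include <stddef.h>

#define SKETCH_RING_SIZE 8	/* assumed size for the example */

struct sketch_ring {
	void		*slots[SKETCH_RING_SIZE];
	unsigned int	 prod;	/* only advanced by the fill side */
	unsigned int	 cons;	/* only advanced by the rxeof side */
};

/* fill side: returns 0 on success, -1 if the ring is full */
static int
sketch_ring_fill(struct sketch_ring *r, void *buf)
{
	if (r->prod - r->cons == SKETCH_RING_SIZE)
		return -1;
	r->slots[r->prod % SKETCH_RING_SIZE] = buf;
	r->prod++;
	return 0;
}

/* rxeof side: returns the oldest buffer, or NULL if the ring is empty */
static void *
sketch_ring_rxeof(struct sketch_ring *r)
{
	void *buf;

	if (r->cons == r->prod)
		return NULL;
	buf = r->slots[r->cons % SKETCH_RING_SIZE];
	r->cons++;
	return buf;
}
```

each index has exactly one writer, which is why no mutex or atomic op is needed as long as only one context fills and only one context runs rxeof.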
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
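the alignment arithmetic behind this can be sketched as below. it assumes "9k" means 9 * 1024 bytes, and the helper names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ALIGN	4096u	/* the 4k boundary from the commit message */
#define SKETCH_ITEM	9216u	/* assumed "9k" item size, 9 * 1024 */

/* true if addr sits on a 4k boundary */
static bool
sketch_is_aligned(uintptr_t addr)
{
	return (addr & (SKETCH_ALIGN - 1)) == 0;
}

/* space one item takes up when every item must start on a 4k boundary */
static uint32_t
sketch_item_stride(void)
{
	return (SKETCH_ITEM + SKETCH_ALIGN - 1) & ~(SKETCH_ALIGN - 1);
}
```

under these assumptions the stride comes out at 12288 bytes, so a 4k aligned 9k item occupies the same space as a 12k cluster. what the private pool adds is the alignment guarantee itself.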
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
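one way such a flag can work, as a sketch only: a counter lets the first context in do the refill, and remembers that somebody else asked while it was busy. the names are hypothetical and C11 atomics stand in for the kernel's atomic.h; this is not the driver's code.

```c
#include <stdatomic.h>

/* returns 1 if the caller is the one context allowed to refill */
static int
sketch_refill_enter(atomic_uint *running)
{
	return atomic_fetch_add(running, 1) == 0;
}

/*
 * returns 1 if the caller can stop, 0 if another context tried to
 * enter in the meantime and the caller should go around again.
 */
static int
sketch_refill_leave(atomic_uint *running)
{
	unsigned int expected = 1;

	if (atomic_compare_exchange_strong(running, &expected, 0))
		return 1;
	/* someone bumped the counter: collapse it and retry the work */
	atomic_store(running, 1);
	return 0;
}
```

a caller would loop along the lines of: if enter succeeds, refill, then repeat the refill until leave returns 1.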
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
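the batching pattern can be sketched as follows. everything here is a stand-in: sketch_lock() only counts how often the "big lock" would be taken, and the list is a plain singly linked list rather than the kernel's mbuf_list.

```c
#include <stddef.h>

struct sketch_pkt {
	struct sketch_pkt *next;
};

struct sketch_list {
	struct sketch_pkt  *head;
	struct sketch_pkt **tail;
};

static unsigned int sketch_lock_count;	/* times the lock was taken */
static unsigned int sketch_delivered;	/* packets handed to the stack */

static void
sketch_list_init(struct sketch_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

static void
sketch_list_enqueue(struct sketch_list *l, struct sketch_pkt *p)
{
	p->next = NULL;
	*l->tail = p;
	l->tail = &p->next;
}

/* stand-ins for taking and releasing the big lock */
static void sketch_lock(void) { sketch_lock_count++; }
static void sketch_unlock(void) { }

static void
sketch_rxeof(struct sketch_pkt *pkts, size_t n)
{
	struct sketch_list l;
	struct sketch_pkt *p;
	size_t i;

	sketch_list_init(&l);
	for (i = 0; i < n; i++)
		sketch_list_enqueue(&l, &pkts[i]);	/* no lock per packet */

	if (l.head == NULL)
		return;

	sketch_lock();					/* once per batch */
	for (p = l.head; p != NULL; p = p->next)
		sketch_delivered++;
	sketch_unlock();
}
```

the list lives on the caller's stack, so building it needs no locking of its own.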
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipackets and if_opackets increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes
ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout
ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@
ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8 for now: amd64 is currently broken cos it doesnt provide it. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
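the posting order described above can be sketched like this. a plain array stands in for the chip's ring, a C11 release fence stands in for the bus_space barrier, and all names and the ring size are assumptions for the example.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING 16	/* assumed ring size */

static unsigned int sketch_barriers;	/* counts barriers issued */

/* stand-in for a bus_space write barrier */
static void
sketch_barrier(void)
{
	atomic_thread_fence(memory_order_release);
	sketch_barriers++;
}

/*
 * post n descriptors starting at slot idx: everything but the first,
 * one barrier, then the first one to tell the chip to proceed.
 */
static void
sketch_post(volatile uint32_t *ring, unsigned int idx,
    const uint32_t *descs, size_t n)
{
	size_t i;

	if (n == 0)
		return;

	for (i = 1; i < n; i++)
		ring[(idx + i) % SKETCH_RING] = descs[i];

	sketch_barrier();	/* a single barrier, not one per write */

	ring[idx % SKETCH_RING] = descs[0];
}
```

the sketch issues one barrier per post no matter how many descriptors are written or whether the writes wrap around the end of the ring.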
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
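a small sketch of the accounting, with a hypothetical helper: the descriptor count for a packet is its dma segments plus one pad descriptor when the frame is shorter than 60 bytes.

```c
#define SKETCH_MIN_FRAME 60u	/* minimum frame length from the commit message */

/*
 * descriptors needed for a packet spread over nsegs dma segments.
 * short frames get one extra descriptor pointing at zeroed bytes.
 */
static unsigned int
sketch_tx_descs(unsigned int pktlen, unsigned int nsegs)
{
	return nsegs + (pktlen < SKETCH_MIN_FRAME ? 1 : 0);
}
```

using one function like this both when the chain is set up and when it is completed keeps the two sides of the free count in agreement.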
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use
ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers
ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order). it now works on sparc64, too.
ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
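the 9018 figure is plain arithmetic, sketched here with made-up macro names. the 14 byte ethernet header and 4 byte vlan tag are the standard sizes.

```c
#define SKETCH_JUMBO_PAYLOAD	9000u	/* payload bytes */
#define SKETCH_ETHER_HDR	14u	/* dst + src + ethertype */
#define SKETCH_VLAN_TAG		4u	/* 802.1Q tag */

/* largest frame off the wire, as described in the commit message */
static unsigned int
sketch_max_frame(void)
{
	return SKETCH_JUMBO_PAYLOAD + SKETCH_ETHER_HDR + SKETCH_VLAN_TAG;
}
```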
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
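as i read the note, one family of functions wants a count of N byte elements and the other wants the size of the buffer in bytes. a hypothetical helper for converting between the two, so a byte length is never passed where an element count is expected:

```c
#include <stddef.h>

/* number of width-byte elements in a buffer of len bytes */
static size_t
sketch_region_count(size_t len, size_t width)
{
	return len / width;
}
```

passing a byte length where an element count is expected would write width times too much data, which is an easy way to break firmware loading.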
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.113 |
|
12-Dec-2020 |
jan |
Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.
OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@
|
#
1.112 |
|
27-Nov-2020 |
kevlo |
Add initialization of sc_sff_lock rwlock.
ok semarie@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, not 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index.
ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same expcet slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
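As a rough userland illustration of "9k items with 4k alignment", the sketch below uses posix_memalign() in place of the kernel pool. The helper names and macros are hypothetical and only mirror the sizes named in the message above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for posix_memalign() */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RX_ITEM_SIZE	(9 * 1024)	/* item size named in the log */
#define RX_ITEM_ALIGN	(4 * 1024)	/* alignment the chip is said to want */

/* userland stand-in for taking one item from an aligned pool */
static void *
rx_item_get(void)
{
	void *p;

	if (posix_memalign(&p, RX_ITEM_ALIGN, RX_ITEM_SIZE) != 0)
		return (NULL);
	return (p);
}

/* true if the buffer starts on the required boundary */
static int
rx_item_aligned(const void *p)
{
	assert(p != NULL);
	return (((uintptr_t)p & (RX_ITEM_ALIGN - 1)) == 0);
}
```

The item size does not need to be a multiple of the alignment here. Only the start address of each item is constrained.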
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
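One way such a flag can be handled with atomic ops is sketched below using C11 stdatomic. This is an assumption about the general technique, not a copy of the driver: refill_gate, gate_enter and gate_leave are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

struct refill_gate {
	atomic_uint running;	/* number of contexts that asked to refill */
};

/* returns 1 if the caller is the only context and should do the refill */
static int
gate_enter(struct refill_gate *g)
{
	return (atomic_fetch_add(&g->running, 1) == 0);
}

/*
 * only called by the context that won gate_enter().
 * returns 1 if nobody else asked for a refill while we were working.
 * otherwise collapses the pending requests into one and returns 0, so
 * the caller goes around again on their behalf.
 */
static int
gate_leave(struct refill_gate *g)
{
	unsigned int one = 1;

	assert(atomic_load(&g->running) >= 1);

	if (atomic_compare_exchange_strong(&g->running, &one, 0))
		return (1);

	atomic_store(&g->running, 1);
	return (0);
}
```

A caller would do the refill work in a loop of the form "if (gate_enter(g)) do { refill(); } while (!gate_leave(g));", so a request that arrives mid-refill is never lost and never runs concurrently.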
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
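The batching pattern can be sketched in userland as follows. The lock is faked with a counter so the effect is visible, and pkt, pkt_list and rx_batch are hypothetical stand-ins for mbufs, an mbuf_list and the rxeof loop.

```c
#include <assert.h>
#include <stddef.h>

struct pkt {
	struct pkt *next;
	int id;
};

/* simple tail queue of packets, in the spirit of an mbuf_list */
struct pkt_list {
	struct pkt *head;
	struct pkt **tailp;
	unsigned int len;
};

static unsigned int biglock_takes;	/* how often the fake big lock was taken */
static unsigned int delivered;		/* packets handed to the "stack" */

static void biglock_enter(void) { biglock_takes++; }
static void biglock_leave(void) { }

static void
pl_init(struct pkt_list *pl)
{
	pl->head = NULL;
	pl->tailp = &pl->head;
	pl->len = 0;
}

static void
pl_enqueue(struct pkt_list *pl, struct pkt *p)
{
	assert(p != NULL);
	p->next = NULL;
	*pl->tailp = p;
	pl->tailp = &p->next;
	pl->len++;
}

/* collect packets without the lock, then deliver under one lock hold */
static void
rx_batch(struct pkt *ring, size_t n)
{
	struct pkt_list pl;
	struct pkt *p;
	size_t i;

	pl_init(&pl);
	for (i = 0; i < n; i++)
		pl_enqueue(&pl, &ring[i]);

	if (pl.len == 0)
		return;

	biglock_enter();
	for (p = pl.head; p != NULL; p = p->next)
		delivered++;
	biglock_leave();
}
```

The list lives on the caller's stack, so building it needs no locking of its own.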
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipackets and if_opackets increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
amd64 is currently broken cos it has no bus_space_write_raw_region_8, so dont use it. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
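The posting order described in this entry can be sketched in plain C. This is a userland approximation only: atomic_thread_fence() stands in for the single bus_space barrier, and desc, ring_post and the ring size are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TX_RING_SLOTS 8		/* hypothetical ring size */

struct desc {
	uint32_t addr;
	uint32_t flags;
};

/*
 * copy n descriptors into the ring starting at slot start, wrapping at
 * the end. everything except the first is written, then one barrier,
 * then the first, which is what tells the consumer to go ahead.
 */
static void
ring_post(struct desc *ring, unsigned int start, const struct desc *d,
    unsigned int n)
{
	unsigned int i;

	assert(n > 0 && n <= TX_RING_SLOTS);

	/* write write write ... */
	for (i = 1; i < n; i++)
		ring[(start + i) % TX_RING_SLOTS] = d[i];

	/* ... one barrier, standing in for a bus_space write barrier ... */
	atomic_thread_fence(memory_order_release);

	/* ... and only then the first descriptor */
	ring[start % TX_RING_SLOTS] = d[0];
}
```

In this sketch a chain that wraps past the end of the ring still costs exactly one fence, which matches the point made above about not issuing a barrier per write.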
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
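The accounting rule in this entry comes down to using the same descriptor count at setup and at completion. A small sketch, with hypothetical helper names and the 60 byte minimum taken from the message above:

```c
#include <assert.h>

#define ETHER_MIN_NOCRC	60	/* minimum frame length named in the log */

/*
 * descriptors one packet occupies on the tx ring: one per dma segment,
 * plus one pointing at zeroed bytes if the frame is short.
 */
static unsigned int
tx_descs_needed(unsigned int nsegs, unsigned int pktlen)
{
	assert(nsegs > 0);
	return (nsegs + (pktlen < ETHER_MIN_NOCRC ? 1 : 0));
}

/* bytes of padding the extra descriptor has to cover */
static unsigned int
tx_pad_len(unsigned int pktlen)
{
	return (pktlen < ETHER_MIN_NOCRC ? ETHER_MIN_NOCRC - pktlen : 0);
}
```

If the setup path subtracts tx_descs_needed() from the free count and the completion path adds the same value back, the count cannot drift the way the message describes.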
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.112 |
|
27-Nov-2020 |
kevlo |
Add initialization of sc_sff_lock rwlock.
ok semarie@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
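The "some maths" can be as small as the sketch below. It assumes free-running unsigned producer and consumer counters that count descriptors (not packets), and a hypothetical ring size. The real driver's variables and layout may differ.

```c
#include <assert.h>

#define TX_RING_DESCS 128	/* hypothetical number of tx descriptors */

/*
 * free descriptors on the ring, given free-running counters.
 * prod is only advanced by the start routine, cons only by txeof,
 * so start can read cons once and work from a possibly stale (and
 * therefore conservative) value without any interlocked ops.
 */
static unsigned int
tx_ring_free(unsigned int prod, unsigned int cons)
{
	unsigned int used = prod - cons;	/* unsigned wrap is fine */

	assert(used <= TX_RING_DESCS);
	return (TX_RING_DESCS - used);
}
```

A stale consumer value can only under-report the free space, so the worst case in this sketch is that the start routine stops a little early and is run again later.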
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scrach on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to ketenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove uneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offerred to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
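
The posting order described in 1.34 can be sketched in userland C. This is only an illustration under stated assumptions: `struct trace`, `trace_add()` and `post_descriptors()` are hypothetical names, and the trace array stands in for real bus_space writes and barriers, which cannot run outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-ins: a log of operations instead of real bus_space calls */
enum op { OP_WRITE, OP_BARRIER };

struct trace {
	enum op ops[64];
	int     slots[64];	/* ring index for OP_WRITE, -1 for OP_BARRIER */
	size_t  n;
};

static void
trace_add(struct trace *t, enum op op, int slot)
{
	assert(t->n < 64);
	t->ops[t->n] = op;
	t->slots[t->n] = slot;
	t->n++;
}

/* post `count` descriptors starting at ring index `first` on a ring of `ring_size` */
static void
post_descriptors(struct trace *t, int first, int count, int ring_size)
{
	int i;

	assert(count >= 1);

	/* write every descriptor except the first, with no barrier in between */
	for (i = 1; i < count; i++)
		trace_add(t, OP_WRITE, (first + i) % ring_size);

	/* a single barrier orders all of the writes above... */
	trace_add(t, OP_BARRIER, -1);

	/* ...before the first descriptor, which signals the chip to proceed */
	trace_add(t, OP_WRITE, first);
}
```

Posting three descriptors at index 6 of an 8 slot ring records writes to slots 7 and 0, one barrier, and then the write to slot 6, so wrapping around the end of the ring costs no extra barrier.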
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
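
The accounting bug fixed in 1.29 can be illustrated with a small sketch. The names here (`tx_desc_count()`, `tx_setup()`, `tx_complete()`, `struct tx_ring`) are hypothetical and not taken from the driver. The point is only that setup and completion must count the pad descriptor the same way, or the free count drifts.

```c
#include <assert.h>

#define MIN_FRAME_LEN 60	/* short frames are padded to 60 bytes, per the log */

struct tx_ring {
	unsigned int free;	/* free tx descriptors */
};

/* descriptors one packet occupies: one per dma segment, plus a pad if short */
static unsigned int
tx_desc_count(unsigned int nsegs, unsigned int pktlen)
{
	return nsegs + (pktlen < MIN_FRAME_LEN ? 1 : 0);
}

/* take descriptors for a packet; returns 0 if the ring has no room */
static int
tx_setup(struct tx_ring *r, unsigned int nsegs, unsigned int pktlen)
{
	unsigned int n = tx_desc_count(nsegs, pktlen);

	if (n > r->free)
		return 0;
	r->free -= n;
	return 1;
}

/* give back exactly what tx_setup() took, pad descriptor included */
static void
tx_complete(struct tx_ring *r, unsigned int nsegs, unsigned int pktlen)
{
	r->free += tx_desc_count(nsegs, pktlen);
}
```

Because both sides go through the same counting helper, a 42 byte single segment packet takes two descriptors on setup and returns two on completion.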
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by.. RAMDISK - 2176 bytes RAMDISKB - 1504 bytes RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by.. RAMDISK - 1024 bytes RAMDISKB - 1120 bytes RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.111 |
|
17-Jul-2020 |
dlg |
name the rx rings so systat mb shows them.
|
#
1.110 |
|
17-Jul-2020 |
dlg |
add kstats to myx.
myx is unusually minimal, so there's not a lot of information that the chip provides. the most interesting is the number of packets the chip drops cos of a lack of space on the rx rings.
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
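
The kind of maths involved can be sketched as follows. This is a hedged illustration, not the driver's code: `ring_free()` is a hypothetical helper, and it assumes the producer and consumer are free-running unsigned counters, which the log does not state.

```c
#include <assert.h>

/*
 * free descriptors on a ring of `size` slots.
 * assumption: prod and cons are free-running unsigned counters, so the
 * unsigned subtraction gives the in-use count even after they wrap.
 */
static unsigned int
ring_free(unsigned int prod, unsigned int cons, unsigned int size)
{
	unsigned int used = prod - cons;

	assert(used <= size);
	return size - used;
}
```

Only the side that owns each counter writes it, so a reader needs a plain load of the other side's counter rather than an interlocked update of a shared free count.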
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
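
A minimal sketch of a shadow ring with a producer chased by a consumer, assuming a power-of-two slot count and exactly one filler and one completer. `struct shadow_ring`, `ring_fill()` and `ring_rxeof()` are hypothetical names for illustration, and the `void *` buffer stands in for an mbuf plus dmamap.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 4	/* assumed power of two so counter wraparound stays consistent */

struct slot {
	void *buf;	/* stand-in for the mbuf/dmamap pair */
};

struct shadow_ring {
	struct slot  slots[RING_SLOTS];
	unsigned int prod;	/* advanced only by the fill side */
	unsigned int cons;	/* advanced only by the rxeof side */
};

/* fill side: put a buffer on the ring; returns 0 if the ring is full */
static int
ring_fill(struct shadow_ring *r, void *buf)
{
	if (r->prod - r->cons == RING_SLOTS)
		return 0;
	r->slots[r->prod % RING_SLOTS].buf = buf;
	r->prod++;
	return 1;
}

/* rxeof side: take the oldest buffer, in order; returns NULL if the ring is empty */
static void *
ring_rxeof(struct shadow_ring *r)
{
	void *buf;

	if (r->cons == r->prod)
		return NULL;
	buf = r->slots[r->cons % RING_SLOTS].buf;
	r->slots[r->cons % RING_SLOTS].buf = NULL;
	r->cons++;
	return buf;
}
```

Each index has a single writer, which is why no mutex or atomic op appears on either path. The sketch leaves out the memory ordering a real multi-cpu driver would need to consider.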
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
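
The fix in 1.71 amounts to handing back any ring slots that were reserved but not filled. The sketch below shows that pattern with hypothetical stand-ins: `struct rxr`, `rxr_get()` and `rxr_put()` are simplified models written for this example, not the kernel's if_rxr implementation, and `allocs_ok` simulates how many buffer allocations succeed.

```c
#include <assert.h>

/* simplified model of rx ring slot accounting (hypothetical, for illustration) */
struct rxr {
	unsigned int alive;	/* slots currently reserved or on the ring */
	unsigned int limit;	/* most slots we may use */
};

/* reserve up to `want` slots; returns how many were granted */
static unsigned int
rxr_get(struct rxr *r, unsigned int want)
{
	unsigned int avail = r->limit - r->alive;
	unsigned int n = want < avail ? want : avail;

	r->alive += n;
	return n;
}

/* return `n` unused slots */
static void
rxr_put(struct rxr *r, unsigned int n)
{
	assert(n <= r->alive);
	r->alive -= n;
}

/* fill the ring; `allocs_ok` is how many allocations succeed before the pool is empty */
static unsigned int
rx_fill(struct rxr *r, unsigned int want, unsigned int allocs_ok)
{
	unsigned int slots = rxr_get(r, want);
	unsigned int filled = slots < allocs_ok ? slots : allocs_ok;

	/* the step that matters: give back what we reserved but could not fill */
	rxr_put(r, slots - filled);
	return filled;
}
```

If even the first allocation fails, `rx_fill()` returns 0 and the accounting goes back to where it started, so a later refill attempt can still reserve slots.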
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8 yet: amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
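the posting order described in this entry can be sketched as below. this is only an illustration under stated assumptions: the ring is plain memory rather than device registers, ring_barrier() is a hypothetical stand-in for one bus_space barrier that just counts calls, and struct desc, RING_SLOTS and ring_post() are made-up names, not the driver's.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8	/* hypothetical ring size */

struct desc {
	uint32_t addr;
	uint16_t len;
	uint16_t flags;
};

struct ring {
	struct desc slot[RING_SLOTS];	/* stand-in for the chip's ring memory */
	unsigned int barriers;		/* barriers issued, counted for illustration */
};

/* hypothetical stand-in for a single bus_space barrier */
static void
ring_barrier(struct ring *r)
{
	r->barriers++;
}

/*
 * post n descriptors starting at index prod.
 * the caller is assumed to have checked that n <= RING_SLOTS.
 */
static void
ring_post(struct ring *r, unsigned int prod, const struct desc *d, size_t n)
{
	size_t i;

	if (n == 0)
		return;

	/* write, write, write: everything except the first descriptor */
	for (i = 1; i < n; i++)
		r->slot[(prod + i) % RING_SLOTS] = d[i];

	/* one barrier so the tail is visible before the head */
	ring_barrier(r);

	/* the first descriptor goes out last and signals the chip to proceed */
	r->slot[prod % RING_SLOTS] = d[0];
}
```

posting three descriptors at index 6 of this 8 slot ring fills slots 7 and 0 first, issues one barrier, and only then writes slot 6. whether a trailing barrier is wanted after the head write is not stated in the entry, so the sketch leaves it out.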
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
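a minimal sketch of the accounting rule this entry describes, assuming a chain uses one descriptor per dma segment plus one pad descriptor for frames shorter than 60 bytes. tx_chain_descs() and MIN_FRAME_LEN are hypothetical names for illustration; the point is that setup and completion must both use the same count.

```c
#define MIN_FRAME_LEN 60	/* short frames are padded to this length */

/*
 * number of tx descriptors a packet consumes: one per dma segment,
 * plus one pad descriptor when the frame is shorter than 60 bytes.
 */
static unsigned int
tx_chain_descs(unsigned int nsegs, unsigned int pktlen)
{
	unsigned int n = nsegs;

	if (pktlen < MIN_FRAME_LEN)
		n++;	/* the descriptor pointing at zeroed pad bytes */

	return n;
}
```

if the setup side subtracted only nsegs while the completion side added back this full count, the free counter would drift upward by one for every short frame, which matches the failure described above.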
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
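the numbers in this entry can be written out as a small sketch. the macro and function names below are hypothetical; the 14 byte ethernet header and 4 byte vlan tag are the standard sizes, and the 4k alignment test reflects the requirement as this entry states it.

```c
#include <stdint.h>

#define JUMBO_PAYLOAD	9000	/* jumbo payload size */
#define ETHER_HDR	14	/* dst + src + ethertype */
#define VLAN_TAG	4	/* 802.1Q tag */
#define RX_ALIGN	4096	/* 4k boundary for a single jumbo rx descriptor */

/* largest frame off the wire: 9000 + ether header + vlan tag */
static unsigned int
jumbo_wire_len(void)
{
	return JUMBO_PAYLOAD + ETHER_HDR + VLAN_TAG;
}

/* nonzero if a physical address starts on a 4k boundary */
static int
rx_buf_aligned(uint64_t paddr)
{
	return (paddr & (RX_ALIGN - 1)) == 0;
}
```

a 9018 byte frame does not fit in an 8k buffer, which is one reason a larger cluster pool would be the natural source for these buffers.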
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.109 |
|
10-Jul-2020 |
patrick |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
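the "some maths" can be sketched like this, assuming the producer and consumer indexes are free running 32 bit counters so that unsigned subtraction gives the number of descriptors in flight even after the counters wrap. tx_ring_free() and its parameters are hypothetical names, and the real driver may hold back a slot or index differently.

```c
#include <stdint.h>

/*
 * free descriptors on a ring of nslots entries, computed from the
 * producer index (owned by start) and the consumer index (advanced
 * by txeof). no shared free counter, so no interlocked ops.
 */
static uint32_t
tx_ring_free(uint32_t prod, uint32_t cons, uint32_t nslots)
{
	uint32_t used = prod - cons;	/* wraps correctly modulo 2^32 */

	return nslots - used;
}
```

start only needs to read the consumer index once per call. a stale read can only under-report free space, which is safe.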
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
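a minimal sketch of a shadow ring with a producer index chased by a consumer index, assuming a single filler and a single completer as this entry describes. the names (shadow_ring, ring_fill, ring_rxeof, NSLOTS) are hypothetical, an int stands in for the mbuf and dmamap, and NSLOTS is assumed to be a power of two so the modulo stays consistent when the counters wrap.

```c
#include <stdint.h>

#define NSLOTS 4	/* hypothetical ring size, power of two */

struct slot {
	int pkt;	/* stand-in for the mbuf + dmamap pair */
};

struct shadow_ring {
	struct slot slots[NSLOTS];
	uint32_t prod;	/* only advanced by the fill side */
	uint32_t cons;	/* only advanced by the rxeof side */
};

/* fill side: push one packet, returns 0 if the ring is full */
static int
ring_fill(struct shadow_ring *r, int pkt)
{
	if (r->prod - r->cons == NSLOTS)
		return 0;

	r->slots[r->prod % NSLOTS].pkt = pkt;
	r->prod++;
	return 1;
}

/* rxeof side: pop the oldest packet, returns 0 if the ring is empty */
static int
ring_rxeof(struct shadow_ring *r, int *pkt)
{
	if (r->cons == r->prod)
		return 0;

	*pkt = r->slots[r->cons % NSLOTS].pkt;
	r->cons++;
	return 1;
}
```

because the hardware completes packets in order, the consumer always finds the next completed packet at its own index. no list manipulation and no mutex is needed as long as each index has exactly one writer.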
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensly read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers. - Clear IFF_ALLMULTI flag early and at the top of myx_iff(). - Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let its go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants use to write all the descriptors except the first, barrier, and then write the first one out to signale that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
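The 9018 byte figure above can be checked with a few constants. This is only an arithmetic sketch: the macro and function names are hypothetical, and the 14 byte ethernet header and 4 byte 802.1Q tag are the standard sizes rather than values read from the driver.

```c
#include <assert.h>

#define JUMBO_PAYLOAD	9000u	/* payload size named in the message */
#define ETHER_HDR_BYTES	14u	/* dst + src + ethertype */
#define VLAN_TAG_BYTES	4u	/* 802.1Q tag */

/* largest frame the message says the hardware takes off the wire */
static unsigned int
max_wire_len(void)
{
	unsigned int len = JUMBO_PAYLOAD + ETHER_HDR_BYTES + VLAN_TAG_BYTES;

	assert(len == 9018u);
	return (len);
}
```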
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
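The count-versus-size distinction mentioned above is easy to get wrong, so here is a userland sketch of it. The two functions are mocks written for this example (they are not the bus_space API and take plain pointers instead of a tag and handle). One takes a count of 32-bit elements, the other takes a buffer size in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* mock of a "region" style write: count is a number of 32-bit elements */
static void
mock_write_region_4(uint32_t *dst, const uint32_t *src, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		dst[i] = src[i];
}

/* mock of a "raw region" style write: size is a number of bytes */
static void
mock_write_raw_region_4(uint8_t *dst, const uint8_t *src, size_t size)
{
	assert(size % 4 == 0);	/* still moved in 4 byte units */
	memcpy(dst, src, size);
}
```

For the same buffer the first call wants sizeof(buf) / sizeof(buf[0]) and the second wants sizeof(buf). Passing the byte size to the element-count variant would write four times too much.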
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.108 |
|
03-Jul-2019 |
dlg |
use ifiq_input return values to apply backpressure to rings.
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
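A sketch of the kind of maths this refers to, assuming free-running 32-bit producer and consumer counters and a hypothetical ring size. The names are invented for illustration and the driver's actual fields may differ. Because the start routine only reads the consumer counter and never writes it, no shared free count needs an interlocked update.

```c
#include <assert.h>
#include <stdint.h>

#define TX_RING_SIZE	256u	/* hypothetical number of tx descriptors */

/*
 * free descriptor slots, derived from the two counters. unsigned
 * subtraction gives the right answer even after the counters wrap.
 */
static uint32_t
tx_ring_free(uint32_t prod, uint32_t cons)
{
	uint32_t used = prod - cons;	/* descriptors the chip still owns */

	assert(used <= TX_RING_SIZE);
	return (TX_RING_SIZE - used);
}
```

Both counters have to count descriptors, not packets, for the result to be usable as descriptor space.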
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
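The shadow ring idea can be sketched in userland as follows. This is an illustration under assumptions, not the driver's code: the struct and function names are hypothetical, a void pointer stands in for the mbuf and dmamap pair, and the ring size is arbitrary. It relies on the property stated above that only one context fills and only one context completes at a time.

```c
#include <assert.h>
#include <stddef.h>

#define RX_SLOTS	8u	/* hypothetical ring size */

struct rx_slot {
	void *buf;		/* stands in for the mbuf + dmamap pair */
};

struct rx_ring {
	struct rx_slot slots[RX_SLOTS];
	unsigned int prod;	/* advanced only by the fill path */
	unsigned int cons;	/* advanced only by the completion path */
};

/* fill path: attach a buffer to the next slot; -1 if the ring is full */
static int
rx_ring_put(struct rx_ring *r, void *buf)
{
	assert(buf != NULL);
	if (r->prod - r->cons == RX_SLOTS)
		return (-1);
	r->slots[r->prod % RX_SLOTS].buf = buf;
	r->prod++;
	return (0);
}

/* completion path: take the oldest buffer, or NULL if the ring is empty */
static void *
rx_ring_get(struct rx_ring *r)
{
	struct rx_slot *s;
	void *buf;

	if (r->cons == r->prod)
		return (NULL);
	s = &r->slots[r->cons % RX_SLOTS];
	buf = s->buf;
	s->buf = NULL;
	r->cons++;
	return (buf);
}
```

Buffers come back in the order they were put on, matching hardware that fills the ring in order.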
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
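As a userland analogy only, the same "9k item, 4k alignment" constraint can be expressed with posix_memalign. The kernel uses its pool allocator rather than this call, and the macro and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RX_BUF_SIZE	(9u * 1024u)	/* item size from the message */
#define RX_BUF_ALIGN	4096u		/* alignment the chip is said to need */

/* allocate one rx buffer that starts on a 4k boundary, or NULL */
static void *
rx_buf_alloc(void)
{
	void *p = NULL;

	if (posix_memalign(&p, RX_BUF_ALIGN, RX_BUF_SIZE) != 0)
		return (NULL);
	assert(((uintptr_t)p & (RX_BUF_ALIGN - 1)) == 0);
	return (p);
}
```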
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
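A small sketch of the batching pattern described above, with mocks in place of the kernel pieces. The pkt and pkt_list types are hypothetical stand-ins for mbufs and an mbuf_list, and mock_lock only counts how often it is taken so the effect is visible.

```c
#include <assert.h>
#include <stddef.h>

struct pkt {
	struct pkt *next;
};

struct pkt_list {
	struct pkt *head;
	struct pkt **tailp;
};

static unsigned int lock_takes;	/* times the mock lock was taken */
static unsigned int delivered;	/* packets handed to the mock stack */

static void mock_lock(void) { lock_takes++; }
static void mock_unlock(void) { }

static void
pkt_list_init(struct pkt_list *l)
{
	l->head = NULL;
	l->tailp = &l->head;
}

/* called once per received packet, without the lock held */
static void
pkt_list_enqueue(struct pkt_list *l, struct pkt *p)
{
	assert(p != NULL);
	p->next = NULL;
	*l->tailp = p;
	l->tailp = &p->next;
}

/* called once after the rx loop: one lock round trip for the whole batch */
static void
pkt_list_deliver(struct pkt_list *l)
{
	struct pkt *p;

	if (l->head == NULL)
		return;
	mock_lock();
	while ((p = l->head) != NULL) {
		l->head = p->next;
		delivered++;	/* stands in for bpf and ether_input */
	}
	mock_unlock();
	pkt_list_init(l);
}
```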
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8 yet. amd64 is currently broken cos it doesnt have it. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
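The posting order described above can be sketched in plain C. This is an illustration under assumptions: the ring is ordinary memory rather than device space, the descriptor layout and names are hypothetical, and a C11 release fence stands in for the bus_space barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE	8u	/* hypothetical number of descriptors */

struct desc {
	uint32_t addr;
	uint32_t len;
	uint32_t flags;
};

/* stand-in for a bus_space write barrier */
static void
ring_write_barrier(void)
{
	atomic_thread_fence(memory_order_release);
}

/* post n descriptors starting at slot start, first descriptor last */
static void
ring_post(struct desc *ring, unsigned int start, const struct desc *d,
    unsigned int n)
{
	unsigned int i;

	assert(n > 0 && n <= RING_SIZE);

	/* write write write: everything except the first, no barriers */
	for (i = 1; i < n; i++)
		ring[(start + i) % RING_SIZE] = d[i];

	/* one barrier so the first descriptor cannot pass the others */
	ring_write_barrier();

	/* the first descriptor goes out last and tells the chip to proceed */
	ring[start % RING_SIZE] = d[0];
	ring_write_barrier();
}
```

The loop wraps around the end of the ring with a modulo, so a chain that straddles the end still gets a single barrier before the first descriptor is written.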
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs. 2. i was setting the watermarks on the rx ring before knowhing how big they were. 3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been supersceded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by.. RAMDISK - 2176 bytes RAMDISKB - 1504 bytes RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by.. RAMDISK - 1024 bytes RAMDISKB - 1120 bytes RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combined use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ a while ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|
#
1.107 |
|
16-Apr-2019 |
dlg |
i2c reads are more reliable a byte at a time.
reading all 256 at a time was a nice idea, but meant page 0xa2 wasnt appearing like it should. this follows what freebsd does more closely too.
|
#
1.106 |
|
16-Apr-2019 |
dlg |
make sff page reads work on little endian archs too. like amd64.
some modules seem to need more time when waiting for bytes while here.
hrvoje popovski hit the endian issue
|
#
1.105 |
|
15-Apr-2019 |
dlg |
implement SIOCGIFSFFPAGE so ifconfig can get transceiver info.
myx doesn't allow i2c writes, so you can only read whatever page the firmware is already pointing at on device 0xa0. if you try to read another page it will return ENXIO.
tested on a 10G-PCIE-8A-R with an xfp module.
|
#
1.104 |
|
15-Apr-2019 |
dlg |
trim some debug code that printed out the name of a command
the list of commands is going to grow, but the thought of keeping the list in debug code up to date with it just makes me feel tired.
this prints the command id number instead in the same format we represent it in the header.
|
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
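the index-instead-of-pointer idea can be sketched in userland C. this is an illustration only, assuming a small fixed table; every `sk_*` name is hypothetical and the real mbuf interface has a different signature:

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_FREE_FNS	4u

typedef void (*sk_free_fn)(void *);

static sk_free_fn sk_free_fns[SK_MAX_FREE_FNS];
static unsigned int sk_nfree_fns;

/* register a free function once, at attach time; returns its index */
static unsigned int
sk_free_register(sk_free_fn fn)
{
	assert(fn != NULL);
	assert(sk_nfree_fns < SK_MAX_FREE_FNS);
	sk_free_fns[sk_nfree_fns] = fn;
	return (sk_nfree_fns++);
}

/* buffers carry only the index, so the call target is always a registered function */
static void
sk_free_call(unsigned int idx, void *buf)
{
	assert(idx < sk_nfree_fns);
	sk_free_fns[idx](buf);
}

/* example free function: counts how many times it was called */
static unsigned int sk_free_count;

static void
sk_count_free(void *buf)
{
	(void)buf;
	sk_free_count++;
}
```

a driver would call `sk_free_register()` once and store the returned index in each buffer it hands out. an overwritten index can at worst select another registered function, not an arbitrary address.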
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
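the "some maths" can be sketched like this, assuming free-running 32 bit producer and consumer indices that count descriptors (not packets) and a hypothetical ring of 8 slots. `SK_RING_SLOTS` and `sk_ring_free()` are illustrative names, not the driver's:

```c
#include <assert.h>
#include <stdint.h>

#define SK_RING_SLOTS	8u	/* hypothetical ring size, in descriptors */

/*
 * free descriptors on the ring. both indices only ever increase and
 * are allowed to wrap; unsigned subtraction still gives the number in use.
 */
static uint32_t
sk_ring_free(uint32_t prod, uint32_t cons)
{
	uint32_t used = (uint32_t)(prod - cons);

	assert(used <= SK_RING_SLOTS);
	return (SK_RING_SLOTS - used);
}
```

only the start routine advances the producer and only the completion path advances the consumer, so in this sketch neither side needs an atomic read-modify-write on a shared free counter.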
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
there's more to be done, but this is a good first step.
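a shadow ring of this shape can be sketched as below. it is a single-threaded illustration with hypothetical `sk_*` names and a 4 slot ring; real code running fill and rxeof in different contexts also has to think about memory ordering, which this sketch ignores:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_SLOTS	4u

struct sk_slot {
	void	*buf;		/* stands in for the mbuf and dmamap */
};

struct sk_shadow {
	struct sk_slot	slots[SK_SLOTS];
	uint32_t	prod;	/* advanced only by the fill side */
	uint32_t	cons;	/* advanced only by the rxeof side */
};

static void
sk_shadow_init(struct sk_shadow *r)
{
	memset(r, 0, sizeof(*r));
}

/* fill side: returns 1 if the buffer was put on the ring, 0 if it is full */
static int
sk_fill(struct sk_shadow *r, void *buf)
{
	if ((uint32_t)(r->prod - r->cons) == SK_SLOTS)
		return (0);
	r->slots[r->prod % SK_SLOTS].buf = buf;
	r->prod++;
	return (1);
}

/* rxeof side: buffers complete in the order they were posted */
static void *
sk_rxeof(struct sk_shadow *r)
{
	struct sk_slot *s;
	void *buf;

	if (r->cons == r->prod)
		return (NULL);
	s = &r->slots[r->cons % SK_SLOTS];
	buf = s->buf;
	s->buf = NULL;
	r->cons++;
	return (buf);
}
```

each index has exactly one writer, which is the property the commit message relies on to drop the mutexes.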
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
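a userland analogue of "9k items, 4k aligned", assuming POSIX `posix_memalign()` in place of the kernel pool. `sk_cluster_get()` and the `SK_*` macros are hypothetical, and a userland allocation guarantees nothing about physical contiguity:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_CLUSTER_SIZE		(9u * 1024u)	/* item size */
#define SK_CLUSTER_ALIGN	4096u		/* every item starts on a 4k boundary */

/* returns a 9k buffer aligned to 4k, or NULL on failure; release with free() */
static void *
sk_cluster_get(void)
{
	void *p;

	if (posix_memalign(&p, SK_CLUSTER_ALIGN, SK_CLUSTER_SIZE) != 0)
		return (NULL);
	return (p);
}
```

the point is that the alignment is a property of the allocator rather than a side effect of rounding the item size up to 12k.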
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
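one way to restrict a refill to a single context with atomic ops looks roughly like this. it is a sketch using C11 `<stdatomic.h>` in place of the kernel's atomic.h, with hypothetical `sk_*` names, and is not claimed to be the code that was committed:

```c
#include <assert.h>
#include <stdatomic.h>

struct sk_ring_lock {
	atomic_uint	running;	/* 0 idle, 1 one worker, >1 more work was requested */
};

static void
sk_ring_lock_init(struct sk_ring_lock *rl)
{
	atomic_init(&rl->running, 0);
}

/* returns 1 if the caller is now the only context allowed to refill */
static int
sk_ring_enter(struct sk_ring_lock *rl)
{
	return (atomic_fetch_add(&rl->running, 1) == 0);
}

/* returns 1 if the caller may stop, 0 if someone asked for another pass */
static int
sk_ring_leave(struct sk_ring_lock *rl)
{
	unsigned int expected = 1;

	if (atomic_compare_exchange_strong(&rl->running, &expected, 0))
		return (1);

	/* another context tried to enter while we worked: go around again */
	atomic_store(&rl->running, 1);
	return (0);
}
```

a caller would wrap the refill as `if (sk_ring_enter(rl)) { do { refill(); } while (!sk_ring_leave(rl)); }`, so a context that loses the race returns immediately and the winner does the extra pass on its behalf.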
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
dont use bus_space_write_raw_region_8 for now: amd64 is currently broken cos it doesnt have one.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
bus_space use was discussed with krw@ kettenis@ deraadt@
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
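the posting order can be sketched in userland C. this is an analogy only: `atomic_thread_fence()` stands in for the bus_space barrier and a plain array stands in for the ring in device memory, so it shows the ordering and not real device i/o. all `sk_*` names and the 8 slot ring are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SK_RING_SLOTS	8u

struct sk_ring {
	uint32_t	slots[SK_RING_SLOTS];
	size_t		last_written;	/* records write order, for the sketch only */
};

static void
sk_ring_write(struct sk_ring *r, size_t slot, uint32_t desc)
{
	r->slots[slot % SK_RING_SLOTS] = desc;
	r->last_written = slot % SK_RING_SLOTS;
}

/* post n descriptors starting at slot first: tail, then barrier, then head */
static void
sk_ring_post(struct sk_ring *r, size_t first, const uint32_t *descs, size_t n)
{
	size_t i;

	assert(n > 0 && n <= SK_RING_SLOTS);

	/* everything except the first descriptor goes out first */
	for (i = 1; i < n; i++)
		sk_ring_write(r, first + i, descs[i]);

	/* make the tail visible before the head */
	atomic_thread_fence(memory_order_release);

	/* the first descriptor is the signal to go ahead */
	sk_ring_write(r, first, descs[0]);
}
```

because the consumer only starts when it sees the first descriptor, one barrier per chain is enough in this sketch, instead of one per descriptor.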
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
ok deraadt@
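the accounting rule is that setup and completion must agree on the descriptor count, pad included. a minimal sketch with hypothetical `sk_*` names, assuming the 60 byte minimum frame length mentioned above:

```c
#include <assert.h>

#define SK_ETHER_MIN_NOCRC	60u	/* shortest frame the chip accepts unpadded */

struct sk_txring {
	unsigned int	free;	/* free tx descriptors */
};

/* descriptors a packet uses: one per dma segment, plus one for the zero pad */
static unsigned int
sk_tx_descs(unsigned int nsegs, unsigned int pktlen)
{
	return (nsegs + (pktlen < SK_ETHER_MIN_NOCRC ? 1u : 0u));
}

/* setup side: returns 0 on success, -1 if the ring has no room */
static int
sk_tx_setup(struct sk_txring *r, unsigned int nsegs, unsigned int pktlen)
{
	unsigned int need = sk_tx_descs(nsegs, pktlen);

	if (need > r->free)
		return (-1);
	r->free -= need;
	return (0);
}

/* completion side: gives back exactly what setup took */
static void
sk_tx_complete(struct sk_txring *r, unsigned int nsegs, unsigned int pktlen)
{
	r->free += sk_tx_descs(nsegs, pktlen);
}
```

if only one side counted the pad, the free count would drift by one for every short frame, which matches the failure described above.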
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.103 |
|
01-Aug-2017 |
dlg |
defer init of the myxmcl pool to mountroot, and enable pool cpu caches.
pool_cache_init cannot be called during autoconf because we cant be confident about the number of cpus in the machine until the first run of attaches.
mountroot is after autoconf, and myx already has code that runs there for the firmware loading.
discussed with deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.102 |
|
07-Feb-2017 |
dlg |
move the mbuf pools to m_pool_init and a single global memory limit
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits.
ok bluhm@ as part of a larger diff
|
#
1.101 |
|
24-Jan-2017 |
dlg |
add support for multiple transmit ifqueues per network interface.
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
getting this in now so everyone can kick the tyres.
ok mpi@ visa@ (who provided some tweaks for cnmac).
|
#
1.100 |
|
22-Jan-2017 |
dlg |
move counting if_opackets next to counting if_obytes in if_enqueue.
this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
#
1.99 |
|
31-Oct-2016 |
dlg |
turns out these chips can handle buffers up to 9400 bytes in length.
raise the mtu to 9380 bytes so we can take advantage of the extra space.
i need to revisit the macro names at some point.
|
#
1.98 |
|
31-Oct-2016 |
dlg |
revert 1.97 where i moved myx to using the system pools
my early revision board doesnt like it at all
|
#
1.97 |
|
28-Oct-2016 |
dlg |
get rid of the custom pool in myx for jumbo frames.
now it asks the mbuf layer for the 9k from its pools.
a question from chris@ made me go look at the chip doco again and i realised that the chip only requires 4 byte alignment for rx buffers, no 4k alignment for jumbo buffers.
i also found that the chip is supposed to be able to rx up to 9400 bytes instead of 9000. ill fix that later though.
|
#
1.96 |
|
15-Sep-2016 |
dlg |
all pools have their ipl set via pool_setipl, so fold it into pool_init.
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.95 |
|
23-May-2016 |
tedu |
remove the function pointer from mbufs. this memory is shared with data via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
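The index-instead-of-pointer idea described above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the names `ext_free_register`, `ext_free_call`, `MAX_FREE_FNS`, and `count_free` are hypothetical and are not the kernel's actual interface. The point is that the data structure stores a small integer, and an out-of-range value is rejected rather than being jumped to.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FREE_FNS	4	/* hypothetical table size */

typedef void (*ext_free_fn)(void *buf, size_t len, void *arg);

/* Table of acceptable functions; buffers store an index into it. */
static ext_free_fn	free_fns[MAX_FREE_FNS];
static unsigned int	nfree_fns;

/* Register a free function and return the index to store in the buffer. */
static unsigned int
ext_free_register(ext_free_fn fn)
{
	assert(fn != NULL);
	assert(nfree_fns < MAX_FREE_FNS);

	free_fns[nfree_fns] = fn;
	return (nfree_fns++);
}

/* Call a registered function by index; unknown indexes are refused. */
static void
ext_free_call(unsigned int idx, void *buf, size_t len, void *arg)
{
	assert(idx < nfree_fns);
	free_fns[idx](buf, len, arg);
}

/* Example free function: counts how many times it was called. */
static void
count_free(void *buf, size_t len, void *arg)
{
	(void)buf;
	(void)len;
	(*(unsigned int *)arg)++;
}
```

A driver in this sketch would call `ext_free_register(count_free)` once at attach time and keep the returned index for every buffer it hands out.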
|
#
1.94 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
#
1.93 |
|
13-Apr-2016 |
mpi |
G/C IFQ_SET_READY().
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.92 |
|
11-Dec-2015 |
mpi |
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API similar to config_defer(9).
ok mikeb@, deraadt@
|
#
1.91 |
|
09-Dec-2015 |
dlg |
rework the if_start mpsafe serialisation so it can serialise arbitrary work
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again.
by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
tested on various nics ok mpi@
|
#
1.90 |
|
03-Dec-2015 |
dlg |
tell the stack myx_start is mpsafe.
as per the stack commit, the driver changes are:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
|
#
1.89 |
|
01-Dec-2015 |
dlg |
myx doesnt use atomic.h anymore.
|
#
1.88 |
|
25-Nov-2015 |
dlg |
replace IFF_OACTIVE manipulation with mpsafe operations.
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
ok kettenis@ mpi@ jmatthew@ deraadt@
|
#
1.87 |
|
24-Nov-2015 |
dlg |
fix tx ring accounting in myx_start.
turns out i was calculating the number of packets (not descriptors) on the tx ring, and then using that as the free space for descriptors.
|
#
1.86 |
|
19-Nov-2015 |
dlg |
get rid of sc_tx_free and the atomic ops on it in myx_start and myx_txeof.
myx_start calculates the free space by reading the consumer index and doing some maths, which lets us avoid the interlocked cpu ops.
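The "some maths" mentioned above might look roughly like the following. This is a minimal sketch, not the driver's code: `tx_ring_free` and its parameters are hypothetical, and it assumes free-running 32-bit producer and consumer counters where only the start routine advances the producer and only the completion path advances the consumer. Under that assumption the subtraction is correct across counter wraparound and no interlocked operation is needed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of free descriptors on a tx ring of
 * ring_size slots. prod and cons are free-running counters, so the
 * unsigned subtraction yields the in-flight count modulo 2^32.
 */
static uint32_t
tx_ring_free(uint32_t prod, uint32_t cons, uint32_t ring_size)
{
	uint32_t used = prod - cons;

	/* The consumer can never pass the producer. */
	assert(used <= ring_size);

	return (ring_size - used);
}
```

Note that the counters here count descriptors, not packets; a packet that spans several descriptors advances the producer by more than one.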
|
#
1.85 |
|
25-Oct-2015 |
mpi |
arp_ifinit() is no longer needed.
|
#
1.84 |
|
29-Sep-2015 |
dlg |
get rid of the mutex between access to the status block and myx_down
myx is unusual in that it has an explicit command to shut down the chip that gets an interrupt when it's done. so myx_down sends the command and has to sleep until it gets that interrupt. this moves to using a single int to represent that state (so loads and stores are atomic), and sleep_setup/sleep_finish in myx_down to wait for it to change.
this has been running in production at work for a few months now tested by chris@
|
#
1.83 |
|
01-Sep-2015 |
deraadt |
free() firmware with right len; ok dlg
|
#
1.82 |
|
15-Aug-2015 |
dlg |
do the global tx free accounting in myx_start with a single atomic op instead of one per packet.
seems to let me send packets a little faster.
|
#
1.81 |
|
15-Aug-2015 |
dlg |
rework the tx path to use a ring to keep track of dmamaps/mbufs.
this removes the myx_buf structure and uses myx_slot instead. theyre the same except slots dont have list entry structures, so theyre smaller.
this cuts out four mutex ops per packet out of the tx handling. just have to get rid of the atomic op per packet in myx_start now.
|
#
1.80 |
|
14-Aug-2015 |
dlg |
move to a per rx ring timeout for refilling empty rings.
this lets me get rid of the locking around the refilling of the rx ring.
the timeout only runs refill if the rx ring is empty. we know rxeof wont try and refill it in that situation because there's no packets on the ring so we wont get interrupts for it. therefore we dont need to lock between the timeout and rxeof cos they cant run at the same time.
|
#
1.79 |
|
14-Aug-2015 |
dlg |
rework how we track the packets on the rx rings.
originally there were two mutex protected lists for rx packets, a list of free packets, and a list of packets that were on the ring. filling the ring popped packets off the free list, attached an mbuf and dmamapped it, and pushed it onto the list of active packets. the hw fills packets in order, so on rx completion we'd pop packets off the active list, unmap the mbuf and shove it up the stack before putting the packet on the free list.
the problem with the lists is that every rx ring operation resulted in two mutex ops. so 4 mutex ops per packet after you do both fill and rxeof.
this replaces the mutexed lists with rings that shadow the hardware rings. filling the rx ring pushes a producer index along, while rxeof chases it with a consumer. because we know only one thing can do either of those tasks at a time, we can get away with not using atomic ops for them.
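A shadow ring of the kind described above can be sketched as follows. The structure and function names (`shadow_ring`, `ring_init`, `ring_fill`, `ring_complete`) and the slot count are hypothetical; the sketch assumes, as the message does, that only one context fills and only one context completes at any time, which is why plain integers are enough.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS	8	/* hypothetical; must be a power of two */

struct slot {
	void	*buf;		/* stands in for the mbuf and dmamap */
};

struct shadow_ring {
	struct slot	slots[RING_SLOTS];
	unsigned int	prod;	/* only advanced by the fill path */
	unsigned int	cons;	/* only advanced by the rxeof path */
};

static void
ring_init(struct shadow_ring *r)
{
	unsigned int i;

	for (i = 0; i < RING_SLOTS; i++)
		r->slots[i].buf = NULL;
	r->prod = r->cons = 0;
}

/* Fill path: push one buffer, returns 0 if the ring is full. */
static int
ring_fill(struct shadow_ring *r, void *buf)
{
	assert(buf != NULL);

	if (r->prod - r->cons == RING_SLOTS)
		return (0);

	r->slots[r->prod % RING_SLOTS].buf = buf;
	r->prod++;
	return (1);
}

/* Completion path: pop the oldest buffer, or NULL if the ring is empty. */
static void *
ring_complete(struct shadow_ring *r)
{
	struct slot *s;
	void *buf;

	if (r->cons == r->prod)
		return (NULL);

	s = &r->slots[r->cons % RING_SLOTS];
	buf = s->buf;
	s->buf = NULL;
	r->cons++;
	return (buf);
}
```

Because the hardware completes buffers in order, the consumer only ever needs to look at the slot its index points to, so no list manipulation or mutex is involved.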
there's more to be done, but this is a good first step.
|
Revision tags: OPENBSD_5_8_BASE
|
#
1.78 |
|
24-Jun-2015 |
mpi |
Increment if_ipackets in if_input().
Note that pseudo-drivers not using if_input() are not affected by this conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
#
1.77 |
|
17-May-2015 |
chris |
We don't need KERNEL_LOCK() around if_input() anymore, as if_input() has appropriate locking around bpf now.
ok dlg@
|
#
1.76 |
|
14-Mar-2015 |
jsg |
Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
Revision tags: OPENBSD_5_7_BASE
|
#
1.75 |
|
20-Feb-2015 |
chris |
Now that if_input() is a thing, use it
ok dlg@
|
#
1.74 |
|
18-Feb-2015 |
dlg |
myri employees and their drivers for linux and solaris have repeatedly told me that if you're going to rx into buffers greater than 4k in size, they have to be aligned to a 4k boundary.
the mru of this chip is 9k, but ive been using the 12k mcl pool to provide the alignment. however, if we move to putting 8 items on a pool page there'll be enough slack space in the mcl12k pool pages to allow item colouring, which in turn will break the chip requirement above. in practice the chips i have seem to work fine with unaligned buffers, but i dont want to risk breaking early revision chips.
this moves myx to using a private pool for allocating clusters for the big rx ring. the item size is 9k, but we specify a 4k alignment so every item we get out of it will be correct for the chip.
|
#
1.73 |
|
18-Feb-2015 |
dlg |
enable pcie relaxed transaction ordering and bump the max payload size up to 4k.
found while reading someone elses driver.
|
#
1.72 |
|
22-Dec-2014 |
tedu |
unifdef INET
|
#
1.71 |
|
28-Oct-2014 |
dlg |
the if_rxring accounting would get screwed up if the first mbuf to be put on the ring couldnt be allocated.
this pulls the code that puts the mbufs on the ring out of myx_rx_fill so it can return early if firstmb cant be allocated, which puts it in the right place to return unused slots to the if_rxring.
this means myx rx wont lock up if you're DoSsed to the point where you exhaust your mbuf pools and cant allocate mbufs for the ring.
ok jmatthew@
|
#
1.70 |
|
04-Oct-2014 |
dlg |
replace mutexes to serialise the operations on the flag that restricts the number of contexts that are refilling the rx rings with atomic ops.
this is borrowed from code i wrote for the scsi midlayer but cant put in yet because i havent got atomic.h up to scratch on all archs yet. the archs myx runs on do have enough atomic.h to be fine though.
|
#
1.69 |
|
03-Oct-2014 |
dlg |
refill the rx ring in myx_rxeof, not much later at the end of myx_intr.
|
#
1.68 |
|
03-Oct-2014 |
dlg |
in rxeof, instead of taking the biglock on every packet to call bpf and ether_input, queue all the mbufs onto an mbuf_list on the stack and then take the biglock once outside the loop.
|
#
1.67 |
|
03-Oct-2014 |
dlg |
we dont need the kernel lock to call bus_dmamap_load and unload thanks to kettenis.
move the if_ipacket and if_opacket increments out of biglock too. theyre only updated from the interrupt handler, which is only run on a single cpu so there's no chance of the update racing. everywhere else only reads them.
|
#
1.66 |
|
03-Oct-2014 |
dlg |
dont need to hold the kernel lock to call MCLGETI and m_freem now.
|
#
1.65 |
|
03-Oct-2014 |
dlg |
dont take the kernel lock on every interrupt in case we might change the link state or to clear OACTIVE, just take it when we know we really are going to do those things.
|
#
1.64 |
|
14-Sep-2014 |
jsg |
remove unneeded proc.h includes ok mpi@ kspillner@
|
#
1.63 |
|
19-Aug-2014 |
dlg |
in myx_start, replace
while (space) { IFQ_POLL; myx_dequeue(free descr); IFQ_DEQUEUE; etc; }
with
while (space && myx_dequeue(free descr)) { IFQ_DEQUEUE; etc; }
|
#
1.62 |
|
18-Aug-2014 |
dlg |
dont rely on mbuf.h to provide pool.h.
ok miod@, who has offered to help with any MD fallout ok guenther@
|
Revision tags: OPENBSD_5_6_BASE
|
#
1.61 |
|
12-Jul-2014 |
tedu |
add a size argument to free. will be used soon, but for now default to 0. after discussions with beck deraadt kettenis.
|
#
1.60 |
|
10-Jul-2014 |
dlg |
rings that dont rx packets dont need to be refilled.
|
#
1.59 |
|
08-Jul-2014 |
dlg |
cut things that relied on mclgeti for rx ring accounting/restriction over to using if_rxr.
cut the reporting systat did over to the rxr ioctl.
tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc. ok mpi@
|
#
1.58 |
|
17-Jun-2014 |
dlg |
whitespace fix.
im sick of fixing this by hand on all my boxes while hacking on other stuff and having it pollute my diffs.
no functional change.
|
#
1.57 |
|
24-Mar-2014 |
dlg |
nothing after the irq ack posting relies on it being ordered.
|
Revision tags: OPENBSD_5_5_BASE
|
#
1.56 |
|
10-Feb-2014 |
dlg |
the mac addresses you program with MYXCMD_SET_MCASTGROUP are in a different format to the one used for MYXCMD_SET_LLADDR. for reasons.
this lets ospf work if you dont happen to have PROMISC enabled on your interface like my production firewalls happen to have, which is why i never noticed this before.
|
#
1.55 |
|
05-Feb-2014 |
dlg |
after running myx(4) without biglock in production for a few days i discovered that there's a race between the interrupt code and myx_start which causes the count of free tx descriptors to get distorted, which eventually leads to a permanent setting of IFF_OACTIVE, which in turn prevents the driver from transmitting packets.
fixing that went horribly wrong when i then discovered that there's a race between the interrupt handler and myx_down, where the interrupt can tell myx_down to wake up and free all the rings while the interrupt handler is still looking at them. free panics for all.
this moves the handling of the tx free count under the biglock (for now), and moves myx_up and myx_down to managing a "driver state" variable independently of the IFF_UP and IFF_RUNNING flags, and very very careful reordering of the checks of that state variable and the hardware state.
as a bonus we get to avoid excessive calls to myx_txeof and myx_rxeof in the isr, and less stuff checked unconditionally. on the other hand, the sc_state handling added some more checks so it might not be a win overall.
tested on smp sparc64 with msi and nonmsi interrupts, and on amd64 smp in production again.
|
#
1.54 |
|
31-Jan-2014 |
dlg |
sc_function is set, but never used for anything useful. clean it up...
|
#
1.53 |
|
31-Jan-2014 |
dlg |
sc_lladdr is never used, so we can get the space in the sc back.
|
#
1.52 |
|
23-Jan-2014 |
dlg |
a lot of people have pointed out to me that taking a lock just to read an int isnt necessary.
|
#
1.51 |
|
23-Jan-2014 |
dlg |
factor the mutex/bus_space handling of the sts block out.
|
#
1.50 |
|
21-Jan-2014 |
dlg |
introduce fine grained locking.
this doesnt give up the big lock coming from process context, only from the interrupt side. it is excessively careful about when it takes the big lock again. notably it goes to a lot of effort to not hold a mutex while calling into other subsystems or before taking the big lock.
ive been hitting it as hard as i can without problems.
intensely read by mpi@ ok claudio@ kettenis@
|
#
1.49 |
|
19-Jan-2014 |
dlg |
white space fix
|
#
1.48 |
|
19-Jan-2014 |
dlg |
introduce fine grained locking around the lists of packet handlers myx maintains. this moves it away from relying on splnet to protect them.
|
#
1.47 |
|
19-Jan-2014 |
dlg |
hwflags is never used, so clean it up
|
#
1.46 |
|
19-Jan-2014 |
dlg |
replace bcmp with memcmp
|
#
1.45 |
|
19-Jan-2014 |
dlg |
bcopy to memcpy
|
#
1.44 |
|
19-Jan-2014 |
dlg |
replace bzero with memset.
|
#
1.43 |
|
19-Jan-2014 |
dlg |
all 64bit archs myx runs on support bus_space 8 things because of work i did at n2k13.
|
Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
|
#
1.42 |
|
29-Jan-2013 |
brad |
- Set ENETRESET within myx_ioctl() instead of calling myx_iff() directly, to be consistent with other drivers.
- Clear IFF_ALLMULTI flag early and at the top of myx_iff().
- Set IFF_ALLMULTI when in promisc mode.
ok dlg@
|
#
1.41 |
|
25-Jan-2013 |
dlg |
we go to a lot of effort to post the first tx descriptor last, but we really should be trying to post everything except the flags field in the first tx descriptor. this shuffles things around so the rest of that first txd is posted as part of the "everything else" before its flags field.
|
#
1.40 |
|
25-Jan-2013 |
dlg |
the myx_dmamem struct doesnt need a name.
|
#
1.39 |
|
21-Jan-2013 |
dlg |
myx does reads and writes in one direction to packet buffers. lets try STREAMING them.
|
#
1.38 |
|
15-Jan-2013 |
dlg |
amd64 is currently broken cos it has no bus_space_write_raw_region_8. disabling it for now.
|
#
1.37 |
|
15-Jan-2013 |
dlg |
use bus_space_write_raw_region_8 on 64bit archs when writing to the rings
|
#
1.36 |
|
14-Jan-2013 |
dlg |
map the registers PREFETCHABLE so things that can do write combining can try and do write combining like the myx doco likes.
|
#
1.35 |
|
14-Jan-2013 |
dlg |
avoid extra bus_space barriers in the interrupt handler.
|
#
1.34 |
|
14-Jan-2013 |
dlg |
when posting descriptors to the chips rings, avoid going write barrier write barrier write barrier when using myx_write to post descriptors.
instead let it go write write write barrier by using the appropriate bus_space write directly followed by a single bus_space barrier.
the story above is mostly true, except that myx wants us to write all the descriptors except the first, barrier, and then write the first one out to signal that the chip can proceed.
it is also worth noting that the barriers cover more address space than what we actually wrote to. this makes the code much simpler, and avoids generating extra fence operations (which is what barrier functions end up as on most of our archs) when we wrap around the end of the ring. the bus_space doco encourages this.
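The posting order described above (everything except the first descriptor, a barrier, then the first descriptor) can be modelled in userland like this. It is only a sketch: the `ring` array stands in for the device's ring, `ring_barrier` is a hypothetical stand-in that merely counts calls where a driver would issue a bus_space barrier, and the barrier after the final write is an assumption of this sketch rather than something the message states.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS	8		/* hypothetical ring size */

static uint64_t		ring[RING_SLOTS];	/* stands in for the chip's ring */
static unsigned int	nbarriers;		/* barriers issued so far */

/* Stand-in for a write barrier covering the whole ring. */
static void
ring_barrier(void)
{
	nbarriers++;
}

/*
 * Post n descriptors starting at slot first. The chip treats the
 * first descriptor as the "go" signal, so it is written last.
 */
static void
post_chain(unsigned int first, const uint64_t *descs, unsigned int n)
{
	unsigned int i;

	assert(n > 0 && n <= RING_SLOTS);

	/* write write write: everything except the first descriptor */
	for (i = 1; i < n; i++)
		ring[(first + i) % RING_SLOTS] = descs[i];
	ring_barrier();

	/* now the first descriptor, telling the chip to proceed */
	ring[first % RING_SLOTS] = descs[0];
	ring_barrier();
}
```

Covering the whole ring with each barrier, rather than just the slots written, is what keeps the wraparound case (slot 7 followed by slot 0 here) from needing extra barriers.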
bus_space use was discussed with krw@ kettenis@ deraadt@
|
#
1.33 |
|
14-Jan-2013 |
dlg |
the myri doco suggests its nice to post stuff by filling in everything in the rings except the first descriptor. once you've written as much as you can out, then you go back and post the first descriptor to signal that the chip should go ahead and work.
|
#
1.32 |
|
14-Jan-2013 |
dlg |
;; is a long way of saying ;
|
#
1.31 |
|
29-Nov-2012 |
brad |
Remove setting an initial assumed baudrate upon driver attach which is not necessarily correct, there might not even be a link when attaching.
ok mikeb@ reyk@
|
Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
|
#
1.30 |
|
28-Nov-2011 |
blambert |
Fix reversed error-handling gotos in myx_buf_fill(), which would lead to either an mbuf leak or a NULL pointer dereference.
ok sthen@ claudio@ dlg@ testing claudio@ dlg@
|
Revision tags: OPENBSD_5_0_BASE
|
#
1.29 |
|
08-Aug-2011 |
dlg |
myx requires the driver pad short ethernet frames to 60 bytes by adding a descriptor pointing at zeroed bytes onto the end of transmit chains. i was accounting for this extra descriptor when i was completing the chain, but not when i was setting this up. this meant the number of free descriptors kept growing until it overflowed. at this point the check for space in the ring failed and packets no longer flowed.
this counts the pad descriptor in the tx chain setup too.
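The accounting rule behind this fix can be written down as a small helper. This is an illustrative sketch only: `tx_desc_count` and `MIN_FRAME_NOCRC` are hypothetical names, and the 60 byte minimum is taken from the message above. The same count has to be used both when descriptors are taken from the ring and when they are given back on completion, otherwise the free count drifts.

```c
#include <assert.h>

#define MIN_FRAME_NOCRC	60	/* shortest frame the chip accepts unpadded */

/*
 * Hypothetical helper: descriptors needed for a packet that maps to
 * nsegs dma segments. Short frames need one extra descriptor that
 * points at zeroed pad bytes.
 */
static unsigned int
tx_desc_count(unsigned int nsegs, unsigned int pktlen)
{
	assert(nsegs > 0);

	return (nsegs + (pktlen < MIN_FRAME_NOCRC ? 1 : 0));
}
```

For example, a 42 byte ARP request in a single segment uses two descriptors, while a full-sized frame in three segments uses three.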
ok deraadt@
|
#
1.28 |
|
23-Jun-2011 |
dlg |
cope with empty rx rings by scheduling a timeout to keep trying until it gets some packets onto the rings.
also annoying, but the hardware doesnt report empty rings, we have to handle it ourselves.
|
#
1.27 |
|
23-Jun-2011 |
dlg |
this chip has an annoying "feature" where it cannot report the link state unless the chip is up and handling packets. while its down it does not report the link state, so it is unknown.
this tweaks the link state handling, in particular it adds code to myx_down so it moves the link state to unknown, ie, it correctly reflects reality.
stupidity pointed out by deraadt
|
#
1.26 |
|
22-Jun-2011 |
deraadt |
reset the tx_count on UP, since it may have been advanced from non-zero by a previous use ok claudio
|
#
1.25 |
|
22-Jun-2011 |
dlg |
msi support. this is a complicated one...
ok kettenis@
|
#
1.24 |
|
22-Jun-2011 |
jsg |
another myri10ge device matched by freebsd/linux drivers ok dlg@
|
#
1.23 |
|
22-Jun-2011 |
dlg |
oops, handle refill like i said i was going to two revisions ago.
|
#
1.22 |
|
22-Jun-2011 |
deraadt |
set the mac address on the chip correctly (repair the byte order) it now works on sparc64, too ok dlg
|
#
1.21 |
|
22-Jun-2011 |
dlg |
deraadt plugged his myx into a sparc64 and discovered 3 problems:
1. we want to write raw values to registers all the time, so promote the myx_raw{read,write} to myx_{read,write} and use them everywhere. get rid of the raw funcs.
2. i was setting the watermarks on the rx ring before knowing how big they were.
3. rxfill in the interrupt handler could lose data if you loop on sts_isvalid.
almost working now...
"please commit your diff" deraadt@
|
#
1.20 |
|
21-Jun-2011 |
dlg |
do the unaligned dma tests so we can figure out if we need to fall back to the unaligned firmware. apparently this is only an issue on the "A" controllers which have been superseded, but those are the chips we (openbsd devs) have.
|
#
1.19 |
|
21-Jun-2011 |
dlg |
report the controllers part number. eg, i now know i have a 10G-PCIE-8A-R. dmesg looks like this:
myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x00: apic 1 int 8, model 10G-PCIE-8A-R, address 00:60:dd:47:c6:74
|
#
1.18 |
|
21-Jun-2011 |
dlg |
wire up jumbos properly. the hardware supports up to 9018 bytes off the wire (9000 + ether header + vlan tag), but has some cool alignment requirements. if you want to use a single rx ring desc to point at a jumbo it needs to start on a 4k boundary and be physically contiguous. to ensure this im pulling frames from the 12k pool and waiting for arianes diff to ensure mbufs are contig.
direction from andrew gallatin. tested locally.
|
#
1.17 |
|
21-Jun-2011 |
deraadt |
minor cleanups; ok dlg
|
#
1.16 |
|
20-Jun-2011 |
dlg |
make the interrupt handler look more like what the doco suggests. seems to fix a bad lockup i kept getting.
|
#
1.15 |
|
20-Jun-2011 |
dlg |
dont need debug, the myx_cmd stuff works fine.
|
#
1.14 |
|
20-Jun-2011 |
dlg |
i got myx working!
|
#
1.13 |
|
02-May-2011 |
chl |
Do not check malloc return value against NULL, as M_WAITOK is used.
ok dlg@ krw@
|
Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
|
#
1.12 |
|
19-May-2010 |
oga |
BUS_DMA_ZERO instead of alloc, map, bzero.
ok krw@
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.11 |
|
13-Aug-2009 |
jasper |
- consistify cfdriver for the ethernet drivers (0 -> NULL)
ok dlg@
|
Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
|
#
1.10 |
|
28-Nov-2008 |
brad |
Eliminate the redundant bits of code for MTU and multicast handling from the individual drivers now that ether_ioctl() handles this.
Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes
Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users. Build tested on almost all archs by todd@/brad@
ok naddy@
|
#
1.9 |
|
02-Oct-2008 |
brad |
First step towards cleaning up the Ethernet driver ioctl handling. Move calling ether_ioctl() from the top of the ioctl function, which at the moment does absolutely nothing, to the default switch case. Thus allowing drivers to define their own ioctl handlers and then falling back on ether_ioctl(). The only functional change this results in at the moment is having all Ethernet drivers returning the proper errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown ioctl's.
Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes
Tested by martin@/jsing@/todd@/brad@ Build tested on almost all archs by todd@/brad@
ok jsing@
|
#
1.8 |
|
10-Sep-2008 |
blambert |
Convert timeout_add() calls using multiples of hz to timeout_add_sec()
Really just the low-hanging fruit of (hopefully) forthcoming timeout conversions.
ok art@, krw@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.7 |
|
23-May-2008 |
brad |
Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as suggested by dlg@ awhile ago.
ok dlg@
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.6 |
|
16-Jan-2008 |
thib |
Set the baudrate with IF_Gbps(10); and remove an XXX comment now that if_baudrate is 64bits.
ok reyk@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.5 |
|
01-Jun-2007 |
reyk |
initialize the rings
|
#
1.4 |
|
31-May-2007 |
reyk |
further improvement of the bus space i/o. firmware loading, booting, and card initialization works now.
thanks to dlg@ who pointed me to the fact that bus_space_write_region_N and bus_space_write_raw_region_N use count of elements vs. size of buffer arguments.
|
#
1.3 |
|
31-May-2007 |
reyk |
enable all debugging messages by default if the driver is compiled with MYX_DEBUG
|
#
1.2 |
|
31-May-2007 |
reyk |
fix the myx_write function
|
#
1.1 |
|
31-May-2007 |
reyk |
initial bits of a new driver for the Myricom Myri-10G Lanai-Z8E 10Gb Ethernet chipset. not working yet.
ok dlg@
|