#
4e2688cc |
|
30-Oct-2023 |
Osama Abboud <osamaabb@amazon.com> |
ena: Update driver version to v2.7.0 Features: * Introduce customer and SRD metrics through sysctl * Introduce spreading IRQs to CPUs capability using sysctl * Upgrade ena-com to v2.7.0 Bug Fixes: * Remove outdated APIs Minor Changes: * Introduce a shared stats sample interval for all stats Approved by: cperciva (mentor) MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
246aa273 |
|
23-Oct-2023 |
Osama Abboud <osamaabb@amazon.com> |
ena: Update the license dating to 2023 Some of the files are using outdated licenses. Update the license to be 2023. Approved by: cperciva (mentor) MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
36d42c86 |
|
12-Sep-2023 |
Osama Abboud <osamaabb@amazon.com> |
ena: Support srd metrics with sysctl This commit introduces SRD metrics through sysctl. The metrics can be queried using the following sysctl node: sysctl dev.ena.<device index>.ena_srd_info Approved by: cperciva (mentor) MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
f97993ad |
|
12-Sep-2023 |
Osama Abboud <osamaabb@amazon.com> |
ena: Support customer metric with sysctl This commit adds sysctl support for customer metrics. Different customer metrics can be found in the following sysctl node: sysctl dev.ena.<device index>.customer_metrics Approved by: cperciva (mentor) MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
5b925280 |
|
12-Sep-2023 |
Osama Abboud <osamaabb@amazon.com> |
ena: Introduce shared sample interval for all stats Rename sample_interval node to stats_sample_interval and move it up in the sysctl tree to make it clear that it's relevant for all the stats and not only ENI metrics (Currently, sample interval node is found under eni_metrics node). Path to node: dev.ena.<device_index>.stats_sample_interval Once this parameter is set it will set the sample interval for all the stats node including SRD/customer metrics. Approved by: cperciva (mentor) MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
f9e1d947 |
|
30-Oct-2023 |
Osama Abboud <osamaabb@amazon.com> |
ena: Add sysctl support for spreading IRQs
This commit allows spreading IO IRQs over different CPUs through sysctl. Two sysctl nodes are introduced:
1- base_cpu: serves as the first CPU to which the first IO IRQ will be bound.
2- cpu_stride: sets the distance between every two CPUs to which every two consecutive IO IRQs are bound.
For example, to get the following IO IRQ / CPU binding:
IRQ idx | CPU
----------------
1       | 0
2       | 2
3       | 4
4       | 6
Run the following commands:
sysctl dev.ena.<device index>.irq_affinity.base_cpu=0
sysctl dev.ena.<device_index>.irq_affinity.cpu_stride=2
Also introduced rss_enabled field, which is intended to replace '#ifdef RSS' in multiple places, in order to prevent code duplication. We want to bind interrupts to CPUs in case of rss set OR in case the newly defined sysctl parameter is set. This requires removing a couple of '#ifdef RSS' in the structs as well, since we'll be using the relevant parameters in the CPU binding code.
Approved by: cperciva (mentor) MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
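The base_cpu / cpu_stride arithmetic from f9e1d947 can be sketched in a few lines of C. This is only an illustration: the helper name `irq_to_cpu`, the zero-based IRQ index (row 1 of the table above is index 0) and the wrap-around modulo the CPU count are assumptions of this sketch, not taken from the driver.

```c
#include <assert.h>

/*
 * Hypothetical helper: CPU for the IO IRQ with zero-based index irq_idx.
 * Wrapping modulo ncpus is an assumption of this sketch.
 */
int
irq_to_cpu(int irq_idx, int base_cpu, int cpu_stride, int ncpus)
{
	assert(ncpus > 0 && irq_idx >= 0);
	return ((base_cpu + irq_idx * cpu_stride) % ncpus);
}
```

With base_cpu=0 and cpu_stride=2 on an 8-CPU system, indices 0..3 map to CPUs 0, 2, 4 and 6, which matches the binding in the commit message.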
#
95ee2897 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
ac40021c |
|
28-May-2023 |
Arthur Kiyanovski <akiyano@amazon.com> |
ena: Update driver version to v2.6.3 Bug Fixes: * Initialize statistics before the interface is available * Fix driver unload crash Minor Changes: * Mechanically convert ena(4) to DrvAPI * Remove usage of IFF_KNOWSEPOCH MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
e5de1d8d |
|
13-Dec-2022 |
Arthur Kiyanovski <akiyano@amazon.com> |
ena: Update driver version to v2.6.2 Bug Fixes: * Remove timer service re-arm on ena_restore_device failure. * Re-Enable per-packet missing tx completion print Minor Changes: * Switch driver owners from Semihalf to Amazon in man file. MFC after: 2 weeks Sponsored by: Amazon, Inc. Pull Request: https://github.com/freebsd/freebsd-src/pull/637
|
#
25b64933 |
|
04-Jul-2022 |
Michal Krawczyk <mk@semihalf.com> |
ena: Update driver version to v2.6.1 Minor version update which improves styling of printouts, fixes the KASAN and KMSAN kernel builds and LLQ reconfiguration after the device reset. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
b72f1f45 |
|
30-Jun-2022 |
Mark Johnston <markj@FreeBSD.org> |
ena: Make first_interrupt a uint8_t We do not have atomic(9) routines for bools, and it is not guaranteed that sizeof(bool) is 1. This fixes the KASAN and KMSAN kernel builds, which fail because the compiler refuses to silently cast a _Bool * to a uint8_t * when calling the atomic(9) sanitizer interceptors. Reviewed by: Dawid Górecki <dgr@semihalf.com> MFC after: 2 weeks Fixes: 0ac122c388d9 ("ena: Use atomic_load/store functions for first_interrupt variable") Differential Revision: https://reviews.freebsd.org/D35683
|
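The point of b72f1f45 (use a fixed one-byte integer for an atomically accessed flag instead of a bool) can be illustrated in user space. The sketch below uses C11 `<stdatomic.h>` as a stand-in for the kernel's atomic(9) routines; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* uint8_t has a guaranteed width; sizeof(bool) is implementation-defined. */
static_assert(sizeof(uint8_t) == 1, "uint8_t must be one byte");

/* Hypothetical ring state; only the flag matters for this sketch. */
struct ring_state {
	_Atomic uint8_t first_interrupt;	/* 0 = not seen, 1 = seen */
};

void
ring_mark_interrupt(struct ring_state *r)
{
	atomic_store(&r->first_interrupt, 1);
}

int
ring_seen_interrupt(struct ring_state *r)
{
	return (atomic_load(&r->first_interrupt) != 0);
}
```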
#
79e15002 |
|
10-Jun-2022 |
Michal Krawczyk <mk@semihalf.com> |
ena: Update driver version to v2.6.0 Some of the changes in this release: * Style fixes * Fix ENI stats probing * Add trace for the last Tx cleanup call * Prevent LLQ initialization if member isn't exposed * Improve logging Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
8f15f8a7 |
|
10-Jun-2022 |
Dawid Gorecki <dgr@semihalf.com> |
ena: Align names of constants Most of the constants in ena.h file were prefixed with ENA_*, while others did not have this prefix. Align the constants by prefixing the remaining constants with ENA. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
82e558ea |
|
10-Jun-2022 |
Dawid Gorecki <dgr@semihalf.com> |
ena: Fix styling issues Align code style with FreeBSD style(9) guidelines. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
b899a02a |
|
10-Jun-2022 |
Dawid Gorecki <dgr@semihalf.com> |
ena: Move ena_copy_eni_metrics into separate task Copying ENI metrics was done in callout context, this caused the driver to panic when sample_interval was set to a value other than 0, as the admin queue call which was executed could sleep while waiting on a condition variable. Taskqueue, unlike callout, allows for sleeping, so moving the function to a separate taskqueue fixes the problem. ena_timer_service is still responsible for scheduling the taskqueue. Stop draining the callout during ena_up/ena_down. This was done to prevent a race between ena_up/down and ena_copy_eni_metrics admin queue calls. Since ena_metrics_task is protected by ENA_LOCK there is no possibility of a race between ena_up/down and ena_metrics_task. Remove a comment about locking in ena_timer_service. With ENI metrics in a separate task this comment became obsolete. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
d8aba82b |
|
10-Jun-2022 |
Dawid Gorecki <dgr@semihalf.com> |
ena: Store ticks of last Tx cleanup Store timestamp of last cleanup in Tx ring structure. This does not change anything during normal operation of the driver but could be useful when the device fails for some reason. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
3501d4f1 |
|
10-Jun-2022 |
Dawid Gorecki <dgr@semihalf.com> |
ena: Add ena_ring_tx_doorbell() function Add ena_ring_tx_doorbell function to remove code duplication. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
8a5b4859 |
|
03-Jan-2022 |
Michal Krawczyk <mk@semihalf.com> |
ena: update ENA version to v2.5.0 Some of the changes in this release: - IPv6 L4 checksum offload fixes. - Optimization of the Tx req_id validation. - Timer service adjustments. - NUMA awareness for the kernel RSS mode. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
78554d0c |
|
03-Jan-2022 |
Dawid Gorecki <dgr@semihalf.com> |
ena: start timer service on attach The timer service was started when the interface was brought up and it was stopped when it was brought down. Since ena_up requires the device to be responsive, triggering the reset would become impossible if the device became unresponsive with the interface down. Since most of the functions in timer service already perform the check to see if the device is running, this only requires starting the callout in attach and stopping it when bringing the interface up or down to avoid race between different admin queue calls. Since callout functions for timer service are always called with the same arguments, replace callout_{init,reset,drain} calls with ENA_TIMER_{INIT,RESET,DRAIN} macros. Submitted by: Dawid Gorecki <dgr@semihalf.com> Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
42c7760b |
|
12-Aug-2021 |
Michal Krawczyk <mk@semihalf.com> |
ena: Update driver version to v2.4.1 Some of the changes in this release: * Hardware RSS hash key reconfiguration and indirection table reconfiguration support. * Full kernel RSS support. * Extra statistic counters. * Netmap support for ENAv3. * Locking assertions. * Extra log messages. * Reset handling fixes. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
6d1ef2ab |
|
12-Aug-2021 |
Artur Rojek <ar@semihalf.com> |
ena: Implement full RSS reconfiguration Bind RX/TX queues and MSI-X vectors to matching CPUs based on the RSS bucket entries. Introduce sysctls for the following RSS functionality: - rss.indir_table: indirection table mapping - rss.indir_table_size: indirection table size - rss.key: RSS hash key (if Toeplitz used) Said sysctls are only available when compiled without `option RSS`, as kernel-side RSS support currently doesn't offer RSS reconfiguration. Migrate the hash algorithm from CRC32 to Toeplitz and change the initial hash value to 0x0 in order to match the standard Toeplitz implementation. Provide helpers for hash key inversion required for HW operations. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
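For readers unfamiliar with the Toeplitz hash that 6d1ef2ab migrates to, here is a generic textbook sketch in C with the initial hash value 0x0 mentioned in the commit. It is not the driver's code: the function name and the requirement that the key be at least 4 bytes longer than the input are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Generic Toeplitz hash: for every set input bit (MSB first), XOR in the
 * 32-bit window of the key that starts at the same bit offset.
 */
uint32_t
toeplitz_hash(const uint8_t *key, size_t key_len,
    const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;	/* initial value 0x0 */
	uint32_t window;

	assert(key_len >= data_len + 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (size_t i = 0; i < data_len; i++) {
		for (int b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* Slide the window one bit further along the key. */
			window <<= 1;
			if (key[i + 4] & (1u << b))
				window |= 1;
		}
	}
	return (hash);
}
```

Because the hash is a XOR of key windows, it is linear: hashing the XOR of two inputs gives the XOR of their hashes, and an all-zero input hashes to 0.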
#
223c8cb1 |
|
12-Aug-2021 |
Artur Rojek <ar@semihalf.com> |
ena: Add missing statistics Provide the following sysctl statistics in order to stay aligned with the Linux driver: * rx_ring.csum_good * tx_ring.unmask_interrupt_num Also rename the 'bad_csum' statistic name to 'csum_bad' for alignment. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
07aff471 |
|
12-Aug-2021 |
Artur Rojek <ar@semihalf.com> |
ena: Share ena_global_lock between driver instances In order to use `ena_global_lock` in sysctl context, it must be kept outside the driver instance's software context, as sysctls can be called before attach and after detach, leading to lock use before sx_init and after sx_destroy otherwise. Solve this issue by turning `ena_global_lock` into a file scope variable, shared between all instances of the driver and associated sysctl context, and in turn initialized/destroyed in dedicated SYSINIT/SYSUNINIT functions. As a side effect, this change also fixes existing race in the reset routine, when simultaneously accessing sysctl exposed properties. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
986e7b92 |
|
12-Aug-2021 |
Artur Rojek <ar@semihalf.com> |
ena: Move RSS logic into its own source files Delegate RSS related functionality into separate .c/.h files in preparation for the full RSS support. While at it, reorder functions and remove prototypes for ones with internal linkage. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
cb98c439 |
|
12-Aug-2021 |
Artur Rojek <ar@semihalf.com> |
ena: Add locking assertions ENA silently assumed that ena_up, ena_down and ena_start_xmit routines should be called within locked context. Driver's logic heavily assumes on concurrent access to those routines, so for safety and better documentation about this assumption, the locking assertions were added to the above functions. The assertion was added only for the main steps (skipping the helper functions) which can be called from multiple places including the kernel and the driver itself. Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
93f0df45 |
|
24-Jun-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
Update ENA version to v2.4.0 Some of the changes in this release: * Large LLQ headers, * Bug/stability fixes, * Change of the README/Documentation. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
0e7d31f6 |
|
14-Jun-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
ena: hide sysctl nodes for unused ENA queues IO queue related attributes are registered statically at driver attach with the rest of the ENA specific sysctl nodes. However, the number of queues can be changed at runtime via the `ena_sysctl_io_queues_nb` request, leading to a potential exposure of attributes for non-existing queues. Introduce a new `ena_sysctl_update_queue_node_nb` function, which updates the sysctl nodes after the number of queues is altered. This happens by either registering or unregistering node specific oids, based on a delta between the previous and current queue count. NOTE: All unregistered oids must be registered again before the driver detach, e.g. by another call to this function. Submitted by: Artur Rojek <ar@semihalf.com> Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
1c808fcd |
|
18-Feb-2021 |
Michal Krawczyk <mk@semihalf.com> |
Allocate BAR for ENA MSIx vector table In the new ENA-based instances like c6gn, the vector table moved to a new PCIe bar - BAR1. Previously it was always located on the BAR0, so the resources were already allocated together with the registers. As the FreeBSD isn't doing any resource allocation behind the scenes, the driver is responsible to allocate them explicitly, before other parts of the OS (like the PCI code allocating MSIx) will be able to access them. To determine dynamically BAR on which the MSIx vector table is present the pci_msix_table_bar() is being used and the new BAR is allocated if needed. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc MFC after: 3 days
|
#
7dee315e |
|
18-Nov-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Update ENA driver version to v2.3.0 The v2.3.0 introduces new ena_com layer, ENI metrics updates and SPDX license tags. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27120
|
#
7d2e6f20 |
|
18-Nov-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Rename descriptions of the supported ENA devices Some of the PCI ID were described as ENA with LLQ support - it's not fully accurate and because of that, their names were changed. Instead of LLQ, use RSERV0 for the description of those devices. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27119
|
#
f180142c |
|
18-Nov-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Add ENI metrics for the ENA driver The new HAL allows the driver to read extra ENI stats. Exact meaning of each of them can be found in base/ena_defs/ena_admin_defs.h file and structure ena_admin_eni_stats. Those stats are being updated inside of the timer service, which is executed every second. ENI metrics are turned off by default. They can be enabled, using the sysctl node: dev.ena.X.eni_metrics.update_delay 0 value in this node means that the update is turned off. Other values determine how many seconds must pass, before ENI metrics will be updated. They can be acquired, using sysctl: sysctl dev.ena.X.eni_metrics Where X stands for the interface number. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27118
|
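The update-delay behaviour described in f180142c (a timer that runs every second, a delay expressed in seconds, and 0 meaning "off") could be modelled roughly as follows. The struct and function names are hypothetical and the counter-based approach is an assumption of this sketch, not the driver's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state kept between one-second timer ticks. */
struct metrics_timer {
	unsigned delay_sec;	/* 0 disables metric updates */
	unsigned elapsed_sec;	/* ticks since the last update */
};

/* Call once per timer tick; returns 1 when the metrics should be refreshed. */
int
metrics_tick(struct metrics_timer *t)
{
	assert(t != NULL);
	if (t->delay_sec == 0)
		return (0);
	if (++t->elapsed_sec < t->delay_sec)
		return (0);
	t->elapsed_sec = 0;
	return (1);
}
```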
#
0835cc78 |
|
18-Nov-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Add SPDX license tag to the ENA driver files Referring to guide: https://wiki.freebsd.org/SPDX the SPDX tag should not replace the standard license text, however it should be added over the standard license text to make the automation easier. Because of that, the old license was kept, but the SPDX tag was added on top of every ENA driver file. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27117
|
#
2287afd8 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Update ENA driver version to v2.2.0 Driver version upgrade is connected with support for the new device features, like Tx drops reporting or disabling meta caching. Moreover, the driver configuration from the sysctl was reworked to provide safer and better flow for configuring: * number of IO queues (new feature), * drbr size on Tx, * Rx queue size. Moreover, a lot of minor bug fixes and improvements were added. Copyright date in the license of the modified files in this release was updated to 2020. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
0b432b70 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Allow disabling meta caching for ENA Tx path Determined by a flag passed from the device. No metadata is set within ena_tx_csum when caching is disabled. Submitted by: Maciej Bielski <mba@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
9762a033 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Create ENA IO queues with optional backoff If requested size of IO queues is not supported try to decrease it until finding the highest value that can be satisfied. Submitted by: Maciej Bielski <mba@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
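The backoff idea in 9762a033 can be sketched as a retry loop. The commit only says the size is decreased until a supported value is found; halving on each attempt, the function names, and the `fake_create` stand-in for the real device call are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Callback that tries to create queues of the given size; 0 means success. */
typedef int (*create_queues_fn)(unsigned size, void *arg);

/*
 * Try the requested size first and halve it on failure (assumed policy).
 * Returns the size that worked, or 0 if nothing >= min_size succeeded.
 */
unsigned
create_queues_with_backoff(unsigned requested, unsigned min_size,
    create_queues_fn try_create, void *arg)
{
	assert(try_create != NULL);
	for (unsigned size = requested; size != 0 && size >= min_size;
	    size /= 2) {
		if (try_create(size, arg) == 0)
			return (size);
	}
	return (0);
}

/* Stand-in for the device: succeeds only up to the limit passed in arg. */
int
fake_create(unsigned size, void *arg)
{
	unsigned limit = *(unsigned *)arg;

	return (size <= limit ? 0 : -1);
}
```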
#
56d41ad5 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Add sysctl node for ENA IO queues number adjustment By default, in ena_attach() the driver attempts to acquire ena_adapter::max_num_io_queues MSI-X vectors for the purpose of IO queues, however this is not guaranteed. The number of vectors acquired depends also on system resources availability. Regardless of that, enable the number of effectively used IO queues to be further limited through the sysctl node. Example: Assuming that there are 8 IO queues configured by default, the command $ sysctl dev.ena.0.io_queues_nb=4 will reduce the number of available IO queues to 4. Similarly, the value can be also increased up to maximum supported value. A value higher than maximum supported number of IO queues is ignored. Zero is ignored too. Submitted by: Maciej Bielski <mba@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
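The acceptance rule stated in 56d41ad5 (zero and values above the supported maximum are ignored) amounts to a small validation step. The function below is a hypothetical sketch of that rule only; it does not model the reconfiguration the driver performs afterwards.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the queue count in effect after a request.
 * Zero and values above max_supported leave the current count unchanged.
 */
unsigned
apply_io_queues_nb(unsigned current, unsigned requested,
    unsigned max_supported)
{
	assert(max_supported > 0);
	if (requested == 0 || requested > max_supported)
		return (current);
	return (requested);
}
```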
#
21823546 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Rework ENA Tx buffer ring size reconfiguration This method has been aligned with the way how the Rx queue size is being updated - so it's now done synchronously instead of resetting the device. Moreover, the input parameter is now being validated if it's a power of 2. Without this, it can cause kernel panic. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
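The power-of-2 validation mentioned in 21823546 is commonly done with a single bit trick. The sketch below shows one way to write it; the function names and the choice of EINVAL as the error are assumptions, not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A power of two has exactly one bit set, so x & (x - 1) clears it to 0. */
bool
is_power_of_two(uint32_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/* Hypothetical setter: rejects invalid sizes instead of applying them. */
int
update_buf_ring_size(uint32_t *current, uint32_t requested)
{
	assert(current != NULL);
	if (!is_power_of_two(requested))
		return (EINVAL);
	*current = requested;
	return (0);
}
```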
#
7d8c4fee |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Rework ENA Rx queue size configuration This patch reworks how the Rx queue size is being reconfigured and how the information from the device is being processed. Reconfiguration of the queues and reset of the device in order to make the changes alive isn't the best approach. It can be done synchronously and it will let to pass information if the reconfiguration was successful to the user. It now is done in the ena_update_queue_size() function. To avoid reallocation of the ring buffer, statistic counters and the reinitialization of the mutexes when only new size has to be assigned, the io queues initialization function has been split into 2 stages: basic, which is just copying appropriate fields and the advanced, which allocates and inits more advanced structures for the IO rings. Moreover, now the max allowed Rx and Tx ring size is being kept statically in the adapter and the size of the variables holding those values has been changed to uint32_t everywhere. Information about IO queues size is now being logged in the up routine instead of the attach. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
02a2a7ce |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Expose argument names for non static ENA driver functions As functions which are declared in the header files are intended to be the interface and are going to be used by other files, it's better to include argument names in the definition, so the caller won't have to check the .c file in order to check their meaning and order. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
6959869e |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Use single global lock in the ENA driver Currently, the driver had 2 global locks - one was sx lock used for up/down synchronization and the second one was mutex, which was used for link configuration and timer service callout. It is better to have single lock for that. We cannot use mutex, as it can sleep and cause witness errors in up/down configuration, so sx lock seems to be the only choice. Callout cannot use sx lock, but the timer service is MP safe, so we just need to avoid race between ena_down() and ena_detach(). It can be avoided by acquiring sx lock. Simple macros were added that are encapsulating implementation of the lock and makes the code cleaner. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
7926bc44 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Add trigger reset function in the ENA driver As the reset triggering is no longer a simple macro that was just setting appropriate flag, the new function for triggering reset was added. It improves code readability a lot, as we are avoiding additional indentation. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
6c84cec3 |
|
26-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Enable Tx drops reporting in the ENA driver Tx drops statistics are fetched from HW every ena_keepalive_wd() call and are observable using one of the commands: * sysctl dev.ena.0.hw_stats.tx_drops * netstat -I ena0 -d Submitted by: Maciej Bielski <mba@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
04cf2b88 |
|
07-May-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Optimize ENA Rx refill for low memory conditions Sometimes, especially when there is not much memory in the system left, allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time and it is not guaranteed that it'll succeed. In that situation, the fallback will work, but if the refill needs to take a place for a lot of descriptors at once, the time spent in m_getjcl looking for memory can cause system unresponsiveness due to high priority of the Rx task. This can also lead to driver reset, because Tx cleanup routine is being blocked and timer service could detect that Tx packets aren't cleaned up. The reset routine can further create another unresponsiveness - Rx rings are being refilled there, so m_getjcl will again burn the CPU. This was causing NVMe driver timeouts and resets, because network driver is having higher priority. Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are enough - ENA MTU is being set to 9K anyway, so it's very unlikely that more space than 9KB will be needed. However, 9KB jumbo clusters can still cause issues, so by default the page size mbuf cluster will be used for the Rx descriptors. This can have a small (~2%) impact on the throughput of the device, so to restore original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to "1" in "/boot/loader.conf" file. As a part of this patch (important fix), the version of the driver was updated to v2.1.2. Submitted by: cperciva Reviewed by: Michal Krawczyk <mk@semihalf.com> Reviewed by: Ido Segev <idose@amazon.com> Reviewed by: Guy Tzalik <gtzalik@amazon.com> MFC after: 3 days PR: 225791, 234838, 235856, 236989, 243531 Differential Revision: https://reviews.freebsd.org/D24546
|
#
888810f0 |
|
24-Feb-2020 |
Marcin Wojtas <mw@FreeBSD.org> |
Rework and simplify Tx DMA mapping in ENA Driver working in LLQ mode in some cases can send only few last segments of the mbuf using DMA engine, and the rest of them are sent to the device using direct PCI transaction. To map the only necessary data, two DMA maps were used. That solution was very rough and was causing a bug - if both maps were used (head_map and seg_map), there was a race in between two flows on two queues and the device was receiving corrupted data which could be further received on the other host if the Tx cksum offload was enabled. As it's ok to map whole mbuf and then send to the device only needed segments, the design was simplified to use only single DMA map. The driver version was updated to v2.1.1 as it's important bug fix. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc.
|
#
1c028d72 |
|
01-Nov-2019 |
Warner Losh <imp@FreeBSD.org> |
Make validate_rx_req_id static inline because it uses other static inline functions. gcc complains about this, most likely due to the subtle differences between inline and static inline functions defined in headers.
|
#
2731abe8 |
|
31-Oct-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Update ENA version to v2.1.0 In this release the netmap support was introduced. Moreover, it is also now possible to use the LLQ mode of the driver on the arm64 AWS instances (A1 type). Differential Revision: https://reviews.freebsd.org/D21938 Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
6f2128c7 |
|
31-Oct-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Add support for ENA NETMAP Tx Two new tables are added to ena_tx_buffer structure: * netmap_map_seg stores DMA mapping structures, * netmap_buf_idx stores buffer indexes taken from the slots. When Tx resources are being set, the new mapping structures are created and netmap Tx rings are being reset. When Tx resources are being released, used netmap bufs are unmapped from DMA and then mapping structures are destroyed. When Tx interrupt occurs, ena_netmap_tx_irq is called. ena_netmap_txsync callback signals that there are new packets which should be transmitted. First, it fills ena_netmap_ctx. Then it performs two actions: * ena_netmap_tx_frames moves packets from netmap ring to NIC, * ena_netmap_tx_cleanup restores buffers from NIC and gives them back to the userspace app. 0 is returned in case of Tx error that could be handled by the driver. ena_netmap_tx_frames checks if there are packets ready for transmission. Then, for each of them, ena_netmap_tx_frame is called. If an error occurs, transmitting is stopped, but if the error was caused by the HW ring being full, information about that is not propagated to the userspace app. When all packets are ready, doorbell is written to NIC and netmap ring state is updated. Parsing of one packet is done by the ena_netmap_tx_frame function. First, it checks if number of slots does not exceed NIC limit. Invalid packets are being dropped and the error is propagated to the upper layer. As each netmap buffer has equal size, which is typically greater than 2KiB, there shouldn't be any packets which contain too many slots. Then, the ena_com_tx_ctx structure is being filled. As netmap does not support any hardware offloads, ena_com_tx_meta structure is set to zero. After that, ena_netmap_map_slots maps all memory slots for DMA. If the device works in the LLQ mode, the push header is being determined by checking if the header fits within the first slot. If so, the portion of data is being copied directly from the slot.
Otherwise, the data is copied to the intermediate buffer. First slots are treated the same as the others, because DMA mapping has no impact on LLQ mode. Index of each netmap buffer is taken from slot and stored in netmap_buf_idx array. In case of mapping error, memory is unmapped and packets are put back to the netmap ring. ena_netmap_tx_cleanup performs out of order cleanup of sent buffers. First, req_id is taken and is validated. As validate_tx_req_id from ena.c is specific to the kernel's mbuf, another implementation is provided. Each req_id is cleaned up by ena_netmap_tx_clean_one function. Buffers are being unmapped from DMA and put back to netmap ring. In the end, state of netmap and NIC rings are being updated. Differential Revision: https://reviews.freebsd.org/D21936 Submitted by: Rafal Kozik <rk@semihalf.com> Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
9a0f2079 |
|
31-Oct-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Add support for ENA NETMAP Rx Most of code used for Rx ring initialization could be reused in NETMAP. Reset of NETMAP ring and new alloc method was added. The driver decides whether to use kernel mbufs or NETMAP slots based on IFCAP_NETMAP flag. This allows reusing ena_refill_rx_bufs, which provides proper handling of Rx out of order completion. ena_netmap_alloc_rx_slot takes exactly the same arguments as ena_alloc_rx_mbuf, but instead of allocating one mbuf it takes one slot from NETMAP ring. Based on queue id proper netmap_ring is found. As NETMAP provides the "partial opening" feature not all of the rings are available. Unused ones point to an invalid ring. If there is available slot, it is taken from the ring. Its buffer is mapped to DMA and its index is stored in ena_rx_buffer field in ena_rx_buffer structure. Then ena_buf is filled with addresses and ring state is updated. Cleanup is handled by ena_netmap_free_rx_slot. It unmaps DMA and returns buffer to ring. As we could not return more bufs than we have taken and we should not override occupied slots, buf_index should be 0. It is being checked by assertion. ena_netmap_rxsync callback puts received packets back to NETMAP ring and passes them to user space by updating ring pointers. First it fills ena_netmap_ctx. Then it performs two actions: * ena_netmap_rx_frames moves received frames from NIC to NETMAP ring, * ena_netmap_rx_cleanup fills NIC ring with slots released by userspace app. In case of Rx error that could be handled by NIC driver (for example by performing reset) rx sync should return 0. ena_netmap_rx_frames first checks if NETMAP ring is in consistent state and then in the loop receives new frames. When all available frames are taken nr_hwtail is updated. Receiving one frame is handled by ena_netmap_rx_frame. If no error occurs, each descriptor is loaded by ena_netmap_rx_load_desc function. If a packet takes more than one segment, the NS_MOREFRAG flag must be set in all but the last slot.
In case of a wrong req_id, the packet is removed from the NETMAP ring. If the packet is successfully received, counters are updated. Refilling of the NIC ring is performed by the ena_netmap_rx_cleanup function. It calculates the number of available slots and calls ena_refill_rx_bufs with the proper number. Differential Revision: https://reviews.freebsd.org/D21935 Submitted by: Rafal Kozik <rk@semihalf.com> Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
38c7b965 |
|
31-Oct-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Split Rx/Tx from initialization code in ENA driver Move Rx/Tx routines to separate file. Some functions: * ena_restore_device, * ena_destroy_device, * ena_up, * ena_down, * ena_refill_rx_bufs could be reused in upcoming netmap code in the driver. To make it possible, they were moved to ena.h header. Differential Revision: https://reviews.freebsd.org/D21933 Submitted by: Rafal Kozik <rk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
9d0073e4 |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Update ENA version to v2.0.0 ENAv2 introduces many new features, bug fixes and improvements. Main new features are LLQ (Low Latency Queues) and independent queues reconfiguration using sysctl commands. The year in copyright notice was updated to 2019. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
32f63fa7 |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Split ENA reset routine into restore and destroy stages For alignment with the Linux driver and better handling of ena_detach(), the reset now calls ena_device_restore() and ena_device_destroy(). ena_device_destroy() is also called on ena_detach(), so the code is more readable. The watchdog is now activated after reset only if it was active before. Additional checks were added to ensure that there is no race with the link state change AENQ handler. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
fd43fd2a |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Use bitfield for storing global ENA device states As the ENA can have multiple states turned on/off, it is more convenient to store them in a single bitfield instead of multiple boolean variables. The FreeBSD bitset API was used for the bitfield implementation, as it provides a flexible structure together with an API that also supports atomic bitfield operations. For better readability, basic macros from the API were wrapped in custom ENA_FLAG_* macros, which fill in the common parameters for all calls. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
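The idea behind fd43fd2a, one flag word with thin wrapper macros instead of several booleans, can be sketched in portable C. This sketch deliberately swaps in C11 `<stdatomic.h>` for the FreeBSD bitset API so that it builds outside the kernel. The `DEMO_*` names and the flag list are hypothetical and are not the driver's ENA_FLAG_* definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits; the driver's real flag names may differ. */
enum demo_flag {
	DEMO_FLAG_DEVICE_RUNNING,
	DEMO_FLAG_LINK_UP,
	DEMO_FLAG_TRIGGER_RESET,
	DEMO_FLAG_COUNT
};

/* All flags must fit in the single flag word used below. */
static_assert(DEMO_FLAG_COUNT <= sizeof(unsigned int) * 8,
    "flags must fit in one word");

struct demo_adapter {
	atomic_uint flags;	/* one bit per state, replacing booleans */
};

/* The wrappers supply the common parameter: the adapter's flag word. */
#define	DEMO_FLAG_SET_ATOMIC(bit, ad)					\
	atomic_fetch_or(&(ad)->flags, 1u << (bit))
#define	DEMO_FLAG_CLEAR_ATOMIC(bit, ad)					\
	atomic_fetch_and(&(ad)->flags, ~(1u << (bit)))
#define	DEMO_FLAG_ISSET(bit, ad)					\
	((atomic_load(&(ad)->flags) & (1u << (bit))) != 0)
```

With wrappers like these, a call site reads as `DEMO_FLAG_ISSET(DEMO_FLAG_LINK_UP, adapter)` and never has to spell out the flag word or the bit arithmetic.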
#
af66d7d0 |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Add additional doorbells on ENA Tx path The new ENA HAL introduces an API that can determine on the Tx path whether a doorbell is needed. That way, it can tell the driver that it should ring a doorbell. The old threshold value wasn't removed, as not all HW supports this feature, so it was reworked to also work with the new API. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
82f5a792 |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Limit maximum size of Rx refill threshold in ENA The Rx ring size can be as high as 8k. Because of that, we want to cap the cleanup threshold at a maximum value of 256. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
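The cap described in 82f5a792 amounts to clamping a ring-size-derived threshold. The sketch below shows that arithmetic under one stated assumption: the commit message gives only the 256 limit, so the divisor of 8 is hypothetical, as are the `DEMO_*` names and the function name.

```c
#include <assert.h>
#include <stdint.h>

/* Cap from the commit message: the threshold never exceeds 256. */
#define	DEMO_RX_REFILL_THRESH_MAX	256u
/* Hypothetical divisor: refill once 1/8 of the ring is free. */
#define	DEMO_RX_REFILL_THRESH_DIVIDER	8u

/* Return the number of free Rx descriptors that triggers a refill. */
static uint32_t
demo_rx_refill_threshold(uint32_t ring_size)
{
	uint32_t thresh;

	assert(ring_size > 0);
	thresh = ring_size / DEMO_RX_REFILL_THRESH_DIVIDER;
	return (thresh < DEMO_RX_REFILL_THRESH_MAX ?
	    thresh : DEMO_RX_REFILL_THRESH_MAX);
}
```

Under these assumptions a 1024-entry ring refills after 128 free descriptors, while an 8k ring is held to 256 instead of 1024, so large rings do not wait too long before buffers are returned to the device.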
#
4fa9e02d |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Add support for the LLQv2 and WC in ENA LLQ (Low Latency Queue) is a feature that allows pushing the header directly to the device through PCI before DMA is even triggered. It reduces latency, because the device can start preparing the packet before the payload is sent through DMA. To speed up sending data through PCI, Write Combining is enabled, which allows hardware to buffer data before sending it over PCI and so reduces the number of PCI IO operations. ENAv2 uses a special descriptor for the negotiation of the LLQ. Currently, only the default configuration is supported. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
5cb9db07 |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Lock optimization in ENA Handle IO interrupts using a filter routine. That way, the main cleanup task could be moved to a separate thread using taskqueue. The deferred Rx cleanup task was removed, and now the cleanup task is being called instead. That way, the Rx lock could be removed. In addition, queue management (wake up and stop TX ring) was added, so the TX cleanup task can be performed mostly lockless. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
6064f289 |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Add tuneable drbr ring size and hw queues depth for ENA The driver now supports per-adapter tuning of the buffer ring size and the HW Rx ring size. This can be done using the sysctl node dev.ena.X. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
d12f7bfc |
|
30-May-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Check for missing MSI-x and Tx completions in ENA If the first MSI-x interrupt is not executed, the timer service will detect that and trigger a device reset. The check for missing Tx completions was reworked, so it also checks for missing interrupts. The number of missing Tx completions can be checked after the loop instead of on every iteration. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
c2e7e247 |
|
21-Mar-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Prevent double activation of admin interrupt in ENA The resource is already activated in bus_alloc_resource(), because the RF_ACTIVE flag is passed. Double activation on arm64 causes a kernel panic. The driver version was upgraded to 0.8.4. Submitted by: Michal Krawczyk <mk@semihalf.com> Reported-by: Greg V <greg@unrelenting.technology> Tested-by: cperciva, Greg V <greg@unrelenting.technology> Obtained from: Semihalf MFC after: 2 weeks Sponsored by: Amazon, Inc. Differential revision: https://reviews.freebsd.org/D19655
|
#
1d65b4c0 |
|
15-Feb-2019 |
Marcin Wojtas <mw@FreeBSD.org> |
Do not use ntc for obtaining buffer on Rx in the ENA In out-of-order mode, Rx buffers are accessed by req_id. Accessing and validating the mbuf using ntc causes a false error. Increase driver revision after latest RX OOO completion fixes. Submitted by: Rafal Kozik <rk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. MFC after: 1 week
|
#
40abe76b |
|
08-Jul-2018 |
Warner Losh <imp@FreeBSD.org> |
Add PNP info to PCI attachment of ena driver Make unsigned values uint16_t for pnp table. They are properly uint16_t because they are 16-bit PCI IDs. The PNP_INFO language has no type for bare unsigned. Reviewed by: imp, chuck Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com> Sponsored by: Google, Inc. (GSoC 2018) Pull Request: https://github.com/bsdimp/freebsd/pull/5
|
#
fbb0ed71 |
|
10-May-2018 |
Marcin Wojtas <mw@FreeBSD.org> |
Upgrade ENA version to v0.8.1 Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
4727bda6 |
|
09-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Allow usage of more RX descriptors than 1 in ENA driver Using only 1 descriptor on RX could be an issue if the system is low on resources and cannot provide the driver with large chunks of contiguous memory. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: byenduri_gmail.com Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12871
|
#
3cfadb28 |
|
09-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Read max MTU from the ENA device The device now provides the driver with the max available MTU value it can handle. The function setting the MTU for the interface was simplified and reworked to follow these changes. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: byenduri_gmail.com Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12870
|
#
5a990212 |
|
08-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Cleanup of the ENA driver header file Remove unused macros and fields - some of them were only initialized, without further usage. Implement minor style fixes and add required comments. While here, add the missing TX completion counter, which existed but mistakenly remained unused. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12864
|
#
8805021a |
|
08-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Allow partial MSI-x allocation in ENA driver The situation where part of the MSI-x vectors were not configured properly was not handled correctly. Now the driver reduces the number of queues to reflect the number of existing and properly configured MSI-x vectors. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: byenduri_gmail.com Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12863
|
#
0052f3b5 |
|
08-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Remove deprecated and unused counters in ENA driver A few counters were imported from the Linux driver and never used, because of differences between the Linux and FreeBSD APIs. Queue stops and resumes are no longer supported by the driver, and the counters were incremented indicating false events. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: rlibby Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12862
|
#
0bdffe59 |
|
09-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Refactor style of the ENA driver * Change all conditional checks in "if" statements to boolean expressions * Initialize variables with too complex values outside the declaration * Fix indentations * Move code associated with sysctls to ena_sysctl.c file * For consistency, remove unnecessary "return" from void functions * Use if_getdrvflags() function instead of accessing variable directly Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12860
|
#
efe6ab18 |
|
09-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Check for Rx ring state to prevent stall in the ENA driver When the Rx ring is full and the driver fails to allocate Rx mbufs, the ring could stall. Keep alive checks the Rx ring state every second, and if it is full for two cycles, it triggers the rx_cleanup routine in another thread. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: byenduri_gmail.com Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12856
|
#
43fefd16 |
|
09-Nov-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Add RX OOO completion feature The RX out of order completion feature allows RX descriptors to be completed out of order by keeping track of all free descriptors in a separate array. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: byenduri_gmail.com Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D12855
|
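The bookkeeping described in 43fefd16, a separate array that tracks which descriptors are free, can be sketched as a small circular list of request ids. This is an illustration under assumptions rather than the driver's implementation: the ring size, structure layout and all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define	DEMO_RING_SIZE	4u	/* hypothetical; must be a power of two */

struct demo_rx_ring {
	uint16_t free_ids[DEMO_RING_SIZE]; /* req_ids available for refill */
	uint16_t next_to_use;	/* next free_ids entry to hand out */
	uint16_t next_to_clean;	/* next free_ids entry to overwrite */
};

static void
demo_ring_init(struct demo_rx_ring *r)
{
	uint16_t i;

	for (i = 0; i < DEMO_RING_SIZE; i++)
		r->free_ids[i] = i;
	r->next_to_use = 0;
	r->next_to_clean = 0;
}

/* Refill path: take the next free req_id for a newly posted buffer. */
static uint16_t
demo_take_req_id(struct demo_rx_ring *r)
{
	uint16_t id;

	id = r->free_ids[r->next_to_use];
	r->next_to_use = (r->next_to_use + 1) & (DEMO_RING_SIZE - 1);
	return (id);
}

/* Completion path: the device may finish req_ids in any order. */
static void
demo_return_req_id(struct demo_rx_ring *r, uint16_t id)
{

	assert(id < DEMO_RING_SIZE);
	r->free_ids[r->next_to_clean] = id;
	r->next_to_clean = (r->next_to_clean + 1) & (DEMO_RING_SIZE - 1);
}
```

Because buffers are looked up by the returned id rather than by ring position, the order in which the device completes descriptors no longer has to match the order in which they were posted.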
#
30217e2d |
|
31-Oct-2017 |
Marcin Wojtas <mw@FreeBSD.org> |
Rework counting of hardware statistics in ENA driver Do not read all statistics from the device; instead, count them in the driver, except for RX drops, which are received directly from the NIC in the AENQ descriptor. Submitted by: Michal Krawczyk <mk@semihalf.com> Reviewed by: imp Obtained from: Semihalf Sponsored by: Amazon.com, Inc. Differential Revision: https://reviews.freebsd.org/D12852
|
#
1b069f1c |
|
03-Jul-2017 |
Zbigniew Bodek <zbb@FreeBSD.org> |
Replace mbuf defragmentation with collapse Collapse should be more effective than defragmentation. Added missing declaration of ena_check_and_collapse_mbuf(). Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon.com Inc.
|
#
8a573700 |
|
03-Jul-2017 |
Zbigniew Bodek <zbb@FreeBSD.org> |
Fix creation of dma tags and TSO settings TSO settings were not reflecting real HW capabilities. DMA tags were created with the wrong window: the high address was the same as the low one, so the exclusion window did not work. Capabilities of the TX dma transaction were not set properly: the TSO max size was increased and the size of one segment was adjusted. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon.com Inc.
|
#
b9252a88 |
|
30-May-2017 |
Zbigniew Bodek <zbb@FreeBSD.org> |
Move ENA's hw stats updating routine to separate task Initially, stats were updated each time the OS requested the first statistic. To read statistics from hw, a condvar was used. cv_timedwait cannot be called when an unsleepable lock is held, and this happens when FreeBSD requests a statistic. A separate task now reads statistics from the NIC every second. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon.com Inc. Differential revision: https://reviews.freebsd.org/D10926
|
#
1e9fb899 |
|
30-May-2017 |
Zbigniew Bodek <zbb@FreeBSD.org> |
Add mbuf defragmentation to the ENA driver When an mbuf chain is too long and the device cannot handle that number of segments in a DMA transaction, the mbuf chain will be defragmented. Initially, the driver dropped all mbuf chains that exceeded the supported number of segments. Submitted by: Michal Krawczyk <mk@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon.com Inc. Differential revision: https://reviews.freebsd.org/D10923
|
#
9b8d05b8 |
|
22-May-2017 |
Zbigniew Bodek <zbb@FreeBSD.org> |
Add support for Amazon Elastic Network Adapter (ENA) NIC ENA is a networking interface designed to make good use of modern CPU features and system architectures. The ENA device exposes a lightweight management interface with a minimal set of memory mapped registers and extendable command set through an Admin Queue. The driver supports a range of ENA devices, is link-speed independent (i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has a negotiated and extendable feature set. Some ENA devices support SR-IOV. This driver is used for both the SR-IOV Physical Function (PF) and Virtual Function (VF) devices. ENA devices enable high speed and low overhead network traffic processing by providing multiple Tx/Rx queue pairs (the maximum number is advertised by the device via the Admin Queue), a dedicated MSI-X interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized data placement. The ENA driver supports industry standard TCP/IP offload features such as checksum offload and TCP transmit segmentation offload (TSO). Receive-side scaling (RSS) is supported for multi-core scaling. The ENA driver and its corresponding devices implement health monitoring mechanisms such as watchdog, enabling the device and driver to recover in a manner transparent to the application, as well as debug logs. Some of the ENA devices support a working mode called Low-latency Queue (LLQ), which saves several more microseconds. This feature will be implemented for driver in future releases. Submitted by: Michal Krawczyk <mk@semihalf.com> Jakub Palider <jpa@semihalf.com> Jan Medala <jan@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon.com Inc. Differential revision: https://reviews.freebsd.org/D10427
|