History log of /linux-master/drivers/dma/idxd/Makefile
# 979f6ded 05-Dec-2023 Tom Zanussi <tom.zanussi@linux.intel.com>

dmaengine: idxd: Add support for device/wq defaults

Add a load_device_defaults() function pointer to struct
idxd_driver_data, which if defined, will be called when an idxd device
is probed and will allow the idxd device to be configured with default
values.

The load_device_defaults() function is passed an idxd device to work
with to set specific device attributes.

Also add a load_device_defaults() implementation for IAA devices;
future patches can add default functions for other device types such
as DSA.

The way idxd device probing works, if the device configuration is
valid at that point, e.g. at least one workqueue and engine is
properly configured, then the device will be enabled and ready to go.

The IAA implementation, idxd_load_iaa_device_defaults(), configures a
single workqueue (wq0) for each device with the following default
values:

mode "dedicated"
threshold 0
size Total WQ Size from WQCAP
priority 10
type IDXD_WQT_KERNEL
group 0
name "iaa_crypto"
driver_name "crypto"

Note that this adds another configuration step for any users that
want to configure their own devices/workqueues with something
different: they'll first need to disable (in the case of IAA) wq0 and
the device itself before they can set their own attributes and
re-enable them, since they've already been auto-enabled. Note also
that in order for the new configuration to be applied to the
deflate-iaa crypto algorithm, the iaa_crypto module needs to
unregister the old version, which is accomplished by removing the
iaa_crypto module, and re-register it with the new configuration by
reinserting the iaa_crypto module.
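
As a rough sketch of that extra step, using accel-config and assuming
an IAA device enumerated as iax1 with default workqueue wq1.0 (device
names, wq numbering, and option spellings depend on the system and
the accel-config version):

# rmmod iaa_crypto
# accel-config disable-wq iax1/wq1.0
# accel-config disable-device iax1
# accel-config config-wq --group-id=0 --mode=dedicated --priority=10 \
      --type=kernel --name=iaa_crypto --driver-name=crypto iax1/wq1.0
# accel-config enable-device iax1
# accel-config enable-wq iax1/wq1.0
# modprobe iaa_crypto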

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 88928add 24-Sep-2023 Fenghua Yu <fenghua.yu@intel.com>

dmaengine: idxd: Register dsa_bus_type before registering idxd sub-drivers

idxd sub-drivers belong to bus dsa_bus_type. Thus, dsa_bus_type must be
registered in dsa bus init before idxd drivers can be registered.

But the order is wrong when both idxd and idxd_bus are built-in
drivers. In this case, the idxd driver is compiled and linked before
the idxd_bus driver. Since the initcall order is determined by the
link order, idxd sub-drivers are registered in the idxd initcall
before dsa_bus_type is registered in the idxd_bus initcall. The idxd
initcall fails:

[ 21.562803] calling idxd_init_module+0x0/0x110 @ 1
[ 21.570761] Driver 'idxd' was unable to register with bus_type 'dsa' because the bus was not initialized.
[ 21.586475] initcall idxd_init_module+0x0/0x110 returned -22 after 15717 usecs
[ 21.597178] calling dsa_bus_init+0x0/0x20 @ 1

To fix the issue, compile and link idxd_bus driver before idxd driver
to ensure the right registration order.

Fixes: d9e5481fca74 ("dmaengine: dsa: move dsa_bus_type out of idxd driver to standalone")
Reported-by: Michael Prinke <michael.prinke@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Lijun Pan <lijun.pan@intel.com>
Tested-by: Lijun Pan <lijun.pan@intel.com>
Link: https://lore.kernel.org/r/20230924162232.1409454-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 5fbe6503 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add debugfs for event log dump

Add debugfs entry to dump the content of the event log for debugging. The
function will dump all non-zero entries in the event log. It will note
which entries are processed and which entries are still pending processing
at the time of the dump. The entries may not always be in chronological
order because the log is a circular buffer.
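
Assuming the dump is exposed as an event_log file under a per-device
idxd debugfs directory (the exact path may differ), it can be read
with something like:

# cat /sys/kernel/debug/idxd/dsa0/event_log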

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 6e7f3ee9 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: move dsa_drv support to compatible mode

The original architecture of /sys/bus/dsa invented a scheme whereby
a single entry in the list of bus drivers, /sys/bus/dsa/drivers/dsa,
handled all device types and internally routed them to different
drivers. Those internal drivers were invisible to userspace.

With the idxd driver transitioned to a proper bus device-driver model,
the legacy behavior needs to be preserved because it is exposed to
user space via sysfs. Create a compat driver to provide the legacy
behavior for /sys/bus/dsa/drivers/dsa. This should satisfy the user
tool accel-config v3.2 or earlier, where this behavior is expected.
If the distro has a newer accel-config then the legacy mode does
not need to be enabled.

When the compat driver binds the device (e.g. dsa0) to the dsa driver,
it will be bound to the new idxd_drv. The wq device (e.g. wq0.0) will
be bound to either the dmaengine_drv or the user_drv. The dsa_drv
becomes a routing mechanism for the new drivers. It will not support
additional external drivers that are implemented later.
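
For illustration, the preserved legacy flow looks roughly like the
following, assuming a device enumerated as dsa0 with workqueue wq0.0
(older accel-config versions perform these writes internally):

# echo dsa0 > /sys/bus/dsa/drivers/dsa/bind
# echo wq0.0 > /sys/bus/dsa/drivers/dsa/bind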

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637468705.744545.4399080971745974435.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# d9e5481f 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: dsa: move dsa_bus_type out of idxd driver to standalone

In preparation for dsa_drv compat support to be built-in, move the bus
code to its own compilation unit. A follow-on patch adds the compat
implementation. Recall that the compat implementation allows for the
deprecated / omnibus dsa_drv binding scheme rather than the idiomatic
organization of a full-fledged bus driver per driver type.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637468142.744545.2811632736881720857.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 81dd4d4d 24-Apr-2021 Tom Zanussi <tom.zanussi@linux.intel.com>

dmaengine: idxd: Add IDXD performance monitor support

Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.

The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework. It has several features in
common with the existing uncore pmu support:

- it does not support sampling
- it does not support per-thread counting

However it also has some unique features not present in the core and
uncore support:

- all general-purpose counters are identical, thus no event constraints
- operation is always system-wide

While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket. IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).

More specifically, the perf userspace tool by default opens a counter
for each cpu for an event. However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it. When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask. This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events. In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.
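
For example, the cpumask would be read through the standard perf sysfs
location, roughly as below (the pmu name and chosen cpu are
illustrative):

# cat /sys/bus/event_source/devices/dsa0/cpumask
0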

The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function. During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from the idxd device, and registers itself under the Linux perf
framework.

The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver. The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).

With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s). The event(s) are opened and
started, and following termination of the perf command, they're
stopped. At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle. An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.

Below are a couple of examples of perf usage for monitoring DSA events.

The following monitors all events in the 'engine' category. Because
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.

Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:

event 0x1: total input data processed, in 32-byte units
event 0x2: total data written, in 32-byte units
event 0x4: number of work descriptors that read the source
event 0x8: number of work descriptors that write the destination
event 0x10: number of work descriptors dispatched from batch descriptors
event 0x20: number of work descriptors dispatched from work queues

# perf stat -e dsa0/event=0x1,event_category=0x1/,
dsa0/event=0x2,event_category=0x1/,
dsa0/event=0x4,event_category=0x1/,
dsa0/event=0x8,event_category=0x1/,
dsa0/event=0x10,event_category=0x1/,
dsa0/event=0x20,event_category=0x1/
modprobe dmatest channel=dma0chan0 timeout=2000
iterations=19 run=1 wait=1

Performance counter stats for 'system wide':

5,332 dsa0/event=0x1,event_category=0x1/
5,327 dsa0/event=0x2,event_category=0x1/
19 dsa0/event=0x4,event_category=0x1/
19 dsa0/event=0x8,event_category=0x1/
0 dsa0/event=0x10,event_category=0x1/
19 dsa0/event=0x20,event_category=0x1/

21.977436186 seconds time elapsed

The command below illustrates filter usage with a simple example. It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]). In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria. In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page sizes between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count. Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.

# perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
filter_eng=0x1,event=0x8,event_category=0x3/
modprobe dmatest channel=dma0chan0 timeout=2000
iterations=19 run=1 wait=1

Performance counter stats for 'system wide':

19 dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
filter_eng=0x1,event=0x8,event_category=0x3/

21.865914091 seconds time elapsed

The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 42d279f9 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add char driver to expose submission portal to userland

Create a char device region that will allow acquisition of user
portals so that applications can submit DMA operations. A char device
will be created per work queue that gets exposed. The workqueue type
"user" is used to mark a work queue for a user char device. For
example, if workqueue 0 of DSA device 0 is marked for a char device,
then a device node at /dev/dsa/wq0.0 will be created.
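
As a rough sketch, assuming DSA device dsa0 with workqueue wq0.0 and
the wq sysfs attributes described in the configuration patch (the
attribute values shown are illustrative), the wq would be marked for
the char device and, once the device and wq are enabled, the node
shows up under /dev/dsa:

# echo user > /sys/bus/dsa/devices/dsa0/wq0.0/type
# echo myapp > /sys/bus/dsa/devices/dsa0/wq0.0/name
# ls /dev/dsa
wq0.0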

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026985.73301.976523230037106742.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 8f47d1a5 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: connect idxd to dmaengine subsystem

Add plumbing for the dmaengine subsystem connection. The driver
registers a DMA device per DSA device. The channels are dynamically
registered when a workqueue is configured to be of "kernel:dmaengine"
type.
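
A rough sketch of setting up such a workqueue, assuming DSA device
dsa0 with workqueue wq0.0 (attribute values are illustrative; the wq
still needs to be enabled as described in the configuration patch),
and then checking that a DMA channel was registered:

# echo kernel > /sys/bus/dsa/devices/dsa0/wq0.0/type
# echo dmaengine > /sys/bus/dsa/devices/dsa0/wq0.0/name
# ls /sys/class/dma
dma0chan0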

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026376.73301.13867988830650740445.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# d1dfe5b8 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add descriptor manipulation routines

This commit adds helper functions for DSA descriptor allocation,
submission, and free operations.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965025757.73301.12692876585357550065.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# c52ca478 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add configuration component of driver

The device is left unconfigured when the driver is loaded. Various
components are configured via the driver sysfs attributes. Once
configuration is done, the device can be enabled by writing the device
name to the bind attribute of the device driver sysfs node. Disabling
can be done similarly. The individual work queues can also be enabled
and disabled through the bind/unbind attributes. A hierarchy is
constructed through the struct device framework in order to provide
appropriate configuration points and to expose device state and
status. This hierarchy is presented off the virtual DSA bus.

i.e. /sys/bus/dsa/...
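
A minimal sketch of that flow, assuming a device enumerated as dsa0
with workqueue wq0.0 (the attributes shown are illustrative examples
of the sysfs configuration points):

# echo dedicated > /sys/bus/dsa/devices/dsa0/wq0.0/mode
# echo 16 > /sys/bus/dsa/devices/dsa0/wq0.0/size
# echo dsa0 > /sys/bus/dsa/drivers/dsa/bind
# echo wq0.0 > /sys/bus/dsa/drivers/dsa/bind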

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965024585.73301.6431413676230150589.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# bfe1d560 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Init and probe for Intel data accelerators

The idxd driver introduces the Intel Data Streaming Accelerator [1]
that will be available on future Intel Xeon CPUs. One of the kernel
access points for the driver is through the dmaengine subsystem. It
will initially provide the DMA copy service to the kernel.

Some of the main features introduced with this accelerator are shared
virtual memory (SVM) support and descriptor submission using the Intel
CPU instructions movdir64b and enqcmds. There will be additional
accelerator devices that share the same driver, with variations in
capabilities.
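
Whether a given CPU advertises those submission instructions can be
checked from userspace, e.g. (flag spellings follow /proc/cpuinfo):

# grep -oE 'movdir64b|enqcmd' /proc/cpuinfo | sort -u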

This commit introduces the probe and initialization component of the
driver.

[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>