History log of /linux-master/drivers/dma/idxd/init.c
# d3ea125d 09-Feb-2024 Fenghua Yu <fenghua.yu@intel.com>

dmaengine: idxd: Ensure safe user copy of completion record

If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
event log cache to user triggers a kernel bug.

[ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
[ 1987.170845] ------------[ cut here ]------------
[ 1987.176086] kernel BUG at mm/usercopy.c:102!
[ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
[ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
[ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
[ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
[ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
[ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
[ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
[ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
[ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
[ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
[ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
[ 1987.284710] FS: 0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
[ 1987.293850] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
[ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1987.324527] PKRU: 55555554
[ 1987.327622] Call Trace:
[ 1987.330424] <TASK>
[ 1987.332826] ? show_regs+0x6e/0x80
[ 1987.336703] ? die+0x3c/0xa0
[ 1987.339988] ? do_trap+0xd4/0xf0
[ 1987.343662] ? do_error_trap+0x75/0xa0
[ 1987.347922] ? usercopy_abort+0x72/0x90
[ 1987.352277] ? exc_invalid_op+0x57/0x80
[ 1987.356634] ? usercopy_abort+0x72/0x90
[ 1987.360988] ? asm_exc_invalid_op+0x1f/0x30
[ 1987.365734] ? usercopy_abort+0x72/0x90
[ 1987.370088] __check_heap_object+0xb7/0xd0
[ 1987.374739] __check_object_size+0x175/0x2d0
[ 1987.379588] idxd_copy_cr+0xa9/0x130 [idxd]
[ 1987.384341] idxd_evl_fault_work+0x127/0x390 [idxd]
[ 1987.389878] process_one_work+0x13e/0x300
[ 1987.394435] ? __pfx_worker_thread+0x10/0x10
[ 1987.399284] worker_thread+0x2f7/0x420
[ 1987.403544] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 1987.409171] ? __pfx_worker_thread+0x10/0x10
[ 1987.414019] kthread+0x107/0x140
[ 1987.417693] ? __pfx_kthread+0x10/0x10
[ 1987.421954] ret_from_fork+0x3d/0x60
[ 1987.426019] ? __pfx_kthread+0x10/0x10
[ 1987.430281] ret_from_fork_asm+0x1b/0x30
[ 1987.434744] </TASK>

The issue arises because event log cache is created using
kmem_cache_create() which is not suitable for user copy.

Fix the issue by creating event log cache with
kmem_cache_create_usercopy(), ensuring safe user copy.
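
The check that fires in the trace above can be modelled in user space. The
sketch below is illustrative only, not the kernel's code: it assumes the
hardened-usercopy rule that a copy from a slab object must lie entirely
inside the cache's whitelisted window, and that a cache made with plain
kmem_cache_create() has an empty window. The struct, the function name and
the 128-byte window are hypothetical; only offset 74 and size 31 come from
the log.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the usercopy window a slab cache carries. */
struct cache_window {
	size_t useroffset;	/* start of the region that may be copied to user space */
	size_t usersize;	/* length of that region; 0 means nothing is whitelisted */
};

/* Return true if [offset, offset + n) lies entirely inside the window. */
static bool usercopy_allowed(const struct cache_window *w, size_t offset, size_t n)
{
	if (offset < w->useroffset)
		return false;
	if (offset - w->useroffset > w->usersize)
		return false;
	/* Bytes left in the window, counted from the requested offset. */
	return n <= w->usersize - (offset - w->useroffset);
}
```

With an empty window the 31-byte copy at offset 74 is rejected, which matches
the reported failure; with a window that covers the copied region it passes.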

Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items")
Reported-by: Tony Zhu <tony.zhu@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Lijun Pan <lijun.pan@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20240209191412.1050270-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 979f6ded 05-Dec-2023 Tom Zanussi <tom.zanussi@linux.intel.com>

dmaengine: idxd: Add support for device/wq defaults

Add a load_device_defaults() function pointer to struct
idxd_driver_data, which if defined, will be called when an idxd device
is probed and will allow the idxd device to be configured with default
values.

The load_device_defaults() function is passed an idxd device to work
with to set specific device attributes.

Also add a load_device_defaults() implementation for IAA devices; future
patches would add default functions for other device types such as
DSA.

The way idxd device probing works, if the device configuration is
valid at that point (e.g. at least one workqueue and engine is properly
configured), then the device will be enabled and ready to go.

The IAA implementation, idxd_load_iaa_device_defaults(), configures a
single workqueue (wq0) for each device with the following default
values:

mode "dedicated"
threshold 0
size Total WQ Size from WQCAP
priority 10
type IDXD_WQT_KERNEL
group 0
name "iaa_crypto"
driver_name "crypto"
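
The defaults listed above can be pictured as one initialised record. This is
a sketch for illustration only: the struct, the enum and the function name
are hypothetical and are not the driver's own definitions, and total_wq_size
stands in for the Total WQ Size value read from WQCAP.

```c
/* Hypothetical workqueue type, standing in for IDXD_WQT_KERNEL and friends. */
enum wq_type_model { WQT_MODEL_NONE, WQT_MODEL_KERNEL, WQT_MODEL_USER };

/* Hypothetical mirror of the default values listed above. */
struct wq_defaults {
	const char *mode;
	unsigned int threshold;
	unsigned int size;
	unsigned int priority;
	enum wq_type_model type;
	int group;
	const char *name;
	const char *driver_name;
};

/* Build the wq0 defaults for one IAA device. */
static struct wq_defaults iaa_wq0_defaults(unsigned int total_wq_size)
{
	struct wq_defaults d = {
		.mode = "dedicated",
		.threshold = 0,
		.size = total_wq_size,	/* whole device capacity goes to wq0 */
		.priority = 10,
		.type = WQT_MODEL_KERNEL,
		.group = 0,
		.name = "iaa_crypto",
		.driver_name = "crypto",
	};

	return d;
}
```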

Note that this now adds another configuration step for any users that
want to configure their own devices/workqueues with something different
in that they'll first need to disable (in the case of IAA) wq0 and the
device itself before they can set their own attributes and re-enable,
since they've already been auto-enabled. Note also that in order
for the new configuration to be applied to the deflate-iaa crypto
algorithm the iaa_crypto module needs to unregister the old version,
which is accomplished by removing the iaa_crypto module, and
re-registering it with the new configuration by reinserting the
iaa_crypto module.

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# f5ccf55e 09-Aug-2023 Jacob Pan <jacob.jun.pan@linux.intel.com>

dmaengine/idxd: Re-enable kernel workqueue under DMA API

Kernel workqueues were disabled due to flawed use of kernel VA and SVA
API. Now that we have the support for attaching PASID to the device's
default domain and the ability to reserve global PASIDs from SVA APIs,
we can re-enable the kernel work queues and use them under DMA API.

We also use non-privileged access for in-kernel DMA to be consistent
with the IOMMU settings. Consequently, interrupt for user privilege is
enabled for work completion IRQs.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Tested-by: Tony Zhu <tony.zhu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-9-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 2442b747 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: process batch descriptor completion record faults

Add event log processing for faulting of user batch descriptor completion
record.

When encountering an event log entry for a page fault on a completion
record, the driver is expected to do the following:
1. If the "first error in batch" bit in event log entry error info is
set, discard any previously recorded errors associated with the
"batch identifier".
2. Fix the page fault according to the fault address in the event log. If
successful, write the completion record to the fault address in user space.
3. If an error is encountered while writing the completion record and it is
associated to a descriptor in the batch, the driver associates the error
with the batch identifier of the event log entry and tracks it until the
event log entry for the corresponding batch desc is encountered.

While processing an event log entry for a batch descriptor with error
indicating that one or more descs in the batch had event log entries,
the driver will do the following before writing the batch completion
record:
1. If the status field of the completion record is 0x1, the driver will
change it to error code 0x5 (one or more operations in batch completed
with status not successful) and change the result field to 1.
2. If the status is error code 0x6 (page fault on batch descriptor list
address), change the result field to 1.
3. If status is any other value, the completion record is not changed.
4. Clear the recorded error in preparation for next batch with same batch
identifier.

The result field is for user software to determine whether to set the
"Batch Error" flag bit in the descriptor for continuation of partial
batch descriptor completion. See DSA spec 2.0 for additional information.

If no error has been recorded for the batch, the batch completion record is
written to user space as is.
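
The batch completion record rules above can be sketched as a small
user-space function. This is an illustration under stated assumptions, not
the driver's code: the struct is a hypothetical two-field reduction of a
completion record, the macro and function names are invented, and the status
values 0x1, 0x5 and 0x6 are the ones quoted in the text.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR_STATUS_SUCCESS	0x01	/* completed successfully */
#define CR_STATUS_BATCH_FAIL	0x05	/* one or more operations in batch not successful */
#define CR_STATUS_BATCH_PF	0x06	/* page fault on batch descriptor list address */

/* Hypothetical, trimmed-down completion record: only the fields used here. */
struct batch_cr {
	uint8_t status;
	uint8_t result;
};

/*
 * Adjust a batch completion record before it is written to user space.
 * *batch_error is the error recorded for this batch identifier; it is
 * cleared so the identifier can be reused by the next batch.
 */
static void fixup_batch_cr(struct batch_cr *cr, bool *batch_error)
{
	if (!*batch_error)
		return;		/* no recorded error: write the record as is */

	if (cr->status == CR_STATUS_SUCCESS) {
		cr->status = CR_STATUS_BATCH_FAIL;
		cr->result = 1;
	} else if (cr->status == CR_STATUS_BATCH_PF) {
		cr->result = 1;
	}
	/* Any other status: the record is left unchanged. */

	*batch_error = false;
}
```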

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# c40bd7d9 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: process user page faults for completion record

DSA supports page fault handling through PRS. However, the DMA engine
that's processing the descriptor is blocked until the PRS response is
received. Other workqueues sharing the engine are also blocked.
Page fault handling by the driver with PRS disabled can be used to
mitigate the stalling.

With PRS disabled while ATS remains enabled, DSA handles page faults on
a completion record by reporting an event in the event log. In this
instance, the descriptor is completed and the event log contains the
completion record address and the contents of the completion record. Add
support to the event log handling code to fault in the completion record
and copy the content of the completion record to user memory.

A bitmap is introduced to keep track of discarded event log entries. When
the user process initiates ->release() of the char device, it no longer is
interested in any remaining event log entries tied to the relevant wq and
PASID. The driver will mark the event log entry index in the bitmap. Upon
encountering the entries during processing, the event log handler will just
clear the bitmap bit and skip the entry rather than attempt to process the
event log entry.
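
The discard bitmap described above can be modelled as follows. This is a
user-space sketch with hypothetical names and an assumed log size of 64
entries; locking and the real event log layout are left out.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define EVL_ENTRIES	64	/* assumed log size for the sketch */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS	((EVL_ENTRIES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical model: one bit per event log entry index. */
struct evl_model {
	unsigned long discard[BITMAP_WORDS];
};

/* On release(): mark an entry the closing process no longer cares about. */
static void evl_mark_discard(struct evl_model *evl, size_t idx)
{
	evl->discard[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

/*
 * Called by the log handler for each entry it reaches. Returns true if the
 * entry should be processed. A marked entry has its bit cleared and is
 * skipped, so the slot is clean when the ring wraps around to it again.
 */
static bool evl_should_process(struct evl_model *evl, size_t idx)
{
	unsigned long mask = 1UL << (idx % BITS_PER_WORD);
	unsigned long *word = &evl->discard[idx / BITS_PER_WORD];

	if (*word & mask) {
		*word &= ~mask;
		return false;
	}
	return true;
}
```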

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# b022f597 07-Apr-2023 Fenghua Yu <fenghua.yu@intel.com>

dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling

Define idxd_copy_cr() to copy completion record to fault address in
user address that is found by work queue (wq) and PASID.

It will be used to write the user's completion record that the hardware
device is not able to write due to user completion record page fault.

An xarray is added to associate the PASID and mm with the
struct idxd_user_context so mm can be found by PASID and wq.

It is called when handling the completion record fault in a kernel thread
context. Switch to the mm using kthread_use_mm() and copy the
completion record to the mm via copy_to_user(). Once the copy is
completed, switch back to the current mm using kthread_unuse_mm().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# c2f156bf 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: create kmem cache for event log fault items

Add a kmem cache per device for allocating event log fault context. The
context allows an event log entry to be copied and passed to a software
workqueue to be processed. Because each device can have a differently
sized event log entry depending on device type, it's not possible to have
a global kmem cache.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-8-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 5fbe6503 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmanegine: idxd: add debugfs for event log dump

Add debugfs entry to dump the content of the event log for debugging. The
function will dump all non-zero entries in the event log. It will note
which entries are processed and which entries are still pending processing
at the time of the dump. The entries may not always be in chronological
order because the log is a circular buffer.
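
Why a dump by index is not chronological can be seen in a small model of a
circular log. The sketch assumes the usual ring convention, which is not
spelled out in this log: entries from head up to, but not including, tail
are still pending, and everything else has been processed. The function name
is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if slot i of the ring is still pending, given the current
 * head (oldest unprocessed entry) and tail (next slot to be written).
 */
static bool evl_entry_pending(size_t head, size_t tail, size_t i)
{
	if (head <= tail)
		return i >= head && i < tail;	/* no wrap: one contiguous run */

	return i >= head || i < tail;		/* wrapped: run spans the end */
}
```

With head = 6 and tail = 2 in an 8-entry log, a dump that walks slots 0 to 7
prints pending slots 0 and 1 before the older pending slots 6 and 7.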

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 244da66c 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: setup event log configuration

Add setup of event log feature for supported device. Event log addresses
error reporting that was lacking in gen 1 DSA devices where a second error
event does not get reported when a first event is pending software
handling. The event log allows a circular buffer that the device can push
error events to. It is up to the user to create a large enough event log
ring in order to capture the expected events. The evl size can be set in
the device sysfs attribute. By default 64 entries are supported as minimal
when event log is enabled.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 1649091f 07-Apr-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add event log size sysfs attribute

Add support for changing of the event log size. Event log is a
feature added to DSA 2.0 hardware to improve error reporting.
It supersedes the SWERROR register on DSA 1.0 hardware and is intended
to prevent loss of reported errors.

The event log size determines how many error entries are supported for
the device. It can be configured by the user via a sysfs attribute.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 9f0d99b3 03-Mar-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: expose IAA CAP register via sysfs knob

Add IAA (IAX) capability mask sysfs attribute to expose to applications.
The mask provides application knowledge of what capabilities this IAA
device supports. This mask is available for IAA 2.0 device or later.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230303213732.3357494-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 34ca0066 03-Mar-2023 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: reformat swerror output to standard Linux bitmap output

The SWERROR register consists of four 64-bit registers. Currently the sysfs
attribute just outputs four 64-bit hex integers. Convert the output to use the
%*pb format specifier.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230303213732.3357494-2-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 3c5cc039 07-Mar-2023 Bjorn Helgaas <bhelgaas@google.com>

dmaengine: idxd: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230307192655.874008-3-helgaas@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 84c9ef72 12-Apr-2023 Lu Baolu <baolu.lu@linux.intel.com>

dmaengine: idxd: Add enable/disable device IOPF feature

The iommu subsystem requires that IOMMU_DEV_FEAT_IOPF be enabled before
and disabled after IOMMU_DEV_FEAT_SVA, if the device's I/O page faults rely
on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in this
driver.

At present, missing IOPF enabling/disabling doesn't cause any real issue,
because the IOMMU driver places the IOPF enabling/disabling in the path
of SVA feature handling. But this may change.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230324120234.313643-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# fffaed1e 22-Mar-2023 Jacob Pan <jacob.jun.pan@linux.intel.com>

iommu/ioasid: Rename INVALID_IOASID

INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename
INVALID_IOASID and consolidate since we are moving away from IOASID
infrastructure.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230322200803.869130-7-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 601bdada 27-Jan-2023 Fenghua Yu <fenghua.yu@intel.com>

dmaengine: idxd: Fix default allowed read buffers value in group

Currently the default number of read buffers allowed in a group is 0.
grpcfg will be configured to the max read buffers that IDXD can support if
the group's allowed read buffers value is 0. But 0 is an invalid
read buffers value and users may get confused when seeing the invalid
initial value 0 through the sysfs interface.

To show only valid allowed read buffers value and eliminate confusion,
directly initialize the allowed read buffers to IDXD's max read buffers.
User still can change the value through sysfs interface.

Suggested-by: Ramesh Thomas <ramesh.thomas@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230127192855.966929-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 9735bde3 09-Dec-2022 Fenghua Yu <fenghua.yu@intel.com>

dmaengine: idxd: Set traffic class values in GRPCFG on DSA 2.0

On DSA/IAX 1.0, TC-A and TC-B in GRPCFG are set as 1 to have best
performance and cannot be changed through sysfs knobs unless override
option is given.

The same values should be set on DSA 2.0 as well.

Fixes: ea7c8f598c32 ("dmaengine: idxd: restore traffic class defaults after wq reset")
Fixes: ade8a86b512c ("dmaengine: idxd: Set defaults for GRPCFG traffic class")
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20221209172141.562648-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 942fd543 30-Oct-2022 Lu Baolu <baolu.lu@linux.intel.com>

iommu: Remove SVM_FLAG_SUPERVISOR_MODE support

The current kernel DMA with PASID support is based on the SVA with a flag
SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address
space to a PASID of the device. The device driver programs the device with
kernel virtual address (KVA) for DMA access. There have been security and
functional issues with this approach:

- The lack of IOTLB synchronization upon kernel page table updates.
(vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.)
- Other than slightly more protection, using kernel virtual address (KVA)
has little advantage over physical address. There are also no use
cases yet where DMA engines need kernel virtual addresses for in-kernel
DMA.

This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface.
The device drivers are suggested to handle kernel DMA with PASID through
the kernel DMA APIs.

The drvdata parameter in iommu_sva_bind_device() and all callbacks is not
needed anymore. Clean them up as well.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Link: https://lore.kernel.org/r/20221031005917.45690-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# e8dbd644 30-Sep-2022 Xiaochen Shen <xiaochen.shen@intel.com>

dmaengine: idxd: Fix max batch size for Intel IAA

From the Intel IAA spec [1], Intel IAA does not support batch processing.

Two batch related default values for IAA are incorrect in current code:
(1) The max batch size of device is set during device initialization,
that indicates batch is supported. It should be always 0 on IAA.
(2) The max batch size of work queue is set to WQ_DEFAULT_MAX_BATCH (32)
as the default value regardless of Intel DSA or IAA device during
work queue setup and cleanup. It should be always 0 on IAA.

Fix the issues by setting the max batch size of device and max batch
size of work queue to 0 on IAA device, that means batch is not
supported.

[1]: https://cdrdv2.intel.com/v1/dl/getContent/721858

Fixes: 23084545dbb0 ("dmaengine: idxd: set max_xfer and max_batch for RO device")
Fixes: 92452a72ebdf ("dmaengine: idxd: set defaults for wq configs")
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# b0325aef 17-Sep-2022 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add WQ operation cap restriction support

DSA 2.0 adds the capability of configuring DMA ops on a per workqueue basis.
This means that certain ops can be disabled by the system administrator for
certain wq. By default, all ops are available. A bitmap is used to store
the ops due to total op size of 256 bits and it is more convenient to use a
range list to specify which bits are enabled.

One use case for this is VM migration between different
iterations of devices. The newer ops are disabled in order to allow a guest to
migrate to a host that only supports older ops. Another use is to
restrict the WQ to certain operations for QoS of performance.

An ops_config sysfs attribute is added per wq. It is only usable when the
ops_config bit is set under WQ_CAP register. This means that this attribute
will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range
list for the bits per operation the WQ supports.
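
A range list such as "0-3,16" maps onto the 256-bit operation bitmap as
sketched below. The parser is hand-written for illustration and is not the
kernel helper the driver uses; the function name, the accepted syntax
(decimal numbers, optional "a-b" ranges, comma separators) and the
four-word layout are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define OPCAP_BITS	256
#define OPCAP_WORDS	(OPCAP_BITS / 64)

/*
 * Parse a range list into bits[]. Bit n lands in word n / 64.
 * Returns false on malformed input or a bit number outside 0..255.
 */
static bool parse_ops_ranges(const char *s, uint64_t bits[OPCAP_WORDS])
{
	memset(bits, 0, OPCAP_WORDS * sizeof(uint64_t));

	while (*s) {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return false;
		lo = strtoul(s, &end, 10);
		hi = lo;

		if (*end == '-') {		/* "lo-hi" range */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return false;
			hi = strtoul(s, &end, 10);
		}

		if (lo > hi || hi >= OPCAP_BITS)
			return false;

		for (unsigned long b = lo; b <= hi; b++)
			bits[b / 64] |= UINT64_C(1) << (b % 64);

		if (*end == ',')
			end++;			/* move on to the next item */
		else if (*end != '\0')
			return false;
		s = end;
	}

	return true;
}
```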

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# a8563a33 17-Sep-2022 Dave Jiang <dave.jiang@intel.com>

dmanegine: idxd: reformat opcap output to match bitmap_parse() input

To make input and output consistent and prepping for the per WQ operation
configuration support, change the output of opcap display to match the
input that is expected by bitmap_parse() helper function. The output will
be a bitmap with field width as the number of bits using the %*pb format
specifier for printk() family.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# de5819b9 28-Sep-2022 Jerry Snitselaar <jsnitsel@redhat.com>

dmaengine: idxd: track enabled workqueues in bitmap

Now that idxd_wq_disable_cleanup() sets the workqueue state to
IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been
enabled. This will then be used to determine which workqueues
should be re-enabled when attempting a software reset to recover
from a device halt state.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 8ffccd11 25-Jun-2022 Jerry Snitselaar <jsnitsel@redhat.com>

dmaengine: idxd: Only call idxd_enable_system_pasid() if succeeded in enabling SVA feature

On a Sapphire Rapids system, if booted without intel_iommu=on, the IDXD
driver will crash during probe in iommu_sva_bind_device().

[ 21.423729] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 21.445108] #PF: supervisor read access in kernel mode
[ 21.450912] #PF: error_code(0x0000) - not-present page
[ 21.456706] PGD 0
[ 21.459047] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 21.464004] CPU: 0 PID: 1420 Comm: kworker/0:3 Not tainted 5.19.0-0.rc3.27.eln120.x86_64 #1
[ 21.464011] Hardware name: Intel Corporation EAGLESTREAM/EAGLESTREAM, BIOS EGSDCRB1.SYS.0067.D12.2110190954 10/19/2021
[ 21.464015] Workqueue: events work_for_cpu_fn
[ 21.464030] RIP: 0010:iommu_sva_bind_device+0x1d/0xe0
[ 21.464046] Code: c3 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55 41 54 55 53 48 83 ec 08 48 8b 87 d8 02 00 00 <48> 8b 40 38 48 8b 50 10 48 83 7a 70 00 48 89 14 24 0f 84 91 00 00
[ 21.464050] RSP: 0018:ff7245d9096b7db8 EFLAGS: 00010296
[ 21.464054] RAX: 0000000000000000 RBX: ff1eadeec8a51000 RCX: 0000000000000000
[ 21.464058] RDX: ff7245d9096b7e24 RSI: 0000000000000000 RDI: ff1eadeec8a510d0
[ 21.464060] RBP: ff1eadeec8a51000 R08: ffffffffb1a12300 R09: ff1eadffbfce25b4
[ 21.464062] R10: ffffffffffffffff R11: 0000000000000038 R12: ffffffffc09f8000
[ 21.464065] R13: ff1eadeec8a510d0 R14: ff7245d9096b7e24 R15: ff1eaddf54429000
[ 21.464067] FS: 0000000000000000(0000) GS:ff1eadee7f600000(0000) knlGS:0000000000000000
[ 21.464070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.464072] CR2: 0000000000000038 CR3: 00000008c0e10006 CR4: 0000000000771ef0
[ 21.464074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.464076] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 21.464078] PKRU: 55555554
[ 21.464079] Call Trace:
[ 21.464083] <TASK>
[ 21.464092] idxd_pci_probe+0x259/0x1070 [idxd]
[ 21.464121] local_pci_probe+0x3e/0x80
[ 21.464132] work_for_cpu_fn+0x13/0x20
[ 21.464136] process_one_work+0x1c4/0x380
[ 21.464143] worker_thread+0x1ab/0x380
[ 21.464147] ? _raw_spin_lock_irqsave+0x23/0x50
[ 21.464158] ? process_one_work+0x380/0x380
[ 21.464161] kthread+0xe6/0x110
[ 21.464168] ? kthread_complete_and_exit+0x20/0x20
[ 21.464172] ret_from_fork+0x1f/0x30

iommu_sva_bind_device() requires that SVA has been enabled successfully on
the IDXD device before it's called. Otherwise, iommu_sva_bind_device()
will access a NULL pointer. If Intel IOMMU is disabled, SVA cannot be
enabled and thus idxd_enable_system_pasid() and iommu_sva_bind_device()
should not be called.

Fixes: 42a1b73852c4 ("dmaengine: idxd: Separate user and kernel pasid enabling")
Cc: Vinod Koul <vkoul@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/dmaengine/20220623170232.6whonfjuh3m5vcoy@cantor/
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220626051648.14249-1-jsnitsel@redhat.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 42a1b738 11-May-2022 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Separate user and kernel pasid enabling

The idxd driver always gated the pasid enabling under a single knob and
this assumption is incorrect. The pasid used for kernel operation can be
independently toggled and has no dependency on the user pasid (and vice
versa). Split the two so they are independent "enabled" flags.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165231431746.986466.5666862038354800551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# b6f2f035 15-Jan-2022 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

dmaengine: idxd: Remove useless DMA-32 fallback configuration

As stated in [1], dma_set_mask() with a 64-bit mask never fails if
dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.

Simplify code and remove some dead code accordingly.

[1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/009c80294dba72858cd8a6ed2ed81041df1b1e82.1642231430.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 7ed6f1b8 14-Dec-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: change bandwidth token to read buffers

DSA spec v1.2 has changed the term of "bandwidth tokens" to "read buffers"
in order to make the concept clearer. Deprecate bandwidth token
naming in the driver and convert to read buffers in order to match with
the spec and reduce confusion when reading the spec.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163951338932.2988321.6162640806935567317.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 403a2e23 13-Dec-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: change MSIX allocation based on per wq activation

Change the driver so that the WQ interrupt is requested only when the wq is
being enabled. This new scheme sets things up so that request_threaded_irq() is
only called when a kernel wq type is being enabled. This also sets up for
future interrupt request where different interrupt handler such as wq
occupancy interrupt can be setup instead of the wq completion interrupt.

Not calling request_irq() until the WQ actually needs an irq also prevents
wasting of CPU irq vectors on x86 systems, which is a limited resource.

idxd_flush_pending_descs() is moved to device.c since descriptor flushing
is now part of wq disable rather than shutdown().

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 23a50c80 13-Dec-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix descriptor flushing locking

The descriptor flushing for shutdown is not holding the irq_entry list
lock. If there's ongoing interrupt completion handling, this can corrupt
the list. Add locking to protect list walking. Also refactor the code so
it's more compact.

Fixes: 8f47d1a5e545 ("dmaengine: idxd: connect idxd to dmaengine subsystem")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942148935.2412839.18282664745572777280.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# ec0d6423 13-Dec-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: embed irq_entry in idxd_wq struct

With irq_entry already being associated with the wq in a 1:1 relationship,
embed the irq_entry in the idxd_wq struct and remove back pointers for
idxd_wq and idxd_device. In the process of this work, clean up the interrupt
handle assignment so that there's no decision to be made during submit
call on where interrupt handle value comes from. Set the interrupt handle
during irq request initialization time.

irq_entry 0 is designated as special and is tied to the device itself.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 7930d855 29-Nov-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add knob for enqcmds retries

Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS
descriptor submission. On the host, ENQCMDS is unlikely to return busy
during normal operation because the driver controls the number of
descriptors allocated for submission. However, when the driver is
operating as a guest driver, the chance of retry goes up significantly due
to sharing a wq with multiple VMs. A default value is provided, and the
system admin can tune the value on a per-WQ basis.
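
The bounded-retry idea above can be sketched in userspace C. This is a
minimal model, not the driver code: struct fake_wq, fake_enqcmds() and
submit_with_retries() are hypothetical names, and the real ENQCMDS
instruction and sysfs attribute are not modeled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a shared wq that reports busy for a while. */
struct fake_wq {
	unsigned int busy_attempts;   /* attempts that will report "retry" */
	unsigned int enqcmds_retries; /* per-wq tunable, like the sysfs knob */
};

/* Stand-in for ENQCMDS: returns true when the device asks for a retry. */
static bool fake_enqcmds(struct fake_wq *wq)
{
	if (wq->busy_attempts > 0) {
		wq->busy_attempts--;
		return true;
	}
	return false;
}

/* Try once, then retry up to wq->enqcmds_retries more times. */
static int submit_with_retries(struct fake_wq *wq)
{
	unsigned int retries = 0;

	do {
		if (!fake_enqcmds(wq))
			return 0;
	} while (retries++ < wq->enqcmds_retries);

	return -EAGAIN;
}
```

In this sketch a retry count of N allows N + 1 attempts in total, so a
guest sharing a busy wq would want a larger value than a host would.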

Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 92452a72 26-Oct-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: set defaults for wq configs

Add default values for wq size, max_xfer_size and max_batch_size. These
values should provide a general guidance for the wq configuration when
the user does not specify any specific values.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528473483.3926048.7950067926287180976.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 56fc39f5 26-Oct-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: handle interrupt handle revoked event

"Interrupt handle revoked" is an event that happens when the driver is
running on a guest kernel and the VM is migrated to a new machine.
The device will trigger an interrupt that signals to the guest driver
that the interrupt handles need to be replaced.

The misc irq thread function calls a helper function to handle the
event. The function uses the WQ percpu_ref to quiesce the kernel
submissions. It then replaces the interrupt handles by requesting
interrupt handle command for each I/O MSIX vector. Once the handle is
updated, the driver will unblock the submission path to allow new
submissions.

The submitter will attempt to acquire a percpu_ref before submission. When
the request fails, it will wait on the wq_resurrect 'completion'.

The driver does anticipate the possibility of descriptors being submitted
before the WQ percpu_ref is killed. If a descriptor has already been
submitted, it will return with incorrect interrupt handle status. The
descriptor will be re-submitted with the new interrupt handle on the
completion path. For descriptors with incorrect interrupt handles,
completion interrupt won't be triggered.

At the completion of the interrupt handle refresh, the handling function
will call idxd_int_handle_refresh_drain() to issue drain descriptors to
each wq with an associated interrupt handle. The drain descriptor will have
the interrupt request set but no completion record. This ensures that all
descriptors with an incorrect interrupt handle get drained and
a completion interrupt is triggered for the guest driver to process them.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 8b67426e 26-Oct-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: int handle management refactoring

Attach int_handle to irq_entry. This removes the separate management of int
handles and avoids the confusion of iterating through int handles whose
index is off by one.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528417065.3925689.11505755433684476288.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 5d78abb6 26-Oct-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: rework descriptor free path on failure

Refactor the completion function to allow skipping of descriptor freeing on
the submission failure path. This completely removes descriptor freeing
from the submit failure path and leaves the responsibility to the caller.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528416222.3925689.12859769271667814762.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 98da0106 21-Sep-2021 Dave Jiang <dave.jiang@intel.com>

dmanegine: idxd: fix resource free ordering on driver removal

A fault triggers on ioread32() when pci driver unbind is invoked. The
placement of idxd sub-driver removal causes the device mmio region to be
accessed after the mmio mapping has been torn down. The driver needs the
sub-drivers to be unbound but must not release the idxd context until all
shutdown activities have been done. Move the sub-driver unregistering up
to before remove() calls shutdown(). But take a device ref on the
idxd->conf_dev so that the memory does not get freed in ->release(). When
all cleanup activities have been done, release the ref to allow the idxd
memory to be freed.
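
The ordering described above can be modeled with a plain counter. This is
a sketch under stated assumptions: struct sketch_idxd and every sketch_*()
helper are hypothetical, and a simple int stands in for the real device
reference count.

```c
#include <stdbool.h>

/* Hypothetical model of the idxd context and its conf_dev reference. */
struct sketch_idxd {
	int refs;         /* models the conf_dev reference count */
	bool released;    /* set by the ->release() stand-in */
	bool mmio_mapped; /* true while the mmio mapping exists */
	bool shutdown_ok; /* did shutdown see a live, mapped device? */
};

static void sketch_get(struct sketch_idxd *idxd)
{
	idxd->refs++;
}

static void sketch_put(struct sketch_idxd *idxd)
{
	if (--idxd->refs == 0)
		idxd->released = true; /* ->release() would free memory here */
}

/* Shutdown touches mmio, so it needs the mapping and the context. */
static void sketch_shutdown(struct sketch_idxd *idxd)
{
	idxd->shutdown_ok = idxd->mmio_mapped && !idxd->released;
}

static void sketch_remove(struct sketch_idxd *idxd)
{
	sketch_get(idxd);          /* hold the context across teardown */
	sketch_put(idxd);          /* unregister drops the registration ref */
	sketch_shutdown(idxd);     /* still safe: mapped and not released */
	idxd->mmio_mapped = false; /* tear down the mapping afterwards */
	sketch_put(idxd);          /* final put lets ->release() run */
}
```

Without the extra sketch_get() the unregister step would drop the last
reference, and shutdown would run against a released context.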

[57159.542766] RIP: 0010:ioread32+0x27/0x60
[57159.547097] Code: 00 66 90 48 81 ff ff ff 03 00 77 1e 48 81 ff 00 00 01 00 76 05 0f b7 d7 ed c3 8b 15 03 50 41 01 b8 ff ff ff ff 85 d2 75 04 c3 <8b> 07 c3 55 83 ea 01 48 89 fe 48 c7 c7 00 70 5f 82 48 89 e5 48 83
[57159.566647] RSP: 0018:ffffc900011abb60 EFLAGS: 00010292
[57159.572295] RAX: ffffc900011e0000 RBX: ffff888107d39800 RCX: 0000000000000000
[57159.579842] RDX: 0000000000000000 RSI: ffffffff82b1e448 RDI: ffffc900011e0090
[57159.587421] RBP: ffffc900011abb88 R08: 0000000000000000 R09: 0000000000000001
[57159.594972] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881019840d0
[57159.602533] R13: ffff8881097e9000 R14: ffffffffa08542a0 R15: 00000000000003a8
[57159.610093] FS: 00007f991e0a8740(0000) GS:ffff888459900000(0000) knlGS:0000000000000000
[57159.618614] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[57159.624814] CR2: ffffc900011e0090 CR3: 000000010862a002 CR4: 00000000003706e0
[57159.632397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[57159.639973] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[57159.647601] Call Trace:
[57159.650502] ? idxd_device_disable+0x41/0x110 [idxd]
[57159.655948] idxd_device_drv_remove+0x2b/0x80 [idxd]
[57159.661374] idxd_config_bus_remove+0x16/0x20
[57159.666191] __device_release_driver+0x163/0x240
[57159.671320] device_release_driver+0x2b/0x40
[57159.676052] bus_remove_device+0xf5/0x160
[57159.680524] device_del+0x19c/0x400
[57159.684440] device_unregister+0x18/0x60
[57159.688792] idxd_remove+0x140/0x1c0 [idxd]
[57159.693406] pci_device_remove+0x3e/0xb0
[57159.697758] __device_release_driver+0x163/0x240
[57159.702788] device_driver_detach+0x43/0xb0
[57159.707424] unbind_store+0x11e/0x130
[57159.711537] drv_attr_store+0x24/0x30
[57159.715646] sysfs_kf_write+0x4b/0x60
[57159.719710] kernfs_fop_write_iter+0x153/0x1e0
[57159.724563] new_sync_write+0x120/0x1b0
[57159.728812] vfs_write+0x23e/0x350
[57159.732624] ksys_write+0x70/0xf0
[57159.736335] __x64_sys_write+0x1a/0x20
[57159.740492] do_syscall_64+0x3b/0x90
[57159.744465] entry_SYSCALL_64_after_hwframe+0x44/0xae
[57159.749908] RIP: 0033:0x7f991e19c387
[57159.753898] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[57159.773564] RSP: 002b:00007ffc2ce2d6a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[57159.781550] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f991e19c387
[57159.789133] RDX: 000000000000000c RSI: 000055ee2630e140 RDI: 0000000000000001
[57159.796695] RBP: 000055ee2630e140 R08: 0000000000000000 R09: 00007f991e2324e0
[57159.804246] R10: 00007f991e2323e0 R11: 0000000000000246 R12: 000000000000000c
[57159.811800] R13: 00007f991e26f520 R14: 000000000000000c R15: 00007f991e26f700
[57159.819373] Modules linked in: idxd bridge stp llc bnep sunrpc nls_iso8859_1 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_realtek iTCO_wdt 8250_dw snd_hda_codec_generic kvm_intel ledtrig_audio iTCO_vendor_support snd_hda_intel snd_intel_dspcfg ppdev kvm snd_hda_codec intel_wmi_thunderbolt snd_hwdep irqbypass iwlwifi btusb snd_hda_core rapl btrtl intel_cstate snd_seq btbcm snd_seq_device btintel snd_pcm cfg80211 bluetooth pcspkr psmouse input_leds snd_timer intel_lpss_pci mei_me intel_lpss snd ecdh_generic ecc mei ucsi_acpi i2c_i801 idma64 i2c_smbus virt_dma soundcore typec_ucsi typec wmi parport_pc parport video mac_hid acpi_pad sch_fq_codel drm ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel usbkbd hid_generic usbmouse aesni_intel usbhid crypto_simd cryptd e1000e hid serio_raw ahci libahci pinctrl_sunrisepoint fuse msr autofs4 [last unloaded: idxd]
[57159.904082] CR2: ffffc900011e0090
[57159.907877] ---[ end trace b4e32f49ce9176a4 ]---

Fixes: 49c4959f04b5 ("dmaengine: idxd: fix sequence for pci driver remove() and shutdown()")
Reported-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163225535868.4152687.9318737776682088722.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# ade8a86b 20-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Set defaults for GRPCFG traffic class

Set GRPCFG traffic class to value of 1 for best performance on current
generation of accelerators. Also add override option to allow experimentation.
Sysfs knobs are disabled for DSA/IAX gen1 devices.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373005.1968485.3761065664382799202.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 6e7f3ee9 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: move dsa_drv support to compatible mode

The original architecture of /sys/bus/dsa invented a scheme whereby
a single entry in the list of bus drivers, /sys/bus/drivers/dsa,
handled all device types and internally routed them to different
drivers. Those internal drivers were invisible to userspace.

With the idxd driver transitioned to a proper bus device-driver model,
the legacy behavior needs to be preserved due to it being exposed to
user space via sysfs. Create a compat driver to provide the legacy
behavior for /sys/bus/dsa/drivers/dsa. This should satisfy the user
tool accel-config v3.2 or earlier, where this behavior is expected.
If the distro has a newer accel-config then the legacy mode does
not need to be enabled.

When the compat driver binds the device (e.g. dsa0) to the dsa driver,
it will be bound to the new idxd_drv. The wq device (e.g. wq0.0) will
be bound to either the dmaengine_drv or the user_drv. The dsa_drv
becomes a routing mechanism for the new drivers. It will not support
additional external drivers that are implemented later.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637468705.744545.4399080971745974435.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# d9e5481f 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: dsa: move dsa_bus_type out of idxd driver to standalone

In preparation for dsa_drv compat support to be built-in, move the bus
code to its own compilation unit. A follow-on patch adds the compat
implementation. Recall that the compat implementation allows for the
deprecated / omnibus dsa_drv binding scheme rather than the idiomatic
organization of a full fledged bus driver per driver type.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637468142.744545.2811632736881720857.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 448c3de8 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: create user driver for wq 'device'

The original architecture of /sys/bus/dsa invented a scheme whereby a
single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled
all device types and internally routed them to different drivers.
Those internal drivers were invisible to userspace. Now, as
/sys/bus/dsa wants to grow support for alternate drivers for a given
device, for example vfio-mdev instead of kernel-internal-dmaengine, a
proper bus device-driver model is needed. The first step in that process
is separating the existing omnibus/implicit "dsa" driver into proper
individual drivers registered on /sys/bus/dsa. Establish the
idxd_user_drv driver that controls the enabling and disabling of the
wq and also register and unregister a char device to allow user space
to mmap the descriptor submission portal.

The cdev related bits are moved to the cdev driver probe/remove and out of
the drv_enable/disable_wq() calls. These bits are exclusive to the cdev
operation and not part of the generic enable/disable of the wq device.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637467578.744545.10203997610072341376.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 0cda4f69 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: create dmaengine driver for wq 'device'

The original architecture of /sys/bus/dsa invented a scheme whereby a
single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled
all device types and internally routed them to different drivers.
Those internal drivers were invisible to userspace. Now, as
/sys/bus/dsa wants to grow support for alternate drivers for a given
device, for example vfio-mdev instead of kernel-internal-dmaengine, a
proper bus device-driver model is needed. The first step in that process
is separating the existing omnibus/implicit "dsa" driver into proper
individual drivers registered on /sys/bus/dsa. Establish the
idxd_dmaengine_drv driver that controls the enabling and disabling of the
wq and also register and unregister the dma channel.

idxd_wq_alloc_resources() and idxd_wq_free_resources() also get moved to
the dmaengine driver. The resources (dma descriptors allocation and setup)
are only used by the dmaengine driver and should only happen when it loads.

The char dev driver (cdev) related bits are left in the __drv_enable_wq()
and __drv_disable_wq() calls to be moved when we split out the char dev
driver just like how the dmaengine driver is split out.

WQ autoload support is not expected currently. With the amount of
configuration needed for the device, the wq is always expected to
be enabled by a tool (or via sysfs) rather than auto enabled at driver
load.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637467033.744545.12330636655625405394.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 034b3290 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: create idxd_device sub-driver

The original architecture of /sys/bus/dsa invented a scheme whereby a
single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled
all device types and internally routed them to different drivers.
Those internal drivers were invisible to userspace. Now, as
/sys/bus/dsa wants to grow support for alternate drivers for a given
device, for example vfio-mdev instead of kernel-internal-dmaengine, a
proper bus device-driver model is needed. The first step in that process
is separating the existing omnibus/implicit "dsa" driver into proper
individual drivers registered on /sys/bus/dsa. Establish the idxd_drv
driver that controls the enabling and disabling of the accelerator device.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637466439.744545.15210886092627144577.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 5fee6567 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add type to driver in order to allow device matching

Add an array of supported device types to the idxd_device_driver
definition in order to enable simple matching of device type to a
given driver. The deprecated / omnibus dsa_drv driver specifies
IDXD_DEV_NONE as its only role is to service legacy userspace (old
accel-config) directed bind requests and route them to the proper
driver. It need not attach to a device when the bus is autoprobed. The
accel-config tooling is being updated to drop its dependency on this
deprecated bind scheme.
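
A minimal userspace sketch of matching a device type against a driver's
type array. Only IDXD_DEV_NONE is named above; the other enum values, the
sentinel-terminated layout, struct sketch_driver and sketch_match() are
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative device types; values other than NONE are assumed. */
enum idxd_dev_type {
	IDXD_DEV_NONE = -1,
	IDXD_DEV_DSA,
	IDXD_DEV_IAX,
	IDXD_DEV_WQ,
};

/* Hypothetical driver record: a name plus the types it accepts. */
struct sketch_driver {
	const char *name;
	const enum idxd_dev_type *type; /* terminated by IDXD_DEV_NONE */
};

/* Walk the type array until the sentinel, looking for dev_type. */
static bool sketch_match(const struct sketch_driver *drv,
			 enum idxd_dev_type dev_type)
{
	for (size_t i = 0; drv->type[i] != IDXD_DEV_NONE; i++) {
		if (drv->type[i] == dev_type)
			return true;
	}
	return false;
}

static const enum idxd_dev_type device_types[] = {
	IDXD_DEV_DSA, IDXD_DEV_IAX, IDXD_DEV_NONE,
};

/* The compat driver lists only the sentinel, so it never auto-matches. */
static const enum idxd_dev_type compat_types[] = { IDXD_DEV_NONE };

static const struct sketch_driver device_drv = { "idxd", device_types };
static const struct sketch_driver compat_drv = { "dsa", compat_types };
```

Because the compat entry holds nothing but the sentinel, an autoprobe of
the bus never attaches it, which mirrors the behavior described above.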

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637465882.744545.17456174666211577867.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# c05257b5 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmanegine: idxd: open code the dsa_drv registration

Don't need a wrapper to register the driver. Just do it directly.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637465319.744545.16325178432532362906.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# f52058ae 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: remove IDXD_DEV_CONF_READY

The IDXD_DEV_CONF_READY state flag is no longer needed. The current
implementation uses this flag to stop the device from doing
configuration until the pci driver probe has completed. With the
driver architecture going towards multiple sub-drivers attached to
the dsa_bus, this is no longer feasible. The sub-drivers will be
allowed to probe and return with failure when they are not ready
to complete the probe rather than using a state flag to gate the
probing.

There is no expectation that the devices auto-attach to a driver.
Userspace configuration is expected to setup the device before
enabling.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637460633.744545.8902095097471365420.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 700af3a0 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev

Add a 'struct idxd_dev' that wraps the 'struct device' for idxd conf_dev
that registers with the dsa bus. This is introduced in order to deal with
multiple different types of 'devices' that are registered on the dsa_bus
when the compat driver needs to route them to the correct driver to attach.
The bind() call now can determine the type of device and then do the
appropriate driver matching.
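
The wrapper idea can be sketched in userspace C. Everything below is
illustrative: struct device is a placeholder for the kernel structure,
container_of() is re-implemented locally, and the sketch_* names and
field layout are assumptions rather than the driver's definitions.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder for the kernel's 'struct device'. */
struct device {
	const char *name;
};

enum sketch_dev_type { SKETCH_DEV_DSA, SKETCH_DEV_WQ };

/* Wrapper: the generic device plus a type tag for routing. */
struct sketch_idxd_dev {
	struct device conf_dev;
	enum sketch_dev_type type;
};

/* A wq embeds the wrapper instead of a bare 'struct device'. */
struct sketch_wq {
	struct sketch_idxd_dev idxd_dev;
	int id;
};

/* bind() only sees a 'struct device'; recover the type from the wrapper. */
static enum sketch_dev_type dev_type_of(struct device *dev)
{
	return container_of(dev, struct sketch_idxd_dev, conf_dev)->type;
}

/* Once the type is known, step out to the enclosing object. */
static struct sketch_wq *dev_to_wq(struct device *dev)
{
	struct sketch_idxd_dev *idxd_dev =
		container_of(dev, struct sketch_idxd_dev, conf_dev);

	return container_of(idxd_dev, struct sketch_wq, idxd_dev);
}
```

The type tag must be checked before calling dev_to_wq(), since the cast
is only valid when the device really is embedded in a wq.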

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637460065.744545.584492831446090984.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# da5a11d7 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add driver name

Add name field in idxd_device_driver so we don't have to touch the
'struct device_driver' during declaration.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637459517.744545.7572915135318813722.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 3ecfc913 15-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add driver register helper

Add helper functions for dsa-driver registration similar to other
bus-types. In particular, do not require dsa-drivers to open-code the
bus, owner, and mod_name fields. Let registration and unregistration
operate on the 'struct idxd_device_driver' instead of the raw /
embedded 'struct device_driver'.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637458949.744545.14996726325385482050.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 49c4959f 14-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix sequence for pci driver remove() and shutdown()

The ->shutdown() call should only be responsible for quiescing the device.
Currently it also does PCI device teardown. This causes issues when, for
example, the MMIO mapping is removed and idxd_unregister_devices() then
triggers removal of the idxd device sub-driver, which still initiates MMIO
writes to the device. Another issue is that unregistering the idxd
'struct device' frees the memory context, so the teardown calls access
freed memory and can cause a kernel oops. Move all the teardown bits that
don't belong in shutdown to the ->remove() call. Move unregistering of the
idxd conf_dev 'struct device' to after all the teardown is done, to free
the memory that's no longer needed.

Fixes: 47c16ac27d4c ("dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162629983901.395844.17964803190905549615.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 53b50458 07-Jul-2021 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

dmaengine: idxd: Simplify code and axe the use of a deprecated API

The wrappers in include/linux/pci-dma-compat.h should go away.

Replace 'pci_set_dma_mask/pci_set_consistent_dma_mask' by an equivalent
and less verbose 'dma_set_mask_and_coherent()' call.

Even if the code may look different, it should have exactly the same
run-time behavior.
If pci_set_dma_mask(64) fails and pci_set_dma_mask(32) succeeds, then
pci_set_consistent_dma_mask(64) will also fail.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/70c8a3bc67e41c5fefb526ecd64c5174c1e2dc76.1625720835.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 7eb25da1 14-Jul-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix sequence for pci driver remove() and shutdown()

The ->shutdown() call should only be responsible for quiescing the device.
Currently it also does PCI device teardown. This causes issues when, for
example, the MMIO mapping is removed and idxd_unregister_devices() then
triggers removal of the idxd device sub-driver, which still initiates MMIO
writes to the device. Another issue is that unregistering the idxd
'struct device' frees the memory context, so the teardown calls access
freed memory and can cause a kernel oops. Move all the teardown bits that
don't belong in shutdown to the ->remove() call. Move unregistering of the
idxd conf_dev 'struct device' to after all the teardown is done, to free
the memory that's no longer needed.

Fixes: 47c16ac27d4c ("dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162629983901.395844.17964803190905549615.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# d5c10e0f 24-Jun-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix setup sequence for MSIXPERM table

The MSIX permission table should be programmed BEFORE request_irq()
happens. This prevents any possibility of an interrupt happening before the
MSIX perm table is setup, however slight.

Fixes: 6df0e6c57dfc ("dmaengine: idxd: clear MSIX permission entry on shutdown")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162456741222.1138073.1298447364671237896.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 33f9f3c3 09-May-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: remove devm allocation for idxd->int_handles

Allocation of idxd->int_handles was merged incorrectly for the 5.13 merge
window. The devm_kcalloc should've been regular kcalloc due to devm_*
removal series for the driver.

Fixes: eb15e7154fbf ("dmaengine: idxd: add interrupt handle request and release support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162060710518.130816.11349798049329202863.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# ddf742d4 25-May-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Add missing cleanup for early error out in probe call

The probe call stack is missing some cleanup when things fail in the
middle. Add the appropriate cleanup routines to make sure we exit
gracefully.

Fixes: a39c7cd0438e ("dmaengine: idxd: removal of pcim managed mmio mapping")
Reported-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162197061707.392656.15760573520817310791.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 077cdb35 26-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add missing dsa driver unregister

The idxd_unregister_driver() has never been called for the idxd driver upon
removal. Add fix to call unregister driver on module removal.

Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161947994449.1053102.13189942817915448216.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 1c4841cc 26-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add engine 'struct device' missing bus type assignment

engine 'struct device' setup is missing assigning the bus type. Add it to
dsa_bus_type.

Fixes: 75b911309060 ("dmaengine: idxd: fix engine conf_dev lifetime")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161947841562.984844.17505646725993659651.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 74b2fc88 01-Jun-2021 Borislav Petkov <bp@suse.de>

dmaengine: idxd: Use cpu_feature_enabled()

When testing x86 feature bits, use cpu_feature_enabled() so that
build-disabled features can remain off, regardless of what CPUID says.

Fixes: 8e50d392652f ("dmaengine: idxd: Add shared workqueue support")
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Vinod Koul <vkoul@kernel.org>
Cc: <stable@vger.kernel.org>


# 0bde4444 24-Apr-2021 Tom Zanussi <tom.zanussi@linux.intel.com>

dmaengine: idxd: Enable IDXD performance monitor support

Add the code needed in the main IDXD driver to interface with the IDXD
perfmon implementation.

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/a5564a5583911565d31c2af9234218c5166c4b2c.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# a1610461 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: remove MSIX masking for interrupt handlers

Remove interrupt masking and just let the hard irq handler keep
firing for new events. This has less of a performance impact than
the MMIO readback inside pci_msi_{mask,unmask}_irq(). Especially
on a loaded system, those flushes can get stuck behind large amounts
of MMIO writes. When the guest kernel is running on top of VFIO
mdev, mask/unmask causes a vmexit each time, which is not desirable.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161894523436.3210025.1834640110556139277.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 53b2ee7f 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: device cmd should use dedicated lock

Create a dedicated lock for device command operations. Put the device
command operation under finer grained locking instead of using the
idxd->dev_lock.

Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894525685.3210132.16160045731436382560.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 5b0c68c4 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: support reporting of halt interrupt

Unmask the halt error interrupt so it gets reported to the interrupt
handler. When halt state interrupt is received, quiesce the kernel
WQs and unmap the portals to stop submission.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894441167.3202472.9485946398140619501.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# cf5f86a7 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: enable SVA feature for IOMMU

Enable IOMMU_DEV_FEAT_SVA before attempting to bind pasid. This is needed
according to the iommu_sva_bind_device() comment. Currently the Intel IOMMU
code does this before the bind call, but it really needs to be controlled
by the driver.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894440621.3202472.17644507396206848134.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# eb15e715 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add interrupt handle request and release support

DSA spec states that when Request Interrupt Handle and Release Interrupt
Handle command bits are set in the CMDCAP register, these device commands
must be supported by the driver.

The interrupt handle is programmed in a descriptor. When Request Interrupt
Handle is not supported, the interrupt handle is the index of the desired
entry in the MSI-X table. When the command is supported, driver must use
the command to obtain a handle to be programmed in the submitted
descriptor.

A requested handle may be revoked. After the handle is revoked, any use of
the handle will result in Invalid Interrupt Handle error.
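
The handle selection rule above can be sketched as follows. The names
pick_int_handle(), request_handle_fn and fake_request() are hypothetical,
and the callback merely stands in for the Request Interrupt Handle device
command; the handle value it returns is invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Request Interrupt Handle command. */
typedef int (*request_handle_fn)(unsigned int msix_idx, uint32_t *handle);

/*
 * Without command support the handle is simply the MSI-X table index.
 * With command support the device must be asked for a handle.
 */
static int pick_int_handle(bool cmd_supported, unsigned int msix_idx,
			   request_handle_fn request, uint32_t *handle)
{
	if (!cmd_supported) {
		*handle = msix_idx;
		return 0;
	}
	return request(msix_idx, handle);
}

/* Fake device: hands out an arbitrary handle unrelated to the index. */
static int fake_request(unsigned int msix_idx, uint32_t *handle)
{
	*handle = 0x100u + msix_idx;
	return 0;
}
```

Since a requested handle may later be revoked, a caller would want to
treat the returned value as opaque rather than derive it from the index.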

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894439422.3202472.17579543737810265471.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 8c66bbdc 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add support for readonly config mode

The read-only configuration mode is defined by the DSA spec as a mode of
the device WQ configuration. When GENCAP register bit 31 is set to 0,
the device is in RO mode and group configuration and some fields of the
workqueue configuration registers are read-only and reflect the fixed
configuration of the device. Add support for RO mode. The driver will
load the values from the registers directly and set up all the internally
cached data structures based on the device configuration.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894438847.3202472.6317563824045432727.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
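
Illustration (not part of the commit): a small userspace sketch of the mode
check described above. Only the meaning of GENCAP bit 31 comes from the commit
text; the macro name, struct wq_state, and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GENCAP bit 31 per the commit text: 0 = read-only, 1 = configurable. */
#define GENCAP_CONFIG_EN (UINT64_C(1) << 31)

/* Hypothetical cached state for one workqueue. */
struct wq_state {
    uint32_t size;
    bool     loaded_from_hw;
};

bool device_is_configurable(uint64_t gencap)
{
    return (gencap & GENCAP_CONFIG_EN) != 0;
}

/*
 * In RO mode the cached value mirrors what the hardware reports;
 * otherwise the software-chosen value is kept for later programming.
 */
void wq_state_setup(struct wq_state *wq, uint64_t gencap,
                    uint32_t hw_size, uint32_t sw_size)
{
    assert(wq != NULL);
    if (device_is_configurable(gencap)) {
        wq->size = sw_size;
        wq->loaded_from_hw = false;
    } else {
        wq->size = hw_size;
        wq->loaded_from_hw = true;
    }
}
```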


# 93a40a6d 20-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add percpu_ref to descriptor submission path

The current submission path has no way to stop a submitter from
submitting on the shutdown path or the wq disable path. This provides a way to
quiesce the submission path.

Modeled after the 'struct request_queue' usage of percpu_ref. One of the
abilities of per-CPU reference counting is the ability to stop new
references from being taken while awaiting outstanding references to be
dropped. On wq shutdown, we want to block any new submissions to the kernel
workqueue and quiesce before disabling. The percpu_ref allows us to block
any new submissions and wait for any current submission calls to finish
submitting to the workqueue.

A percpu_ref is embedded in each idxd_wq context to allow control for
individual wq. The wq->wq_active counter is elevated before calling
movdir64b() or enqcmds() to submit a descriptor to the wq and dropped once
the submission call completes. The function is gated by
percpu_ref_tryget_live(). On shutdown, once percpu_ref_kill() has been called, any
new submission is blocked from acquiring a ref and fails. Once all
references are dropped for the wq, shutdown can continue.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894438293.3202472.14894701611500822232.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
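
Illustration (not part of the commit): a userspace model of the gating
behaviour described above. It swaps the kernel's percpu_ref for a single C11
atomic word, so it shows the semantics (tryget fails after kill, shutdown
waits for the count to drain) but not the per-CPU scalability. All names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REF_DEAD (1u << 31)  /* bit 31: killed; bits 0..30: reference count */

struct wq_ref {
    atomic_uint state;
};

void wq_ref_init(struct wq_ref *r)
{
    atomic_init(&r->state, 0);
}

/* Take a reference unless the ref has been killed. */
bool wq_ref_tryget_live(struct wq_ref *r)
{
    unsigned int old = atomic_load(&r->state);

    do {
        if (old & REF_DEAD)
            return false;
    } while (!atomic_compare_exchange_weak(&r->state, &old, old + 1));
    return true;
}

void wq_ref_put(struct wq_ref *r)
{
    unsigned int old = atomic_fetch_sub(&r->state, 1);

    assert((old & ~REF_DEAD) != 0);  /* must hold a reference to drop one */
}

/* Stop new references from being taken; existing ones stay valid. */
void wq_ref_kill(struct wq_ref *r)
{
    atomic_fetch_or(&r->state, REF_DEAD);
}

/* True once the ref is killed and every outstanding reference is dropped. */
bool wq_ref_quiesced(struct wq_ref *r)
{
    return atomic_load(&r->state) == REF_DEAD;
}

/* Submission path: gated by tryget, reference dropped after submitting. */
int submit_desc(struct wq_ref *r)
{
    if (!wq_ref_tryget_live(r))
        return -1;
    /* The descriptor would be written to the portal here. */
    wq_ref_put(r);
    return 0;
}
```

In this model a shutdown path would call wq_ref_kill() and then wait until
wq_ref_quiesced() returns true before disabling the wq.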


# 435b512d 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: remove detection of device type

Move all static data for each device type into an idxd_driver_data
structure. The data can be attached to the pci_device_id and provided by
the pci probe function. This removes a lot of unnecessary type detection
and setup code.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852988924.2203940.2787590808682466398.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
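
Illustration (not part of the commit): a userspace sketch of table-driven
per-type data replacing runtime type detection. The struct fields, device IDs,
and function names are hypothetical. The 32-byte and 64-byte completion record
sizes and alignments are taken from the IAX configuration commit further down
this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-type static data. */
struct drv_data {
    const char *name_prefix;
    size_t      compl_size;  /* completion record size in bytes */
    size_t      align;       /* required completion record alignment */
};

static const struct drv_data dsa_data = { "dsa", 32, 32 };
static const struct drv_data iax_data = { "iax", 64, 64 };

/* Made-up device IDs, for illustration only. */
#define HYP_DEV_DSA 0x0001
#define HYP_DEV_IAX 0x0002

struct id_entry {
    uint16_t               device;
    const struct drv_data *data;
};

static const struct id_entry id_table[] = {
    { HYP_DEV_DSA, &dsa_data },
    { HYP_DEV_IAX, &iax_data },
};

/* Probe-time lookup: the matching table entry carries the type data. */
const struct drv_data *lookup_driver_data(uint16_t device)
{
    for (size_t i = 0; i < sizeof(id_table) / sizeof(id_table[0]); i++)
        if (id_table[i].device == device)
            return id_table[i].data;
    return NULL;
}

/* Example consumer: check a completion record address against the type. */
int compl_addr_ok(const struct drv_data *d, uintptr_t addr)
{
    assert(d != NULL);
    return (addr & (d->align - 1)) == 0;
}
```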


# 4b73e4eb 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: iax bus removal

There is no need to have an additional bus for the IAX device. The removal
of IAX will change user ABI as /sys/bus/iax will no longer exist.
The iax device will be moved to the dsa bus. The device id for dsa and
iax will now be combined rather than unique for each device type in order
to accommodate the iax devices. This is in preparation for fixing the
sub-driver code for idxd. There's no hardware deployment for Sapphire
Rapids platform yet, which means that users have no reason to have
developed scripts against this ABI. There is some exposure to
released versions of accel-config, but those are being fixed up and
an accel-config upgrade is reasonable to get IAX support. As far as
accel-config is concerned IAX support starts when these devices appear
under /sys/bus/dsa, and old accel-config just assumes that an empty /
missing /sys/bus/iax just means a lack of platform support.

Fixes: f25b463883a8 ("dmaengine: idxd: add IAX configuration support in the IDXD driver")
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852988298.2203940.4529909758034944428.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 04922b74 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix cdev setup and free device lifetime issues

The char device setup and cleanup has device lifetime issues regarding when
parts are initialized and cleaned up. The initialization of struct device is
done incorrectly. device_initialize() needs to be called on the 'struct
device' and then additional changes can be added. The ->release() function
needs to be set up via device_type before dev_set_name() to allow proper
cleanup. The change re-parents the cdev under the wq->conf_dev to get
natural reference inheritance. No known dependency on the old device path exists.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852987721.2203940.1478218825576630810.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# defe49f9 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix group conf_dev lifetime

Remove devm_* allocation and fix group->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
group->conf_dev destruction time.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852987144.2203940.8830315575880047.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 75b91130 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix engine conf_dev lifetime

Remove devm_* allocation and fix engine->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
engine conf_dev destruction time.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852986460.2203940.16603218225412118431.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 7c5dd23e 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix wq conf_dev 'struct device' lifetime

Remove devm_* allocation and fix wq->conf_dev 'struct device' lifetime.
Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE. Add release
functions in order to free the allocated memory for the wq context at
device destruction time.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852985907.2203940.6840120734115043753.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
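
Illustration (not part of the commit): a userspace model of why the memory
must be freed from a release function rather than by a managed allocator. The
context embeds a reference-counted object, and only the last put may free the
container. struct dev, dev_get(), dev_put(), and the other names are
simplified stand-ins, not the kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a reference-counted 'struct device'. */
struct dev {
    int   refs;
    void (*release)(struct dev *d);
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical wq context embedding its configuration device. */
struct wq_ctx {
    int        id;
    struct dev conf_dev;
};

int released_count;  /* counts release calls, for demonstration */

/* Runs when the last reference is dropped: free the whole context. */
void wq_release(struct dev *d)
{
    struct wq_ctx *wq = container_of(d, struct wq_ctx, conf_dev);

    free(wq);
    released_count++;
}

void dev_get(struct dev *d)
{
    d->refs++;
}

void dev_put(struct dev *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0)
        d->release(d);
}

/* Allocate a context holding one initial reference. */
struct wq_ctx *wq_alloc(int id)
{
    struct wq_ctx *wq = calloc(1, sizeof(*wq));

    if (!wq)
        return NULL;
    wq->id = id;
    wq->conf_dev.refs = 1;
    wq->conf_dev.release = wq_release;
    return wq;
}
```

If the context were freed at driver unbind while another holder still had a
reference, that holder would be left with a dangling pointer; tying the free
to the final put avoids this.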


# 47c16ac2 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime

The devm managed lifetime is incompatible with 'struct device' objects that
reside in the idxd context. This is one of a series of patches that clean up the idxd
driver 'struct device' lifetime. Fix idxd->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
appropriate time.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852985319.2203940.4650791514462735368.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# f7f77398 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: use ida for device instance enumeration

The idr is only used for a device id, never to look up context from that
id. Switch to plain ida.

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852984730.2203940.15032482460902003819.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# a39c7cd0 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: removal of pcim managed mmio mapping

The devm managed lifetime is incompatible with 'struct device' objects that
reside in the idxd context. This is one of a series of patches that clean up the idxd
driver 'struct device' lifetime. Remove pcim_* management of the PCI device
and the ioremap of MMIO BAR and replace with unmanaged versions. This is
for consistency of removing all the pcim/devm based calls.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852984150.2203940.8043988289748519056.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 5fc8e85f 15-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: cleanup pci interrupt vector allocation management

The devm managed lifetime is incompatible with 'struct device' objects that
reside in the idxd context. This is one of a series of patches that clean up the idxd
driver 'struct device' lifetime. Remove devm managed pci interrupt vectors
and replace with unmanaged allocators.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852983563.2203940.8116028229124776669.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 6df0e6c5 12-Apr-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: clear MSIX permission entry on shutdown

Add disabling/clearing of MSIX permission entries on device shutdown to
mirror the enabling of the MSIX entries on probe. Current code left the
MSIX enabled and the pasid entries still programmed at device shutdown.

Fixes: 8e50d392652f ("dmaengine: idxd: Add shared workqueue support")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161824457969.882533.6020239898682672311.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 03d939c7 22-Jan-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add module parameter to force disable of SVA

Add a module parameter that overrides the SVA feature enabling. This keeps
the driver in legacy mode even when intel_iommu=sm_on is set. In this mode,
the descriptor fields must be programmed with dma_addr_t from the Linux DMA
API for source, destination, and completion descriptors.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161134110457.4005461.13171197785259115852.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# e2fcd6e4 24-Dec-2020 Zheng Yongjun <zhengyongjun3@huawei.com>

dma: idxd: use DEFINE_MUTEX() for mutex lock

mutex lock can be initialized automatically with DEFINE_MUTEX()
rather than explicitly calling mutex_init().

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20201224132254.30961-1-zhengyongjun3@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 89e3becd 01-Feb-2021 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: check device state before issue command

Add a device state check before executing a command. Without the check, a
command can be issued while the device is in the halt state, causing the driver to
block while waiting for the completion of the command.

Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Fixes: 0d5c10b4c84d ("dmaengine: idxd: add work queue drain support")
Link: https://lore.kernel.org/r/161219313921.2976211.12222625226450097465.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# f25b4638 17-Nov-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add IAX configuration support in the IDXD driver

Add support to allow configuration of Intel Analytics Accelerator (IAX) in
addition to the Intel Data Streaming Accelerator (DSA). The IAX hardware
has the same configuration interface as DSA. The main difference
is the type of operations it performs. We can support the DSA and
IAX devices on the same driver with some tweaks.

IAX has a 64B completion record that needs to be 64B aligned, as opposed to
a 32B completion record that is 32B aligned for DSA. IAX also does not
support token management.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160564555488.1834439.4261958859935360473.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 2f8417a9 30-Oct-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: define table offset multiplier

Convert table offset multiplier magic number to a define.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160407311690.839435.6941865731867828234.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# e4f4d8cd 27-Oct-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Clean up descriptors with fault error

Add code to "complete" a descriptor when the descriptor or its completion
address hit a fault error when SVA mode is being used. This error can be
triggered due to bad programming by the user. A lock is introduced in order
to protect the descriptor completion lists since the fault handler will run
from the system work queue after being scheduled in the interrupt handler.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160382008092.3911367.12766483427643278985.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 8e50d392 27-Oct-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Add shared workqueue support

Add shared workqueue support that includes the support of Shared Virtual
memory (SVM) or in similar terms On Demand Paging (ODP). The shared
workqueue uses the enqcmds command in kernel and will respond with retry if
the workqueue is full. Shared workqueue only works when there is PASID
support from the IOMMU.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160382007499.3911367.26043087963708134.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
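
Illustration (not part of the commit): a userspace model of the retry
behaviour described above. struct fake_swq and fake_enqcmds() are hypothetical
stand-ins for the device and the submission instruction, and the bounded
retry policy is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a shared workqueue with limited capacity. */
struct fake_swq {
    int capacity;
    int occupied;
};

/* Stand-in for the submission instruction: "retry" when the queue is full. */
int fake_enqcmds(struct fake_swq *q)
{
    if (q->occupied >= q->capacity)
        return -EAGAIN;
    q->occupied++;
    return 0;
}

/* Submit one descriptor, retrying a bounded number of times. */
int submit_with_retry(struct fake_swq *q, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        int rc = fake_enqcmds(q);

        if (rc != -EAGAIN)
            return rc;
        /* A real caller might back off here while the device drains. */
    }
    return -EAGAIN;
}
```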


# d98793b5 27-Oct-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix wq config registers offset programming

DSA spec v1.1 [1] was updated to include a stride size register for WQ
configuration that will specify how much space is reserved for the WQ
configuration register set. This change is expected to be in the final
gen1 DSA hardware. Fix the driver to use WQCFG_OFFSET() for all WQ
offset calculation and fixup WQCFG_OFFSET() to use the new calculated
wq size.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160383444959.48058.14249265538404901781.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
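
Illustration (not part of the commit): a sketch of stride-based offset
calculation for per-WQ configuration registers. The encoding of the stride
field (a power of two starting at 32 bytes) and all names here are
assumptions for this example; the DSA specification is the authority on the
actual register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: the smallest per-WQ register set is 32 bytes (1 << 5). */
#define WQCFG_MIN_SHIFT 5u

/* Decode the (assumed) stride field into a byte count. */
uint32_t wqcfg_stride(uint32_t stride_field)
{
    assert(stride_field < 27);  /* keep the shift within 32 bits */
    return 1u << (stride_field + WQCFG_MIN_SHIFT);
}

/*
 * Byte offset of 32-bit register reg_idx within the register set of
 * workqueue wq_id, using the stride instead of a fixed structure size.
 */
uint32_t wqcfg_offset(uint32_t table_base, uint32_t stride,
                      uint32_t wq_id, uint32_t reg_idx)
{
    return table_base + wq_id * stride + reg_idx * (uint32_t)sizeof(uint32_t);
}
```

With a hard-coded 32-byte step, every workqueue after the first would be
addressed incorrectly on hardware that reserves 64 bytes per WQ, which is the
class of bug the stride register avoids.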


# 484f910e 27-Oct-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: fix wq config registers offset programming

DSA spec v1.1 [1] was updated to include a stride size register for WQ
configuration that will specify how much space is reserved for the WQ
configuration register set. This change is expected to be in the final
gen1 DSA hardware. Fix the driver to use WQCFG_OFFSET() for all WQ
offset calculation and fixup WQCFG_OFFSET() to use the new calculated
wq size.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160383444959.48058.14249265538404901781.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# e7184b15 28-Aug-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add support for configurable max wq batch size

Add sysfs attribute max_batch_size to wq in order to allow the max batch
size to be configured on a per wq basis. Add support code to configure
the valid user input on wq enable. This is a performance tuning
parameter.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159865273617.29141.4383066301730821749.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# d7aad555 28-Aug-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add support for configurable max wq xfer size

Add sysfs attribute max_xfer_size to wq in order to allow the max xfer
size to be configured on a per wq basis. Add support code to configure
the valid user input on wq enable. This is a performance tuning
parameter.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159865265404.29141.3049399618578194052.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 0d5c10b4 26-Jun-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add work queue drain support

Add wq drain support. When a wq is being released, it needs to wait for
all in-flight operations to complete. A device control function
idxd_wq_drain() has been added to facilitate this. A wq drain call
is added to the char dev on release to make sure all user operations are
complete. A wq drain is also added before the wq is being disabled.

A drain command can take an unpredictable period of time. Interrupt support
for device commands is added to allow waiting on the command to
finish. If a previous command is in progress, the new submitter can block
until the current command is finished before proceeding. The interrupt
based submission will submit the command and then wait until a command
completion interrupt happens to complete. All commands are moved to the
interrupt based command submission except for the device reset during
probe, which will be polled.

Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/159319502515.69593.13451647706946040301.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 0705107f 15-Jun-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: move submission to sbitmap_queue

Kill the percpu-rwsem for work submission in favor of an sbitmap_queue.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/159225446631.68253.8860709181621260997.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 42d279f9 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add char driver to expose submission portal to userland

Create a char device region that will allow acquisition of user portals in
order to allow applications to submit DMA operations. A char device will be
created per work queue that gets exposed. The workqueue type "user"
is used to mark a work queue for user char device. For example if the
workqueue 0 of DSA device 0 is marked for char device, then a device node
of /dev/dsa/wq0.0 will be created.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026985.73301.976523230037106742.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# 8f47d1a5 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: connect idxd to dmaengine subsystem

Add plumbing for dmaengine subsystem connection. The driver registers a DMA
device per DSA device. The channels are dynamically registered when a
workqueue is configured to be "kernel:dmaengine" type.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026376.73301.13867988830650740445.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# c52ca478 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: add configuration component of driver

The device is left unconfigured when the driver is loaded. Various
components are configured via the driver sysfs attributes. Once
configuration is done, the device can be enabled by writing the device name
to the bind attribute of the device driver sysfs. Disabling can be done
similarly. The individual work queues can also be enabled and disabled
through the bind/unbind attributes. A constructed hierarchy is created
through the struct device framework in order to provide appropriate
configuration points and device state and status. This hierarchy is
presented off the virtual DSA bus.

i.e. /sys/bus/dsa/...

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965024585.73301.6431413676230150589.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>


# bfe1d560 21-Jan-2020 Dave Jiang <dave.jiang@intel.com>

dmaengine: idxd: Init and probe for Intel data accelerators

The idxd driver introduces the Intel Data Streaming Accelerator [1] that will
be available on future Intel Xeon CPUs. One of the kernel access
points for the driver is through the dmaengine subsystem. It will initially
provide the DMA copy service to the kernel.

The main features introduced with this accelerator
are shared virtual memory (SVM) support and descriptor submission using
Intel CPU instructions movdir64b and enqcmds. There will be additional
accelerator devices that share the same driver with variations to
capabilities.

This commit introduces the probe and initialization component of the
driver.

[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>