History log of /linux-master/drivers/misc/cxl/api.c
Revision Date Author Comments
# d8d2af70 02-Apr-2022 Christophe Leroy <christophe.leroy@csgroup.eu>

cxl/ocxl: Prepare cleanup of powerpc's asm/prom.h

powerpc's asm/prom.h brings some headers that it doesn't
need itself.

In order to clean it up, first add missing headers in
users of asm/prom.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a2bae89b280e7a7cb87889635d9911d6a245e780.1648833388.git.christophe.leroy@csgroup.eu


# 2874c5fd 27-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 52418c8f 25-Mar-2019 David Howells <dhowells@redhat.com>

vfs: Convert cxl to use the new mount API

Convert the cxl filesystem to the new internal mount API as the old
one will be obsoleted and removed. This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 1f58bb18 20-May-2019 Al Viro <viro@zeniv.linux.org.uk>

mount_pseudo(): drop 'name' argument, switch to d_make_root()

Once upon a time we used to set ->d_name of e.g. pipefs root
so that d_path() on pipes would work. These days it's
completely pointless - dentries of pipes are not even connected
to pipefs root. However, mount_pseudo() had set the root
dentry name (passed as the second argument) and callers
kept inventing names to pass to it. Including those that
didn't *have* any non-root dentries to start with...

All of that had been pointless for about 8 years now; it's
time to get rid of that cargo-culting...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 619a6d16 20-May-2019 Al Viro <viro@zeniv.linux.org.uk>

cxl: don't bother with dentry_operations

default will do

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 908f7ca1 09-Jun-2018 Al Viro <viro@zeniv.linux.org.uk>

cxl_getfile(): switch to alloc_file_pseudo()

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# c9c554f2 11-Jul-2018 Al Viro <viro@zeniv.linux.org.uk>

alloc_file(): switch to passing O_... flags instead of FMODE_... mode

... so that it could set both ->f_flags and ->f_mode, without callers
having to set ->f_flags manually.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# d202797f 09-Jun-2018 Al Viro <viro@zeniv.linux.org.uk>

cxl_getfile(): fix double-iput() on alloc_file() failures

Doing iput() after path_put() is wrong.

Cc: stable@vger.kernel.org
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 82c6ae67 27-Jun-2018 Alastair D'Silva <alastair@d-silva.org>

Revert "cxl: Add kernel APIs to get & set the max irqs per context"

Remove abandonned capi support for the Mellanox CX4.

This reverts commit 79384e4b71240abf50c375eea56060b0d79c242a.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 17d29039 27-Jun-2018 Alastair D'Silva <alastair@d-silva.org>

Revert "cxl: Add preliminary workaround for CX4 interrupt limitation"

Remove abandonned capi support for the Mellanox CX4.

This reverts commit cbce0917e2e47d4bf5aa3b5fd6b1247f33e1a126.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 0cfd7335 27-Jun-2018 Alastair D'Silva <alastair@d-silva.org>

Revert "cxl: Add support for interrupts on the Mellanox CX4"

Remove abandonned capi support for the Mellanox CX4.

This reverts commit a2f67d5ee8d950caaa7a6144cf0bfb256500b73e.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# c5828150 27-Jun-2018 Alastair D'Silva <alastair@d-silva.org>

Revert "cxl: Add kernel API to allow a context to operate with relocate disabled"

Remove abandonned capi support for the Mellanox CX4.
The symbol 'cxl_set_translation_mode' is never called, so
ctx->real_mode is always false.

This reverts commit 7a0d85d313c2066712e530e668bc02bb741a685c.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# afc9a42b 03-Jul-2017 Al Viro <viro@zeniv.linux.org.uk>

the rest of drivers/*: annotate ->poll() instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 03b8abed 03-Sep-2017 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Enable global TLBIs for cxl contexts

The PSL and nMMU need to see all TLB invalidations for the memory
contexts used on the adapter. For the hash memory model, it is done by
making all TLBIs global as soon as the cxl driver is in use. For
radix, we need something similar, but we can refine and only convert
to global the invalidations for contexts actually used by the device.

The new mm_context_add_copro() API increments the 'active_cpus' count
for the contexts attached to the cxl adapter. As soon as there's more
than 1 active cpu, the TLBIs for the context become global. Active cpu
count must be decremented when detaching to restore locality if
possible and to avoid overflowing the counter.

The hash memory model support is somewhat limited, as we can't
decrement the active cpus count when mm_context_remove_copro() is
called, because we can't flush the TLB for a mm on hash. So TLBIs
remain global on hash.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
Tested-by: Alistair Popple <alistair@popple.id.au>
[mpe: Fold in updated comment on the barrier from Fred]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 197267d0 29-Aug-2017 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Fix driver use count

cxl keeps a driver use count, which is used with the hash memory model
on p8 to know when to upgrade local TLBIs to global and to trigger
callbacks to manage the MMU for PSL8.

If a process opens a context and closes without attaching or fails the
attachment, the driver use count is never decremented. As a
consequence, TLB invalidations remain global, even if there are no
active cxl contexts.

We should increment the driver use count when the process is attaching
to the cxl adapter, and not on open. It's not needed before the
adapter starts using the context and the use count is decremented on
the detach path, so it makes more sense.

It affects only the user api. The kernel api is already doing The
Right Thing.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.2+
Fixes: 7bb5d91a4dda ("cxl: Rework context lifetimes")
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 6dd2d234 07-Apr-2017 Christophe Lombard <clombard@linux.vnet.ibm.com>

cxl: Keep track of mm struct associated with a context

The mm_struct corresponding to the current task is acquired each time
an interrupt is raised. So to simplify the code, we only get the
mm_struct when attaching an AFU context to the process.
The mm_count reference is increased to ensure that the mm_struct can't
be freed. The mm_struct will be released when the context is detached.
A reference on mm_users is not kept to avoid a circular dependency if
the process mmaps its cxl mmio and forget to unmap before exiting.
The field glpid (pid of the group leader associated with the pid), of
the structure cxl_context, is removed because it's no longer useful.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 0e17166d 19-Jan-2017 Greg Kurz <groug@kaod.org>

cxl: Drop unused header asm/pnv-pci.h

The kernel API does not use anything from this header file.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# bdecf76e 18-Nov-2016 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Fix coredump generation when cxl_get_fd() is used

If a process dumps core while owning a cxl file descriptor obtained
from an AFU driver (e.g. cxlflash) through the cxl_get_fd() API, the
following error occurs:

[ 868.027591] Unable to handle kernel paging request for data at address ...
[ 868.027778] Faulting instruction address: 0xc00000000035edb0
cpu 0x8c: Vector: 300 (Data Access) at [c000003c688275e0]
pc: c00000000035edb0: elf_core_dump+0xd60/0x1300
lr: c00000000035ed80: elf_core_dump+0xd30/0x1300
sp: c000003c68827860
msr: 9000000100009033
dar: c
dsisr: 40000000
current = 0xc000003c68780000
paca = 0xc000000001b73200 softe: 0 irq_happened: 0x01
pid = 46725, comm = hxesurelock
enter ? for help
[c000003c68827a60] c00000000036948c do_coredump+0xcec/0x11e0
[c000003c68827c20] c0000000000ce9e0 get_signal+0x540/0x7b0
[c000003c68827d10] c000000000017354 do_signal+0x54/0x2b0
[c000003c68827e00] c00000000001777c do_notify_resume+0xbc/0xd0
[c000003c68827e30] c000000000009838 ret_from_except_lite+0x64/0x68
--- Exception: 300 (Data Access) at 00003fff98ad2918

The root cause is that the address_space structure for the file
doesn't define a 'host' member.

When cxl allocates a file descriptor, it's using the anonymous inode
to back the file, but allocates a private address_space for each
context. The private address_space allows to track memory allocation
for each context. cxl doesn't define the 'host' member of the address
space, i.e. the inode. We don't want to define it as the anonymous
inode, since there's no longer a 1-to-1 relation between address_space
and inode.

To fix it, instead of using the anonymous inode, we introduce a simple
pseudo filesystem so that cxl can allocate its own inodes. So we now
have one inode for each file and address_space. The pseudo filesystem
is only mounted on the first allocation of a file descriptor by
cxl_get_fd().

Tested with cxlflash.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 28e323e5 30-Oct-2016 Christophe Jaillet <christophe.jaillet@wanadoo.fr>

cxl: Fix error handling in _cxl_cx4_setup_msi_irqs()

'cxl_dev_context_init()' returns an error pointer in case of error, not
NULL. So test it with IS_ERR.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 5cd4f5ce 30-Oct-2016 Christophe Jaillet <christophe.jaillet@wanadoo.fr>

cxl: Fix memory allocation failure test

'cxl_context_alloc()' does not return an error pointer. It is just a
shortcut for a call to 'kzalloc' with 'sizeof(struct cxl_context)' as the
size parameter.

So its return value should be compared with NULL.
While fixing it, simplify a bit the code.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# a05b82d5 21-Oct-2016 Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

cxl: Fix leaking pid refs in some error paths

In some error paths in functions cxl_start_context and
afu_ioctl_start_work pid references to the current & group-leader tasks
can leak after they are taken. This patch fixes these error paths to
release these pid references before exiting the error path.

Fixes: 7b8ad495d592 ("cxl: Fix DSI misses when the context owning task exits")
Cc: stable@vger.kernel.org # v4.5+
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 70b565bb 14-Oct-2016 Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

cxl: Prevent adapter reset if an active context exists

This patch prevents resetting the cxl adapter via sysfs in presence of
one or more active cxl_context on it. This protects against an
unrecoverable error caused by PSL owning a dirty cache line even after
reset and host tries to touch the same cache line. In case a force reset
of the card is required irrespective of any active contexts, the int
value -1 can be stored in the 'reset' sysfs attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active context attached to the card
, which is checked against '0' before proceeding with the reset. To
prevent against a race condition where a context is activated just after
reset check is performed, the contexts_num is atomically set to '-1'
after reset-check to indicate that no more contexts can be activated on
the card anymore.

Before activating a context we atomically test if contexts_num is
non-negative and if so, increment its value by one. In case the value of
contexts_num is negative then it indicates that the card is about to be
reset and context activation is error-ed out at that point.

Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card")
Cc: stable@vger.kernel.org # v4.0+
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# a2f67d5e 13-Jul-2016 Ian Munsie <imunsie@au1.ibm.com>

cxl: Add support for interrupts on the Mellanox CX4

The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
interrupts are routed from the networking hardware to the XSL using the
MSIX table, and from there will be transformed back into an MSIX
interrupt using the cxl style interrupts (i.e. using IVTE entries and
ranges to map a PE and AFU interrupt number to an MSIX address).

We want to hide the implementation details of cxl interrupts as much as
possible. To this end, we use a special version of the MSI setup &
teardown routines in the PHB while in cxl mode to allocate the cxl
interrupts and configure the IVTE entries in the process element.

This function does not configure the MSIX table - the CX4 card uses a
custom format in that table and it would not be appropriate to fill that
out in generic code. The rest of the functionality is similar to the
"Full MSI-X mode" described in the CAIA, and this could be easily
extended to support other adapters that use that mode in the future.

The interrupts will be associated with the default context. If the
maximum number of interrupts per context has been limited (e.g. by the
mlx5 driver), it will automatically allocate additional kernel contexts
to associate extra interrupts as required. These contexts will be
started using the same WED that was used to start the default context.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# cbce0917 13-Jul-2016 Ian Munsie <imunsie@au1.ibm.com>

cxl: Add preliminary workaround for CX4 interrupt limitation

The Mellanox CX4 has a hardware limitation where only 4 bits of the
AFU interrupt number can be passed to the XSL when sending an interrupt,
limiting it to only 15 interrupts per context (AFU interrupt number 0 is
invalid).

In order to overcome this, we will allocate additional contexts linked
to the default context as extra address space for the extra interrupts -
this will be implemented in the next patch.

This patch adds the preliminary support to allow this, by way of adding
a linked list in the context structure that we use to keep track of the
contexts dedicated to interrupts, and an API to simultaneously iterate
over the related context structures, AFU interrupt numbers and hardware
interrupt numbers. The point of using a single API to iterate these is
to hide some of the details of the iteration from external code, and to
reduce the number of APIs that need to be exported via base.c to allow
built in code to call.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 79384e4b 13-Jul-2016 Ian Munsie <imunsie@au1.ibm.com>

cxl: Add kernel APIs to get & set the max irqs per context

These APIs will be used by the Mellanox CX4 support. While they function
standalone to configure existing behaviour, their primary purpose is to
allow the Mellanox driver to inform the cxl driver of a hardware
limitation, which will be used in a future patch.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 317f5ef1 13-Jul-2016 Ian Munsie <imunsie@au1.ibm.com>

cxl: Add support for using the kernel API with a real PHB

This hooks up support for using the kernel API with a real PHB. After
the AFU initialisation has completed it calls into the PHB code to pass
it the AFU that will be used by other peer physical functions on the
adapter.

The cxl_pci_to_afu API is extended to work with peer PCI devices,
retrieving the peer AFU from the PHB. This API may also now return an
error if it is called on a PCI device that is not associated with either
a cxl vPHB or a peer PCI device to an AFU, and this error is propagated
down.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# ad42de85 24-Jun-2016 Michael Neuling <mikey@neuling.org>

cxl: Add set and get private data to context struct

This provides AFU drivers a means to associate private data with a cxl
context. This is particularly intended for make the new callbacks for
driver specific events easier for AFU drivers to use, as they can easily
get back to any private data structures they may use.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# b810253b 23-Jun-2016 Philippe Bergheaud <felix@linux.vnet.ibm.com>

cxl: Add mechanism for delivering AFU driver specific events

This adds an afu_driver_ops structure with fetch_event() and
event_delivered() callbacks. An AFU driver such as cxlflash can fill
this out and associate it with a context to enable passing custom AFU
specific events to userspace.

This also adds a new kernel API function cxl_context_pending_events(),
that the AFU driver can use to notify the cxl driver that new specific
events are ready to be delivered, and wake up anyone waiting on the
context wait queue.

The current count of AFU driver specific events is stored in the field
afu_driver_events of the context structure.

The cxl driver checks the afu_driver_events count during poll, select,
read, etc. calls to check if an AFU driver specific event is pending,
and calls fetch_event() to obtain and deliver that event. This way, the
cxl driver takes care of all the usual locking semantics around these
calls and handles all the generic cxl events, so that the AFU driver
only needs to worry about it's own events.

fetch_event() return a struct cxl_event_afu_driver_reserved, allocated
by the AFU driver, and filled in with the specific event information and
size. Total event size (header + data) should not be greater than
CXL_READ_MIN_SIZE (4K).

Th cxl driver prepends an appropriate cxl event header, copies the event
to userspace, and finally calls event_delivered() to return the status of
the operation to the AFU driver. The event is identified by the context
and cxl_event_afu_driver_reserved pointers.

Since AFU drivers provide their own means for userspace to obtain the
AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file
descriptor to obtain the AFU file descriptor) and the generic cxl driver
will never use this event, the ABI of the event is up to each individual
AFU driver.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 292841b0 23-May-2016 Ian Munsie <imunsie@au1.ibm.com>

cxl: Update process element after allocating interrupts

In the kernel API, it is possible to attempt to allocate AFU interrupts
after already starting a context. Since the process element structure
used by the hardware is only filled out at the time the context is
started, it will not be updated with the interrupt numbers that have
just been allocated and therefore AFU interrupts will not work unless
they were allocated prior to starting the context.

This can present some difficulties as each CAPI enabled PCI device in
the kernel API has a default context, which may need to be started very
early to enable translations, potentially before interrupts can easily
be set up.

This patch makes the API more flexible to allow interrupts to be
allocated after a context has already been started and takes care of
updating the PE structure used by the hardware and notifying it to
discard any cached copy it may have.

The update is currently performed via a terminate/remove/add sequence.
This is necessary on some hardware such as the XSL that does not
properly support the update LLCMD.

Note that this is only supported on powernv at present - attempting to
perform this ordering on PowerVM will raise a warning.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 7a0d85d3 06-May-2016 Ian Munsie <imunsie@au1.ibm.com>

cxl: Add kernel API to allow a context to operate with relocate disabled

cxl devices typically access memory using an MMU in much the same way as
the CPU, and each context includes a state register much like the MSR in
the CPU. Like the CPU, the state register includes a bit to enable
relocation, which we currently always enable.

In some cases, it may be desirable to allow a device to access memory
using real addresses instead of effective addresses, so this adds a new
API, cxl_set_translation_mode, that can be used to disable relocation
on a given kernel context. This can allow for the creation of a special
privileged context that the device can use if it needs relocation
disabled, and can use regular contexts at times when it needs relocation
enabled.

This interface is only available to users of the kernel API for obvious
reasons, and will never be supported in a virtualised environment.

This will be used by the upcoming cxl support in the mlx5 driver.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# d88e397f 08-Mar-2016 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Remove dead code

Function cxl_get_phys_dev() was removed from the kernel API by a
previous patch, but it's actually dead code. Remove it.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 0d3a13fb 03-Mar-2016 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Remove cxl_get_phys_dev() kernel API

The cxl_get_phys_dev() API returns a struct device pointer which could
belong to either a struct pci_dev (bare-metal) or platform_device
(powerVM). To avoid potential problems in drivers, remove that API. It
was introduced to allow drivers to read the VPD of the adapter, but
the cxl driver now provides the cxl_pci_read_adapter_vpd() API for
that purpose.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# d601ea91 03-Mar-2016 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Support the cxl kernel API from a guest

Like on bare-metal, the cxl driver creates a virtual PHB and a pci
device for the AFU. The configuration space of the device is mapped to
the configuration record of the AFU.

Reuse the code defined in afu_cr_read8|16|32() when reading the
configuration space of the AFU device.

Even though the (virtual) AFU device is a pci device, the adapter is
not. So a driver using the cxl kernel API cannot read the VPD of the
adapter through the usual PCI interface. Therefore, we add a call to
the cxl kernel API:
ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count);

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 14baf4d9 03-Mar-2016 Christophe Lombard <clombard@linux.vnet.ibm.com>

cxl: Add guest-specific code

The new of.c file contains code to parse the device tree to find out
about cxl adapters and AFUs.

guest.c implements the guest-specific callbacks for the backend API.

The process element ID is not known until the context is attached, so
we have to separate the context ID assigned by the cxl driver from the
process element ID visible to the user applications. In bare-metal,
the 2 IDs match.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
[mpe: Fix SMP=n build, fix PSERIES=n build, minor whitespace fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 5be587b1 03-Mar-2016 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Introduce implementation-specific API

The backend API (in cxl.h) lists some low-level functions whose
implementation is different on bare-metal and in a guest. Each
environment implements its own functions, and the common code uses
them through function pointers, defined in cxl_backend_ops

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# cca44c01 03-Mar-2016 Frederic Barrat <fbarrat@linux.vnet.ibm.com>

cxl: Define process problem state area at attach time only

CXL kernel API was defining the process problem state area during
context initialization, making it possible to map the problem state
area before attaching the context. This won't work on a powerVM
guest. So force the logical behavior, like in userspace: attach first,
then map the problem state area.
Remove calls to cxl_assign_psn_space during init. The function is
already called on the attach paths.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 7b8ad495 24-Nov-2015 Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

cxl: Fix DSI misses when the context owning task exits

Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we
store the pid of the current task_struct and use it to get pointer to
the mm_struct of the process, while processing page or segment faults
from the capi card. However this causes issues when the thread that had
originally issued the start-work ioctl exits in which case the stored
pid is no more valid and the cxl driver is unable to handle faults as
the mm_struct corresponding to process is no more accessible.

This patch fixes this issue by using the mm_struct of the next alive
task in the thread group. This is done by iterating over all the tasks
in the thread group starting from thread group leader and calling
get_task_mm on each one of them. When a valid mm_struct is obtained the
pid of the associated task is stored in the context replacing the
exiting one for handling future faults.

The patch introduces a new function named get_mem_context that checks if
the current task pointed to by ctx->pid is dead? If yes it performs the
steps described above. Also a new variable cxl_context.glpid is
introduced which stores the pid of the thread group leader associated
with the context owning task.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com>
Suggested-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 1b5df59e 15-Nov-2015 Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

cxl: Fix possible idr warning when contexts are released

An idr warning is reported when a context is release after the capi card
is unbound from the cxl driver via sysfs. Below are the steps to
reproduce:

1. Create multiple afu contexts in an user-space application using libcxl.
2. Unbind capi card from cxl using command of form
echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind
3. Exit/kill the application owning afu contexts.

After above steps a warning message is usually seen in the kernel logs
of the form "idr_remove called for id=<context-id> which is not
allocated."

This is caused by the function cxl_release_afu which destroys the
contexts_idr table. So when a context is release no entry for context pe
is found in the contexts_idr table and idr code prints this warning.

This patch fixes this issue by increasing & decreasing the ref-count on
the afu device when a context is initialized or when its freed
respectively. This prevents the afu from being released until all the
afu contexts have been released. The patch introduces two new functions
namely cxl_afu_get/put that manage the ref-count on the afu device.

Also the patch removes code inside cxl_dev_context_init that increases ref
on the afu device as its guaranteed to be alive during this function.

Reported-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 8dde152e 29-Sep-2015 Andrew Donnellan <andrew.donnellan@au1.ibm.com>

cxl: fix leak of IRQ names in cxl_free_afu_irqs()

cxl_free_afu_irqs() doesn't free IRQ names when it releases an AFU's IRQ
ranges. The userspace API equivalent in afu_release_irqs() calls
afu_irq_name_free() to release the IRQ names.

Call afu_irq_name_free() in cxl_free_afu_irqs() to release the IRQ names.
Make afu_irq_name_free() non-static to allow this.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 55e07668 27-Aug-2015 Ian Munsie <imunsie@au1.ibm.com>

cxl: Fix force unmapping mmaps of contexts allocated through the kernel api

The cxl user api uses the address_space associated with the file when we
need to force unmap all cxl mmap regions (e.g. on eeh, driver detach,
etc). Currently, contexts allocated through the kernel api do not do
this and instead skip the mmap invalidation, potentially allowing them
to poke at the hardware after such an event, which may cause all sorts
of trouble.

This patch allocates an address_space for cxl contexts allocated through
the kernel api so that the same invalidate path will for these contexts
as well. We don't use the anonymous inode's address_space, as doing so
could invalidate any mmaps of completely unrelated drivers using
anonymous file descriptors.

This patch also introduces a kernelapi flag, so we know when freeing the
context if the address_space was allocated by us and needs to be freed.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# af2a50bb 27-Aug-2015 Ian Munsie <imunsie@au1.ibm.com>

cxl: Fix + cleanup error paths in cxl_dev_context_init

If the cxl_context_alloc() call fails, we return immediately without
releasing the reference on the AFU device, allowing it to leak.

This patch switches to using goto style error handling so that the
device is released in common code for both error paths, and will also
simplify things if we add additional initialisation in this function in
the future.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 7c26b9cf 18-Aug-2015 Andrew Donnellan <andrew.donnellan@au1.ibm.com>

cxl: Allow release of contexts which have been OPENED but not STARTED

If we open a context but do not start it (either because we do not attempt
to start it, or because it fails to start for some reason), we are left
with a context in state OPENED. Previously, cxl_release_context() only
allowed releasing contexts in state CLOSED, so attempting to release an
OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state with a value lower than STARTED, i.e.
OPENED or CLOSED (we can't release a STARTED context as it's currently
using the hardware, and we assume that contexts in any new states which may
be added in future with a value higher than STARTED are also unsafe to
release).

Cc: stable@vger.kernel.org
Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 13e68d8b 14-Aug-2015 Daniel Axtens <dja@axtens.net>

cxl: Allow the kernel to trust that an image won't change on PERST.

Provide a kernel API and a sysfs entry which allow a user to specify
that when a card is PERSTed, it's image will stay the same, allowing
it to participate in EEH.

cxl_reset is used to reflash the card. In that case, we cannot safely
assert that the image will not change. Therefore, disallow cxl_reset
if the flag is set.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 3f8dc44d 06-Jul-2015 Michael Neuling <mikey@neuling.org>

cxl: Fix refcounting in kernel API

Currently the kernel API AFU dev refcounting is done on context start and stop.
This patch moves this refcounting to context init and release, bringing it
inline with how the userspace API does it.

Without this we've seen the refcounting on the AFU get out of whack between the
user and kernel API usage. This causes the AFU structures to be freed when
they are actually still in use.

This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 6f7f0b3d 27-May-2015 Michael Neuling <mikey@neuling.org>

cxl: Add AFU virtual PHB and kernel API

This patch does two things.

Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER
Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB). This
in in addition to the PSL being a PCI device itself.

As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can
provide an AFU configuration. This AFU configuration recored is architected to
be the same as a PCI config space.

This patch sets discovers the AFU configuration records, provides AFU config
space read/write functions to these configuration records. It then enumerates
the PCI bus. It also hooks in PCI ops where appropriate. It also destroys the
vPHB when the physical card is removed.

Secondly, it add an in kernel API for AFU to use CXL. AFUs must present a
driver that firstly binds as a PCI device. This PCI device can then be using
to do CXL specific operations (that can't sit in the PCI ops) using this API.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>