#
001c5d19 |
|
03-Apr-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Consolidate dport access_coordinate ->hb_coord and ->sw_coord into ->coord The driver stores access_coordinate for host bridge in ->hb_coord and switch CDAT access_coordinate in ->sw_coord. Since neither of these access_coordinate clobber each other, the variable name can be consolidated into ->coord to simplify the code. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
#
51293c56 |
|
03-Apr-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Fix incorrect region perf data calculation Current math in cxl_region_perf_data_calculate divides the latency by 1000 every time the function gets called. This causes the region latency to be divided by 1000 per memory device and the math is incorrect. This is user visible as the latency access_coordinate exposed via sysfs will show incorrect latency data. Normalize values from CDAT to nanoseconds. Adjust sub-nanoseconds latency to at least 1. Remove adjustment of perf numbers from the generic target since hmat handling code has already normalized those numbers. Now all computation and stored numbers should be in nanoseconds. cxl_hb_get_perf_coordinates() is removed and HB coords are calculated in the port access_coordinate calculation path since it no longer need to be treated special. Fixes: 3d9f4a197230 ("cxl/region: Calculate performance data for a region") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
#
067353a4 |
|
08-Mar-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl/region: Add memory hotplug notifier for cxl region When the CXL region is formed, the driver computes the performance data for the region. However this data is not available at the node data collection that has been populated by the HMAT during kernel initialization. Add a memory hotplug notifier to update the access coordinates to the 'struct memory_target' context kept by the HMAT_REPORTING code. Add CXL_CALLBACK_PRI for a memory hotplug callback priority. Set the priority number to be called before HMAT_CALLBACK_PRI. The CXL update must happen before hmat_callback(). A new HMAT_REPORTING helper hmat_update_target_coordinates() is added in order to allow CXL to update the memory_target access coordinates. A new ext_updated member is added to the memory_target to indicate that the access coordinates within the memory_target has been updated by an external agent such as CXL. This prevents data being overwritten by the hmat_update_target_attrs() triggered by hmat_callback(). Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Huang, Ying <ying.huang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-12-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3d9f4a19 |
|
08-Mar-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl/region: Calculate performance data for a region Calculate and store the performance data for a CXL region. Find the worst read and write latency for all the included ranges from each of the devices that attributes to the region and designate that as the latency data. Sum all the read and write bandwidth data for each of the device region and that is the total bandwidth for the region. The perf list is expected to be constructed before the endpoint decoders are registered and thus there should be no early reading of the entries from the region assemble action. The calling of the region qos calculate function is under the protection of cxl_dpa_rwsem and will ensure that all DPA associated work has completed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-10-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
863027d4 |
|
08-Mar-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Split out host bridge access coordinates The difference between access class 0 and access class 1 for 'struct access_coordinate', if any, is that class 0 is for the distance from the target to the closest initiator and that class 1 is for the distance from the target to the closest CPU. For CXL memory, the nearest initiator may not necessarily be a CPU node. The performance path from the CXL endpoint to the host bridge should remain the same. However, the numbers extracted and stored from HMAT is the difference for the two access classes. Split out the performance numbers for the host bridge (generic target) from the calculation of the entire path in order to allow calculation of both access classes for a CXL region. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-7-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
032f7b37 |
|
08-Mar-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Split out combine_coordinates() for common shared usage Refactor the common code of combining coordinates in order to reduce code. Create a new function cxl_cooordinates_combine() it combine two 'struct access_coordinate'. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-6-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
bd98cbbb |
|
08-Mar-2024 |
Dave Jiang <dave.jiang@intel.com> |
ACPI: HMAT / cxl: Add retrieval of generic port coordinates for both access classes Update acpi_get_genport_coordinates() to allow retrieval of both access classes of the 'struct access_coordinate' for a generic target. The update will allow CXL code to compute access coordinates for both access class. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-5-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
cc214417 |
|
06-Feb-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Fix sysfs export of qos_class for memdev Current implementation exports only to /sys/bus/cxl/devices/.../memN/qos_class. With both ram and pmem exposed, the second registered sysfs attribute is rejected as duplicate. It's not possible to create qos_class under the dev_groups via the driver due to the ram and pmem sysfs sub-directories already created by the device sysfs groups. Move the ram and pmem qos_class to the device sysfs groups and add a call to sysfs_update() after the perf data are validated so the qos_class can be visible. The end results should be /sys/bus/cxl/devices/.../memN/ram/qos_class and /sys/bus/cxl/devices/.../memN/pmem/qos_class. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-4-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
44cd71ef |
|
05-Jan-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Convert find_cxl_root() to return a 'struct cxl_root *' Commit 790815902ec6 ("cxl: Add support for _DSM Function for retrieving QTG ID") introduced 'struct cxl_root', however all usages have been worked indirectly through cxl_port. Refactor code such as find_cxl_root() function to use 'struct cxl_root' directly. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170449246044.3779673.13035770941393418591.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
98856b2e |
|
05-Jan-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Introduce put_cxl_root() helper Add a helper function put_cxl_root() to maintain symmetry for find_cxl_root() function instead of relying on open coding of the put_device() in order to dereference the 'struct device' that happens via get_device() in find_cxl_root(). Suggested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/170449245417.3779673.4566146351673989387.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5459e186 |
|
21-Dec-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Fix missing target list lock cxl_port_setup_targets() modifies the ->targets[] array of a switch decoder. target_list_show() expects to be able to emit a coherent snapshot of that array by "holding" ->target_lock for read. The target_lock is held for write during initialization of the ->targets[] array, but it is not held for write during cxl_port_setup_targets(). The ->target_lock() predates the introduction of @cxl_region_rwsem. That semaphore protects changes to host-physical-address (HPA) decode which is precisely what writes to a switch decoder's target list affects. Replace ->target_lock with @cxl_region_rwsem. Now the side-effect of snapshotting a unstable view of a decoder's target list is likely benign so the Fixes: tag is presumptive. Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
14a6960b |
|
21-Dec-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Add helper function that calculate performance data for downstream ports The CDAT information from the switch, Switch Scoped Latency and Bandwidth Information Structure (SSLBIS), is parsed and stored under a cxl_dport based on the correlated downstream port id from the SSLBIS entry. Walk the entire CXL port paths and collect all the performance data. Also pick up the link latency number that's stored under the dports. The entire path PCIe bandwidth can be retrieved using the pcie_bandwidth_available() call. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319623824.2212653.10302079766473698427.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
1037b82f |
|
21-Dec-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Store the access coordinates for the generic ports Each CXL host bridge is represented by an ACPI0016 device. A generic port device handle that is an ACPI device is represented by a string of ACPI0016 device HID and UID. Create a device handle from the ACPI device and retrieve the access coordinates from the stored memory targets. The access coordinates are stored under the cxl_dport that is associated with the CXL host bridge. The access coordinates struct is dynamically allocated under cxl_dport in order for code later on to detect whether the data exists or not. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319623196.2212653.17916695743464172534.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4d07a053 |
|
21-Dec-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Calculate and store PCI link latency for the downstream ports The latency is calculated by dividing the flit size over the bandwidth. Add support to retrieve the flit size for the CXL switch device and calculate the latency of the PCIe link. Cache the latency number with cxl_dport. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319621931.2212653.6800240203604822886.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
79081590 |
|
21-Dec-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Add support for _DSM Function for retrieving QTG ID CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM) Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires an input of an ACPI package with 4 dwords (read latency, write latency, read bandwidth, write bandwidth). The call returns a package with 1 WORD that provides the max supported QTG ID and a package that may contain 0 or more WORDs as the recommended QTG IDs in the recommended order. Create a cxl_root container for the root cxl_port and provide a callback ->get_qos_class() in order to retrieve the QoS class. For the ACPI case, the _DSM helper is used to retrieve the QTG ID and returned. A devm_cxl_add_root() function is added for root port setup and registration of the cxl_root callback operation(s). Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/170319621294.2212653.1649682083061569256.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
80aa780d |
|
21-Dec-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Add callback to parse the SSLBIS subtable from CDAT Provide a callback to parse the Switched Scoped Latency and Bandwidth Information Structure (SSLBIS) in the CDAT structures. The SSLBIS contains the bandwidth and latency information that's tied to the CXL switch that the data table has been read from. The extracted values are stored to the cxl_dport correlated by the port_id depending on the SSLBIS entry. Coherent Device Attribute Table 1.03 2.1 Switched Scoped Latency and Bandwidth Information Structure (DSLBIS) Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319620635.2212653.5194389158785365150.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
ad6f04c0 |
|
21-Dec-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Add callback to parse the DSMAS subtables from CDAT Provide a callback function to the CDAT parser in order to parse the Device Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the DPA range and its associated attributes in each entry. See the CDAT specification for details. The device handle and the DPA range is saved and to be associated with the DSLBIS locality data when the DSLBIS entries are parsed. The xarray is a local variable. When the total path performance data is calculated and storred this xarray can be discarded. Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity Structure (DSMAS) Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319619355.2212653.2675953129671561293.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
529c0a44 |
|
12-Oct-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute Export the QoS Throttling Group ID from the CXL Fixed Memory Window Structure (CFMWS) under the root decoder sysfs attributes as qos_class. CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS) cxl cli will use this id to match with the _DSM retrieved id for a hot-plugged CXL memory device DPA memory range to make sure that the DPA range is under the right CFMWS window. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169713681699.2205276.14475306324720093079.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
458ba818 |
|
16-Oct-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Add cxl_decoders_committed() helper Add a helper to retrieve the number of decoders committed for the port. Replace all the open coding of the calculation with the helper. Link: https://lore.kernel.org/linux-cxl/651c98472dfed_ae7e729495@dwillia2-xfh.jf.intel.com.notmuch/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/169747906849.272156.1729290904857372335.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e8db0701 |
|
18-Oct-2023 |
Robert Richter <rrichter@amd.com> |
cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm struct cxl_register_map carries a @dev parameter for devm operations. Simplify the function interface to use that instead of a separate @dev argument. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-21-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6c5f3aac |
|
18-Oct-2023 |
Terry Bowman <terry.bowman@amd.com> |
cxl/pci: Map RCH downstream AER registers for logging protocol errors The restricted CXL host (RCH) error handler will log protocol errors using AER and RAS status registers. The AER and RAS registers need to be virtually memory mapped before enabling interrupts. Create the initializer function devm_cxl_setup_parent_dport() for this when the endpoint is connected with the dport. The initialization sets up the RCH RAS and AER mappings. Add 'struct cxl_regs' to 'struct cxl_dport' for saving a pointer to the RCH downstream port's AER and RAS registers. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-15-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
f05fd10d |
|
27-Oct-2023 |
Robert Richter <rrichter@amd.com> |
cxl/pci: Add RCH downstream port AER register discovery Restricted CXL host (RCH) downstream port AER information is not currently logged while in the error state. One problem preventing the error logging is the AER and RAS registers are not accessible. The CXL driver requires changes to find RCH downstream port AER and RAS registers for purpose of error logging. RCH downstream ports are not enumerated during a PCI bus scan and are instead discovered using system firmware, ACPI in this case.[1] The downstream port is implemented as a Root Complex Register Block (RCRB). The RCRB is a 4k memory block containing PCIe registers based on the PCIe root port.[2] The RCRB includes AER extended capability registers used for reporting errors. Note, the RCH's AER Capability is located in the RCRB memory space instead of PCI configuration space, thus its register access is different. Existing kernel PCIe AER functions can not be used to manage the downstream port AER capabilities and RAS registers because the port was not enumerated during PCI scan and the registers are not PCI config accessible. Discover RCH downstream port AER extended capability registers. Use MMIO accesses to search for extended AER capability in RCRB register space. [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-12-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a2fcb84a |
|
18-Oct-2023 |
Robert Richter <rrichter@amd.com> |
cxl/port: Remove Component Register base address from struct cxl_port The Component Register base address @component_reg_phys is no longer used after the rework of the Component Register setup which now uses struct member @reg_map instead. Remove the base address. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-10-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d8add492 |
|
18-Oct-2023 |
Robert Richter <rrichter@amd.com> |
cxl/port: Rename @comp_map to @reg_map in struct cxl_register_map Name the field @reg_map, because @reg_map->host will be used for mapping operations beyond component registers (i.e. AER registers). This is valid for all occurrences of @comp_map. Change them all. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-5-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
dd22581f |
|
18-Oct-2023 |
Robert Richter <rrichter@amd.com> |
cxl/core/regs: Rename @dev to @host in struct cxl_register_map The primary role of @dev is to host the mappings for devm operations. @dev is too ambiguous as a name. I.e. when does @dev refer to the 'struct device *' instance that the registers belong, and when does @dev refer to the 'struct device *' instance hosting the mapping for devm operations? Clarify the role of @dev in cxl_register_map by renaming it to @host. Also, rename local variables to 'host' where map->host is used. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8f0220af |
|
15-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
Revert "cxl/port: Enable the HDM decoder capability for switch ports" commit eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports") ...was added on the observation of CXL memory not being accessible after setting up a region on a "cold-plugged" device. A "cold-plugged" CXL device is one that was not present at boot, so platform-firmware/BIOS has no chance to set it up. While it is true that the debug found the enable bit clear in the host-bridge's instance of the global control register (CXL 3.0 8.2.4.19.2 CXL HDM Decoder Global Control Register), that bit is described as: "This bit is only applicable to CXL.mem devices and shall return 0 on CXL Host Bridges and Upstream Switch Ports." So it is meant to be zero, and further testing confirmed that this "fix" had no effect on the failure. Revert it, and be more vigilant about proposed fixes in the future. Since the original copied stable@, flag this revert for stable@ as well. Cc: <stable@vger.kernel.org> Fixes: eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168685882012.3475336.16733084892658264991.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
cecbb5da |
|
14-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM In preparation for device-memory region creation, arrange for decoders of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their target type. Revisit this if a device ever shows up that wants to offer mixed HDM-H (Host-Only Memory) and HDM-DB support, or an CXL_DEVTYPE_DEVMEM device that supports HDM-H. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261945.3436160.11673393474107374595.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5aa39a91 |
|
14-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTONLYMEM, DEVMEM} In preparation for support for HDM-D and HDM-DB configuration (device-memory, and device-memory with back-invalidate). Rename the current type designators to use HOSTONLYMEM and DEVMEM as a suffix. HDM-DB can be supported by devices that are not accelerators, so DEVMEM is a more generic term for that case. Fixup one location where this type value was open coded. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261369.3436160.7042443847605280593.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
688baac1 |
|
14-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output The @map parameter to cxl_probe_X_registers() is filled in with the mapping parameters of the register block. The @map parameter to cxl_map_X_registers() only reads that information to perform the mapping. Mark @map const for cxl_map_X_registers() to clarify that it is only an input to those helpers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679258103.3436160.4941603739448763855.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2ab47045 |
|
16-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Flag partially torn down regions as unusable cxl_region_decode_reset() walks all the decoders associated with a given region and disables them. Due to decoder ordering rules it is possible that a switch in the topology notices that a given decoder can not be shutdown before another region with a higher HPA is shutdown first. That can leave the region in a partially committed state. Capture that state in a new CXL_REGION_F_NEEDS_RESET flag and require that a successful cxl_region_decode_reset() attempt must be completed before cxl_region_probe() accepts the region. This is a corollary for the bug that Jonathan identified in "CXL/region : commit reset of out of order region appears to succeed." [1]. Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com [1] Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696507423.3590522.16254212607926684429.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d1257d09 |
|
16-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Move cache invalidation before region teardown, and before setup Vikram raised a concern with the theoretical case of a CPU sending MemClnEvict to a device that is not prepared to receive. MemClnEvict is a message that is sent after a CPU has taken ownership of a cacheline from accelerator memory (HDM-DB). In the case of hotplug or HDM decoder reconfiguration it is possible that the CPU is holding old contents for a new device that has taken over the physical address range being cached by the CPU. To avoid this scenario, invalidate caches prior to tearing down an HDM decoder configuration. Now, this poses another problem that it is possible for something to speculate into that space while the decode configuration is still up, so to close that gap also invalidate prior to establish new contents behind a given physical address range. With this change the cache invalidation is now explicit and need not be checked in cxl_region_probe(), and that obviates the need for CXL_REGION_F_INCOHERENT. Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Fixes: d18bc74aced6 ("cxl/region: Manage CPU caches relative to DPA invalidation events") Reported-by: Vikram Sethi <vsethi@nvidia.com> Closes: http://lore.kernel.org/r/BYAPR12MB33364B5EB908BF7239BB996BBD53A@BYAPR12MB3336.namprd12.prod.outlook.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696506886.3590522.4597053660991916591.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5d2ffbe4 |
|
22-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport Same as for ports, also store the downstream port's Component Register mappings, use struct cxl_dport for that. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-16-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
19ab69a6 |
|
22-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl/port: Store the port's Component Register mappings in struct cxl_port CXL capabilities are stored in the Component Registers. To use them, the specific I/O ranges of the capabilities must be determined by probing the registers. For this, the whole Component Register range needs to be mapped temporarily to detect the offset and length of a capability range. In order to use more than one capability of a component (e.g. RAS and HDM) the Component Register are probed and its mappings created multiple times. This also causes overlapping I/O ranges as the whole Component Register range must be mapped again while a capability's I/O range is already mapped. Different capabilities cannot be setup at the same time. E.g. the RAS capability must be made available as soon as the PCI driver is bound, the HDM decoder is setup later during port enumeration. Moreover, during early setup it is still unknown if a certain capability is needed. A central capability setup is therefore not possible, capabilities must be individually enabled once needed during initialization. To avoid a duplicate register probe and overlapping I/O mappings, only probe the Component Registers one time and store the Component Register mapping in struct port. The stored mappings can be used later to iomap the capability register range when enabling the capability, which will be implemented in a follow-on patch. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-15-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
733b57f2 |
|
22-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl/pci: Early setup RCH dport component registers from RCRB CXL RAS capabilities must be enabled and accessible as soon as the CXL endpoint is detected in the PCI hierarchy and bound to the cxl_pci driver. This needs to be independent of other modules such as cxl_port or cxl_mem. CXL RAS capabilities reside in the Component Registers. For an RCH this is determined by probing RCRB which is implemented very late once the CXL Memory Device is created. Change this by moving the RCRB probe to the cxl_pci driver. Do this by using a new introduced function cxl_pci_find_port() similar to cxl_mem_find_port() to determine the involved dport by the endpoint's PCI handle. Plug this into the existing cxl_pci_setup_regs() function to setup Component Registers. Probe the RCRB in case the Component Registers cannot be located through the CXL Register Locator capability. This unifies code and early sets up the Component Registers at the same time for both, VH and RCH mode. Only the cxl_pci driver is involved for this. This allows an early mapping of the CXL RAS capability registers. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-14-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d8bffff2 |
|
22-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl/port: Remove Component Register base address from struct cxl_dport The Component Register base address @component_reg_phys is no longer used after the rework of the Component Register setup which now uses struct member @comp_map instead. Remove the base address. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-11-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d076bb8c |
|
22-Jun-2023 |
Terry Bowman <terry.bowman@amd.com> |
cxl/pci: Refactor component register discovery for reuse The endpoint implements component register setup code. Refactor it for reuse with RCRB, downstream port, and upstream port setup. Move PCI specifics from cxl_setup_regs() into cxl_pci_setup_regs(). Move cxl_setup_regs() into cxl/core/regs.c and export it. This also includes supporting static functions cxl_map_registerblock(), cxl_unmap_register_block() and cxl_probe_regs(). Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-8-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
57340804 |
|
22-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl/core/regs: Add @dev to cxl_register_map The corresponding device of a register mapping is used for devm operations and logging. For operations with struct cxl_register_map the device needs to be kept track separately. To simpify the involved function interfaces, add @dev to cxl_register_map. While at it also reorder function arguments of cxl_map_device_regs() and cxl_map_component_regs() to have the object @cxl_register_map first. As a result a bunch of functions are available to be used with a @cxl_register_map object. This patch is in preparation of reworking the component register setup code. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-7-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7481653d |
|
22-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl: Rename 'uport' to 'uport_dev' For symmetry with the recent rename of ->dport_dev for a 'struct cxl_dport', add the "_dev" suffix to the ->uport property of a 'struct cxl_port'. These devices represent the downstream-port-device and upstream-port-device respectively in the CXL/PCIe topology. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-6-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
227db574 |
|
22-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl: Rename member @dport of struct cxl_dport to @dport_dev Reading code like dport->dport does not immediately suggest that this points to the corresponding device structure of the dport. Rename struct member @dport to @dport_dev. While at it, also rename @new argument of add_dport() to @dport. This better describes the variable as a dport (e.g. new->dport becomes to dport->dport_dev). Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-5-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
06193378 |
|
22-Jun-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/rch: Prepare for caching the MMIO mapped PCIe AER capability Prepare cxl_probe_rcrb() for retrieving more than just the component register block. The RCH AER handling code wants to get back to the AER capability that happens to be MMIO mapped rather then configuration cycles. Move RCRB specific downstream port data, like the RCRB base and the AER capability offset, into its own data structure ('struct cxl_rcrb_info') for cxl_probe_rcrb() to fill. Extend 'struct cxl_dport' to include a 'struct cxl_rcrb_info' attribute. This centralizes all RCRB scanning in one routine. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-4-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
eb4663b0 |
|
25-Jun-2023 |
Robert Richter <rrichter@amd.com> |
cxl/acpi: Probe RCRB later during RCH downstream port creation The RCRB is extracted already during ACPI CEDT table parsing while the data of this is needed not earlier than dport creation. This implementation comes with drawbacks: During ACPI table scan there is already MMIO access including mapping and unmapping, but only ACPI data should be collected here. The collected data must be transferred through a couple of interfaces until it is finally consumed when creating the dport. This causes complex data structures and function interfaces. Additionally, RCRB parsing will be extended to also extract AER data, it would be much easier do this at a later point during port and dport creation when the data structures are available to hold that data. To simplify all that, probe the RCRB at a later point during RCH downstream port creation. Change ACPI table parser to only extract the base address of either the component registers or the RCRB. Parse and extract the RCRB in devm_cxl_add_rch_dport(). This is in preparation to centralize all RCRB scanning. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-2-terry.bowman@amd.com Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20230622205523.85375-3-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
1ad3f701 |
|
26-May-2023 |
Jonathan Cameron <Jonathan.Cameron@huawei.com> |
cxl/pci: Find and register CXL PMU devices CXL PMU devices can be found from entries in the Register Locator DVSEC. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230526095824.16336-4-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d717d7f3 |
|
26-May-2023 |
Jonathan Cameron <Jonathan.Cameron@huawei.com> |
cxl: Add functions to get an instance of / count regblocks of a given type Until the recently release CXL 3.0 specification, there was only ever one instance of any given register block pointed to by the Register Block Locator DVSEC. Now, the specification allows for multiple CXL PMU instances, each with their own register block. To enable this add cxl_find_regblock_instance() that takes an index parameter and use that to implement cxl_count_regblock() and cxl_find_regblock(). Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230526095824.16336-3-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
ccadf131 |
|
23-May-2023 |
Davidlohr Bueso <dave@stgolabs.net> |
cxl/mbox: Add background cmd handling machinery This adds support for handling background operations, as defined in the CXL 3.0 spec. Commands that can take too long (over ~2 seconds) can run in the background asynchronously (to the hardware). The driver will deal with such commands synchronously, blocking all other incoming commands for a specified period of time, allowing time-slicing the command such that the caller can send incremental requests to avoid monopolizing the driver/device. Any out of sync (timeout) between the driver and hardware is just disregarded as an invalid state until the next successful submission. Such timeouts are considered a rare occurrence, either a real device problem or a driver issue that needs to reduce the size of the background operation to fit the timeout. On devices where mbox interrupts are supported, this will still use a poller that will wakeup in the specified wait intervals. The irq handler will simply awake the blocked cmd, which is also safe vs a task that is either waking (timing out) or already awoken. Similarly any irq setup error during the probing falls back to polling, thus avoids unnecessarily erroring out. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230523170927.20685-5-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
eb0764b8 |
|
17-May-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Enable the HDM decoder capability for switch ports Derick noticed, when testing hot plug, that hot-add behaves nominally after a removal. However, if the hot-add is done without a prior removal, CXL.mem accesses fail. It turns out that the original implementation of the port driver and region programming wrongly assumed that platform-firmware always enables the host-bridge HDM decoder capability. Add support turning on switch-level HDM decoders in the case where platform-firmware has not. The implementation is careful to only arrange for the enable to be undone if the current instance of the driver was the one that did the enable. This is to interoperate with platform-firmware that may expect CXL.mem to remain active after the driver is shutdown. This comes at the cost of potentially not shutting down the enable on kexec flows, but it is mitigated by the fact that the related HDM decoders still need to be enabled on an individual basis. Cc: <stable@vger.kernel.org> Reported-by: Derick Marks <derick.w.marks@intel.com> Fixes: 54cdbf845cf7 ("cxl/port: Add a driver for 'struct cxl_port' objects") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/168437998331.403037.15719879757678389217.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d35b495d |
|
03-Apr-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Fix find_cxl_root() for RCDs and simplify it The find_cxl_root() helper is used to lookup root decoders and other CXL platform topology information for a given endpoint. It turns out that for RCDs it has never worked. The result of find_cxl_root(&cxlmd->dev) is always NULL for the RCH topology case because it expects to find a cxl_port at the host-bridge. RCH topologies only have the root cxl_port object with the host-bridge as a dport. While there are no reports of this being a problem to date, by inspection region enumeration should crash as a result of this problem, and it does in a local unit test for this scenario. However, an observation that ever since: commit f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") ...all callers of find_cxl_root() occur after the memdev connection to the port topology has been established. That means that find_cxl_root() can be simplified to a walk of the endpoint port topology to the root. Switch to that arrangement which also fixes the RCD bug. Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002857715.50647.344876437247313909.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b70c2cf9 |
|
03-Apr-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/hdm: Skip emulation when driver manages mem_enable If the driver is allowed to enable memory operation itself then it can also turn on HDM decoder support at will. With this the second call to cxl_setup_hdm_decoder_from_dvsec(), when an HDM decoder is not committed, is not needed. Fixes: b777e9bec960 ("cxl/hdm: Emulate HDM decoder from DVSEC range registers") Link: http://lore.kernel.org/r/20230220113657.000042e1@huawei.com Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167703068474.185722.664126485486344246.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4474ce56 |
|
14-Feb-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders CXL rev3 spec 8.1.3 RCDs may not have HDM register blocks. Create a fake HDM with information from the CXL PCIe DVSEC registers. The decoder count will be set to the HDM count retrieved from the DVSEC cap register. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640368994.935665.15831225724059704620.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b777e9be |
|
14-Feb-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl/hdm: Emulate HDM decoder from DVSEC range registers In the case where HDM decoder register block exists but is not programmed and at the same time the DVSEC range register range is active, populate the CXL decoder object 'cxl_decoder' with info from DVSEC range registers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640368454.935665.13806415120298330717.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
59c3368b |
|
14-Feb-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl/port: Export cxl_dvsec_rr_decode() to cxl_port Call cxl_dvsec_rr_decode() in the beginning of cxl_port_probe() and preserve the decoded information in a local 'struct cxl_endpoint_dvsec_info'. This info can be passed to various functions later on in order to support the HDM decoder emulation. The invocation of cxl_dvsec_rr_decode() in cxl_hdm_decode_init() is removed and a pointer to the 'struct cxl_endpoint_dvsec_info' is passed in. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640367377.935665.2848747799651019676.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
248529ed |
|
14-Feb-2023 |
Dave Jiang <dave.jiang@intel.com> |
cxl: add RAS status unmasking for CXL By default the CXL RAS mask registers bits are defaulted to 1's and suppress all error reporting. If the kernel has negotiated ownership of error handling for CXL then unmask the mask registers by writing 0s. PCI_EXP_DEVCTL capability is checked to see uncorrectable or correctable errors bits are set before unmasking the respective errors. Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_regs.h Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167639402301.778884.12556849214955646539.stgit@djiang5-mobl3.local Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
09d09e04 |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/dax: Create dax devices for CXL RAM regions While platform firmware takes some responsibility for mapping the RAM capacity of CXL devices present at boot, the OS is responsible for mapping the remainder and hot-added devices. Platform firmware is also responsible for identifying the platform general purpose memory pool, typically DDR attached DRAM, and arranging for the remainder to be 'Soft Reserved'. That reservation allows the CXL subsystem to route the memory to core-mm via memory-hotplug (dax_kmem), or leave it for dedicated access (device-dax). The new 'struct cxl_dax_region' object allows for a CXL memory resource (region) to be published, but also allow for udev and module policy to act on that event. It also prevents cxl_core.ko from having a module loading dependency on any drivers/dax/ modules. Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167602003896.1924368.10335442077318970468.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3d8f7cca |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
tools/testing/cxl: Define a fixed volatile configuration to parse Take two endpoints attached to the first switch on the first host-bridge in the cxl_test topology and define a pre-initialized region. This is a x2 interleave underneath a x1 CXL Window. $ modprobe cxl_test $ # cxl list -Ru { "region":"region3", "resource":"0xf010000000", "size":"512.00 MiB (536.87 MB)", "interleave_ways":2, "interleave_granularity":4096, "decode_state":"commit" } Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a32320b7 |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Add region autodiscovery Region autodiscovery is an asynchronous state machine advanced by cxl_port_probe(). After the decoders on an endpoint port are enumerated they are scanned for actively enabled instances. Each active decoder is flagged for auto-assembly CXL_DECODER_F_AUTO and attached to a region. If a region does not already exist for the address range setting of the decoder one is created. That creation process may race with other decoders of the same region being discovered since cxl_port_probe() is asynchronous. A new 'struct cxl_root_decoder' lock, @range_lock, is introduced to mitigate that race. Once all decoders have arrived, "p->nr_targets == p->interleave_ways", they are sorted by their relative decode position. The sort algorithm involves finding the point in the cxl_port topology where one leg of the decode leads to deviceA and the other deviceB. At that point in the topology the target order in the 'struct cxl_switch_decoder' indicates the relative position of those endpoint decoders in the region. >From that point the region goes through the same setup and validation steps as user-created regions, but instead of programming the decoders it validates that driver would have written the same values to the decoders as were already present. Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167601999958.1924368.9366954455835735048.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7d505f98 |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Add a mode attribute for regions In preparation for a new region type, "ram" regions, add a mode attribute to clarify the mode of the decoders that can be added to a region. Share the internals of mode_show() (for decoders) with the region case. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601993930.1924368.4305018565539515665.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a49aa814 |
|
17-Jan-2023 |
Davidlohr Bueso <dave@stgolabs.net> |
cxl/mem: Wire up event interrupts Currently the only CXL features targeted for irq support require their message numbers to be within the first 16 entries. The device may however support less than 16 entries depending on the support it provides. Attempt to allocate these 16 irq vectors. If the device supports less then the PCI infrastructure will allocate that number. Upon successful allocation, users can plug in their respective isr at any point thereafter. CXL device events are signaled via interrupts. Each event log may have a different interrupt message number. These message numbers are reported in the Get Event Interrupt Policy mailbox command. Add interrupt support for event logs. Interrupts are allocated as shared interrupts. Therefore, all or some event logs can share the same message number. In addition all logs are queried on any interrupt in order of the most to least severe based on the status register. Finally place all event configuration logic into cxl_event_config(). Previously the logic was a simple 'read all' on start up. But interrupts must be configured prior to any reads to ensure no events are missed. A single event configuration function results in a cleaner over all implementation. Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-2-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6ebe28f9 |
|
17-Jan-2023 |
Ira Weiny <ira.weiny@intel.com> |
cxl/mem: Read, trace, and clear events on driver load CXL devices have multiple event logs which can be queried for CXL event records. Devices are required to support the storage of at least one event record in each event log type. Devices track event log overflow by incrementing a counter and tracking the time of the first and last overflow event seen. Software queries events via the Get Event Record mailbox command; CXL rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section 8.2.9.2.3 Clear Event Records mailbox command. If the result of negotiating CXL Error Reporting Control is OS control, read and clear all event logs on driver load. Ensure a clean slate of events by reading and clearing the events on driver load. The status register is not used because a device may continue to trigger events and the only requirement is to empty the log at least once. This allows for the required transition from empty to non-empty for interrupt generation. Handling of interrupts is in a follow on patch. The device can return up to 1MB worth of event records per query. Allocate a shared large buffer to handle the max number of records based on the mailbox payload size. This patch traces a raw event record and leaves specific event record type tracing to subsequent patches. Macros are created to aid in tracing the common CXL Event header fields. Each record is cleared explicitly. A clear all bit is specified but is only valid when the log overflows. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-1-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4a20bc3e |
|
08-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pci: Move tracepoint definitions to drivers/cxl/core/ CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This also organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders, however that would also entail a move away from the expander-specific cxl_dev_state context, save that for later. Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system, however, it is unlikely that a single platform will ever load both drivers simultaneously. Cc: Steven Rostedt <rostedt@goodmis.org> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2a81ada3 |
|
10-Jan-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
driver core: make struct bus_type.uevent() take a const * The uevent() callback in struct bus_type should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
c99b2e8c |
|
05-Dec-2022 |
Dave Jiang <dave.jiang@intel.com> |
cxl: update names for interleave ways conversion macros Change names for interleave ways macros to clearly indicate which variable is encoded and which is the actual ways value. ways == interleave ways eiw == encoded interleave ways Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167027516228.3124679.11265039496968588580.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
83351ddb |
|
05-Dec-2022 |
Dave Jiang <dave.jiang@intel.com> |
cxl: update names for interleave granularity conversion macros Change names for granularity macros to clearly indicate which variable is encoded and which is the actual granularity. granularity == interleave granularity eig == encoded interleave granularity Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167027493237.3124429.8948852388671827664.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7592d935 |
|
01-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem tl;dr: Clean up an unnecessary export and enable cxl_test. An RCD (Restricted CXL Device), in contrast to a typical CXL device in a VH topology, obtains its component registers from the bottom half of the associated CXL host bridge RCRB (Root Complex Register Block). In turn this means that cxl_rcrb_to_component() needs to be called from devm_cxl_add_endpoint(). Presently devm_cxl_add_endpoint() is part of the CXL core, but the only user is the CXL mem module. Move it from cxl_core to cxl_mem to not only get rid of an unnecessary export, but to also enable its call out to cxl_rcrb_to_component(), in a subsequent patch, to be mocked by cxl_test. Recall that cxl_test can only mock exported symbols, and since cxl_rcrb_to_component() is itself inside the core, all callers must be outside of cxl_core to allow cxl_test to mock it. Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993045072.1882361.13944923741276843683.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
f9db85bf |
|
30-Nov-2022 |
Alison Schofield <alison.schofield@intel.com> |
cxl/acpi: Support CXL XOR Interleave Math (CXIMS) When the CFMWS is using XOR math, parse the corresponding CXIMS structure and store the xormaps in the root decoder structure. Use the xormaps in a new lookup, cxl_hb_xor(), to find a targets entry in the host bridge interleave target list. Defined in CXL Specfication 3.0 Section: 9.17.1 Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/5794813acdf7b67cfba3609c6aaff46932fa38d0.1669847017.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2905cb52 |
|
29-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pci: Add (hopeful) error handling support Add nominal error handling that tears down CXL.mem in response to error notifications that imply a device reset. Given some CXL.mem may be operating as System RAM, there is a high likelihood that these error events are fatal. However, if the system survives the notification the expectation is that the driver behavior is equivalent to a hot-unplug and re-plug of an endpoint. Note that this does not change the mask values from the default. That awaits CXL _OSC support to determine whether platform firmware is in control of the mask registers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974413966.1608150.15522782911404473932.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
bd09626b |
|
29-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pci: Find and map the RAS Capability Structure The RAS Capability Structure has some ancillary information that may be relevant with respect to AER events, link and protcol error status registers. Map the RAS Capability Registers in support of defining a 'struct pci_error_handlers' instance for the cxl_pci driver. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974412803.1608150.7096566580400947001.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a1554e9c |
|
29-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pci: Prepare for mapping RAS Capability Structure The RAS Capabilitiy Structure is a CXL Component register capability block. Unlike the HDM Decoder Capability, it will be referenced by the cxl_pci driver in response to PCIe AER events. Due to this it is no longer the case that cxl_map_component_regs() can assume that it should map all component registers. Plumb a bitmask of capability ids to map through cxl_map_component_regs(). For symmetry cxl_probe_device_regs() is updated to populate @id in 'struct cxl_reg_map' even though cxl_map_device_regs() does not have a need to map a subset of the device registers per caller. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974412214.1608150.11487843455070795378.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6c7f4f1e |
|
29-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core/regs: Make cxl_map_{component, device}_regs() device generic There is no need to carry the barno and the block offset through the stack, just convert them to a resource base immediately. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974411035.1608150.8605988708101648442.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d5b1a271 |
|
03-Dec-2022 |
Robert Richter <rrichter@amd.com> |
cxl/acpi: Extract component registers of restricted hosts from RCRB A downstream port must be connected to a component register block. For restricted hosts the base address is determined from the RCRB. The RCRB is provided by the host's CEDT CHBS entry. Rework CEDT parser to get the RCRB and add code to extract the component register block from it. RCRB's BAR[0..1] point to the component block containing CXL subsystem component registers. MEMBAR extraction follows the PCI base spec here, esp. 64 bit extraction and memory range alignment (6.0, 7.5.1.2.1). The RCRB base address is cached in the cxl_dport per-host bridge so that the upstream port component registers can be retrieved later by an RCD (RCIEP) associated with the host bridge. Note: Right now the component register block is used for HDM decoder capability only which is optional for RCDs. If unsupported by the RCD, the HDM init will fail. It is future work to bypass it in this case. Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/Y4dsGZ24aJlxSfI1@rric.localdomain [djbw: introduce devm_cxl_add_rch_dport()] Link: https://lore.kernel.org/r/166993044524.1882361.2539922887413208807.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d18bc74a |
|
01-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Manage CPU caches relative to DPA invalidation events A "DPA invalidation event" is any scenario where the contents of a DPA (Device Physical Address) is modified in a way that is incoherent with CPU caches, or if the HPA (Host Physical Address) to DPA association changes due to a remapping event. PMEM security events like Unlock and Passphrase Secure Erase already manage caches through LIBNVDIMM, so that leaves HPA to DPA remap events that need cache management by the CXL core. Those only happen when the boot time CXL configuration has changed. That event occurs when userspace attaches an endpoint decoder to a region configuration, and that region is subsequently activated. The implications of not invalidating caches between remap events is that reads from the region at different points in time may return different results due to stale cached data from the previous HPA to DPA mapping. Without a guarantee that the region contents after cxl_region_probe() are written before being read (a layering-violation assumption that cxl_region_probe() can not make) the CXL subsystem needs to ensure that reads that precede writes see consistent results. A CONFIG_CXL_REGION_INVALIDATION_TEST option is added to support debug and unit testing of the CXL implementation in QEMU or other environments where cpu_cache_has_invalidate_memregion() returns false. This may prove too restrictive for QEMU where the HDM decoders are emulated, but in that case the CXL subsystem needs some new mechanism / indication that the HDM decoder is emulated and not a passthrough of real hardware. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166993222098.1995348.16604163596374520890.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b5807c80 |
|
01-Dec-2022 |
Dave Jiang <dave.jiang@intel.com> |
cxl: add dimm_id support for __nvdimm_create() Set the cxlds->serial as the dimm_id to be fed to __nvdimm_create(). The security code uses that as the key description for the security key of the memory device. The nvdimm unlock code cannot find the respective key without the dimm_id. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166863357043.80269.4337575149671383294.stgit@djiang5-desk3.ch.intel.com Link: https://lore.kernel.org/r/166983620459.2734609.10175456773200251184.stgit@djiang5-desk3.ch.intel.com Link: https://lore.kernel.org/r/166993219918.1995348.10786511454826454601.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4029c32f |
|
01-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/acpi: Move rescan to the workqueue Now that the cxl_mem driver has a need to take the root device lock, the cxl_bus_rescan() needs to run outside of the root lock context. That need arises from RCH topologies and the locking that the cxl_mem driver does to attach a descendant to an upstream port. In the RCH case the lock needed is the CXL root device lock [1]. Link: http://lore.kernel.org/r/166993045621.1882361.1730100141527044744.stgit@dwillia2-xfh.jf.intel.com [1] Tested-by: Robert Richter <rrichter@amd.com> Link: http://lore.kernel.org/r/166993042884.1882361.5633723613683058881.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
03ff079a |
|
01-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Remove the cxl_pmem_wq and related infrastructure Now that cxl_nvdimm and cxl_pmem_region objects are torn down sychronously with the removal of either the bridge, or an endpoint, the cxl_pmem_wq infrastructure can be jettisoned. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993042335.1882361.17022872468068436287.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
f17b558d |
|
01-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Refactor nvdimm device registration, delete the workqueue The three objects 'struct cxl_nvdimm_bridge', 'struct cxl_nvdimm', and 'struct cxl_pmem_region' manage CXL persistent memory resources. The bridge represents base platform resources, the nvdimm represents one or more endpoints, and the region is a collection of nvdimms that contribute to an assembled address range. Their relationship is such that a region is torn down if any component endpoints are removed. All regions and endpoints are torn down if the foundational bridge device goes down. A workqueue was deployed to manage these interdependencies, but it is difficult to reason about, and fragile. A recent attempt to take the CXL root device lock in the cxl_mem driver was reported by lockdep as colliding with the flush_work() in the cxl_pmem flows. Instead of the workqueue, arrange for all pmem/nvdimm devices to be torn down immediately and hierarchically. A similar change is made to both the 'cxl_nvdimm' and 'cxl_pmem_region' objects. For bisect-ability both changes are made in the same patch which unfortunately makes the patch bigger than desired. Arrange for cxl_memdev and cxl_region to register a cxl_nvdimm and cxl_pmem_region as a devres release action of the bridge device. Additionally, include a devres release action of the cxl_memdev or cxl_region device that triggers the bridge's release action if an endpoint exits before the bridge. I.e. this allows either unplugging the bridge, or unplugging and endpoint to result in the same cleanup actions. To keep the patch smaller the cleanup of the now defunct workqueue infrastructure is saved for a follow-on patch. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993041773.1882361.16444301376147207609.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
16d53cb0 |
|
01-Dec-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Drop redundant pmem region release handling Now that a cxl_nvdimm object can only experience ->remove() via an unregistration event (because the cxl_nvdimm bind attributes are suppressed), additional cleanups are possible. It is already the case that the removal of a cxl_memdev object triggers ->remove() on any associated region. With that mechanism in place there is no need for the cxl_nvdimm removal to trigger the same. Just rely on cxl_region_detach() to tear down the whole cxl_pmem_region. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993041215.1882361.6321535567798911286.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3b39fd6c |
|
29-Aug-2022 |
Adam Manzanares <a.manzanares@samsung.com> |
cxl: Replace HDM decoder granularity magic numbers When reviewing the CFMWS parsing code that deals with the HDM decoders, I noticed a couple of magic numbers. This commit replaces these magic numbers with constants defined by the CXL 3.0 specification. v2: - Change references to CXL 3.0 specification (David) - CXL_DECODER_MAX_GRANULARITY_ORDER -> CXL_DECODER_MAX_ENCODED_IG (Dan) Signed-off-by: Adam Manzanares <a.manzanares@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220829220249.243888-1-a.manzanares@samsung.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
fa89248e |
|
18-Oct-2022 |
Robert Richter <rrichter@amd.com> |
cxl/core: Remove duplicate declaration of devm_cxl_iomap_block() The function devm_cxl_iomap_block() is only used in the core code. There are two declarations in header files of it, in drivers/cxl/core/core.h and drivers/cxl/cxl.h. Remove its unused declaration in drivers/cxl/cxl.h. Fixing build error in regs.c found by kernel test robot by including "core.h" there. Signed-off-by: Robert Richter <rrichter@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20221018132341.76259-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e4f6dfa9 |
|
03-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Fix 'distance' calculation with passthrough ports When programming port decode targets, the algorithm wants to ensure that two devices are compatible to be programmed as peers beneath a given port. A compatible peer is a target that shares the same dport, and where that target's interleave position also routes it to the same dport. Compatibility is determined by the device's interleave position being >= to distance. For example, if a given dport can only map every Nth position then positions less than N away from the last target programmed are incompatible. The @distance for the host-bridge's cxl_port in a simple dual-ported host-bridge configuration with 2 direct-attached devices is 1, i.e. An x2 region divided by 2 dports to reach 2 region targets. An x4 region under an x2 host-bridge would need 2 intervening switches where the @distance at the host bridge level is 2 (x4 region divided by 2 switches to reach 4 devices). However, the distance between peers underneath a single ported host-bridge is always zero because there is no limit to the number of devices that can be mapped. In other words, there are no decoders to program in a passthrough, all descendants are mapped and distance only starts matters for the intervening descendant ports of the passthrough port. Add tracking for the number of dports mapped to a port, and use that to detect the passthrough case for calculating @distance. Cc: <stable@vger.kernel.org> Reported-by: Bobo WL <lmw.bobo@gmail.com> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752185440.947915.6617495912508299445.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4d07ae22 |
|
03-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak When a cxl_nvdimm object goes through a ->remove() event (device physically removed, nvdimm-bridge disabled, or nvdimm device disabled), then any associated regions must also be disabled. As highlighted by the cxl-create-region.sh test [1], a single device may host multiple regions, but the driver was only tracking one region at a time. This leads to a situation where only the last enabled region per nvdimm device is cleaned up properly. Other regions are leaked, and this also causes cxl_memdev reference leaks. Fix the tracking by allowing cxl_nvdimm objects to track multiple region associations. Cc: <stable@vger.kernel.org> Link: https://github.com/pmem/ndctl/blob/main/test/cxl-create-region.sh [1] Reported-by: Vishal Verma <vishal.l.verma@intel.com> Fixes: 04ad63f086d1 ("cxl/region: Introduce cxl_pmem_region objects") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752183647.947915.2045230911503793901.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
038e6eb8 |
|
04-Aug-2022 |
Bagas Sanjaya <bagasdotme@gmail.com> |
cxl/region: describe targets and nr_targets members of cxl_region_params Sphinx reported undescribed parameters in cxl_region_params struct: ./drivers/cxl/cxl.h:376: warning: Function parameter or member 'targets' not described in 'cxl_region_params' ./drivers/cxl/cxl.h:376: warning: Function parameter or member 'nr_targets' not described in 'cxl_region_params' Describe these members. Fixes: b9686e8c8e39 ("cxl/region: Enable the assignment of endpoint decoders to regions") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220804075448.98241-3-bagasdotme@gmail.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e7748305 |
|
22-Jul-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/acpi: Minimize granularity for x1 interleaves The kernel enforces that region granularity is >= to the top-level interleave-granularity for the given CXL window. However, when the CXL window interleave is x1, i.e. non-interleaved at the host bridge level, then the specified granularity does not matter. Override the window specified granularity to the CXL minimum so that any valid region granularity is >= to the root granularity. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/165853776917.2430596.16823264262010844458.stgit@dwillia2-xfh.jf.intel.com [djbw: add CXL_DECODER_MIN_GRANULARITY per vishal] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
c7e3548c |
|
01-Aug-2022 |
Dan Carpenter <dan.carpenter@oracle.com> |
cxl/region: prevent underflow in ways_to_cxl() The "ways" variable comes from the user. The ways_to_cxl() function has an upper bound but it doesn't check for negatives. Make the "ways" variable an unsigned int to fix this bug. Fixes: 80d10a6cee05 ("cxl/region: Add interleave geometry attributes") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Yueo3NV2hFCXx1iV@kili [djbw: fixup interleave_ways_store() to only accept unsigned input] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
04ad63f0 |
|
11-Jan-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Introduce cxl_pmem_region objects The LIBNVDIMM subsystem is a platform agnostic representation of system NVDIMM / persistent memory resources. To date, the CXL subsystem's interaction with LIBNVDIMM has been to register an nvdimm-bridge device and cxl_nvdimm objects to proxy CXL capabilities into existing LIBNVDIMM subsystem mechanics. With regions the approach is the same. Create a new cxl_pmem_region object to proxy CXL region details into a LIBNVDIMM definition. With this enabling LIBNVDIMM can partition CXL persistent memory regions with legacy namespace labels. A follow-on patch will add CXL region label and CXL namespace label support to persist region configurations across driver reload / system-reset events. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784340111.1758207.3036498385188290968.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
99183d26 |
|
10-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Fix offline_nvdimm_bus() to offline by bridge Be careful to only disable cxl_pmem objects related to a given cxl_nvdimm_bridge. Otherwise, offline_nvdimm_bus() reaches across CXL domains and disables more than is expected. Fixes: 21083f51521f ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784339569.1758207.1557084545278004577.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8d48817d |
|
15-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Add region driver boiler plate The CXL region driver is responsible for routing fully formed CXL regions to one of libnvdimm, for persistent memory regions, device-dax for volatile memory regions, or just act as an enumeration placeholder if the region was setup and configuration locked by platform firmware. In the platform-firmware-setup case the expectation is that region is already accounted in the system memory map, i.e. already enabled as "System RAM". For now, just attach to CXL regions in the CXL_CONFIG_COMMIT state, and take no further action. Given this driver is just a small / simple router, include it in the core rather than its own module. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-18-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
176baefb |
|
08-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/hdm: Commit decoder state to hardware After all the soft validation of the region has completed, convey the region configuration to hardware while being careful to commit decoders in specification mandated order. In addition to programming the endpoint decoder base-address, interleave ways and granularity, the switch decoder target lists are also established. While the kernel can enforce spec-mandated commit order, it can not enforce spec-mandated reset order. For example, the kernel can't stop someone from removing an endpoint device that is occupying decoderN in a switch decoder where decoderN+1 is also committed. To reset decoderN, decoderN+1 must be torn down first. That "tear down the world" implementation is saved for a follow-on patch. Callback operations are provided for the 'commit' and 'reset' operations. While those callbacks may prove useful for CXL accelerators (Type-2 devices with memory) the primary motivation is to enable a simple way for cxl_test to intercept those operations. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784338418.1758207.14659830845389904356.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
27b3f8d1 |
|
06-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Program target lists Once the region's interleave geometry (ways, granularity, size) is established and all the endpoint decoder targets are assigned, the next phase is to program all the intermediate decoders. Specifically, each CXL switch in the path between the endpoint and its CXL host-bridge (including the logical switch internal to the host-bridge) needs to have its decoders programmed and the target list order assigned. The difficulty in this implementation lies in determining which endpoint decoder ordering combinations are valid. Consider the cxl_test case of 2 host bridges, each of those host-bridges attached to 2 switches, and each of those switches attached to 2 endpoints for a potential 8-way interleave. The x2 interleave at the host-bridge level requires that all even numbered endpoint decoder positions be located on the "left" hand side of the topology tree, and the odd numbered positions on the other. The endpoints that are peers on the same switch need to have a position that can be routed with a dedicated address bit per-endpoint. See check_last_peer() for the details. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784337827.1758207.132121746122685208.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
384e624b |
|
07-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Attach endpoint decoders CXL regions (interleave sets) are made up of a set of memory devices where each device maps a portion of the interleave with one of its decoders (see CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure). As endpoint decoders are identified by a provisioning tool they can be added to a region provided the region interleave properties are set (way, granularity, HPA) and DPA has been assigned to the decoder. The attach event triggers several validation checks, for example: - is the DPA sized appropriately for the region - is the decoder reachable via the host-bridges identified by the region's root decoder - is the device already active in a different region position slot - are there already regions with a higher HPA active on a given port (per CXL 2.0 8.2.5.12.20 Committing Decoder Programming) ...and the attach event affords an opportunity to collect data and resources relevant to later programming the target lists in switch decoders, for example: - allocate a decoder at each cxl_port in the decode chain - for a given switch port, how many the region's endpoints are hosted through the port - how many unique targets (next hops) does a port need to map to reach those endpoints The act of reconciling this information and deploying it to the decoder configuration is saved for a follow-on patch. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784337277.1758207.4108508181328528703.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6aa41144 |
|
06-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/acpi: Add a host-bridge index lookup mechanism The ACPI CXL Fixed Memory Window Structure (CFMWS) defines multiple methods to determine which host bridge provides access to a given endpoint relative to that device's position in the interleave. The "Interleave Arithmetic" defines either a "standard modulo" / round-random algorithm, or "xormap" based algorithm which can be defined as a non-linear transform. Given that there are already more options beyond "standard modulo" and that "xormap" may turn out to be ACPI CXL specific, provide a callback for the region provisioning code to map endpoint positions back to expected host bridge id (cxl_dport target). For now just support the simple modulo math case and save the xormap for a follow-on change. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-14-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b9686e8c |
|
04-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Enable the assignment of endpoint decoders to regions The region provisioning process involves allocating DPA to a set of endpoint decoders, and HPA plus the region geometry to a region device. Then the decoder is assigned to the region. At this point several validation steps can be performed to validate that the decoder is suitable to participate in the region. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/165784336184.1758207.16403282029203949622.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
23a22cd1 |
|
25-Apr-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/region: Allocate HPA capacity to regions After a region's interleave parameters (ways and granularity) are set, add a way for regions to allocate HPA (host physical address space) from the free capacity in their parent root-decoder. The allocator for this capacity reuses the 'struct resource' based allocator used for CONFIG_DEVICE_PRIVATE. Once the tuple of "ways, granularity, [uuid], and size" is set the region configuration transitions to the CXL_CONFIG_INTERLEAVE_ACTIVE state which is a precursor to allowing endpoint decoders to be added to a region. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784335630.1758207.420216490941955417.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
80d10a6c |
|
25-Apr-2022 |
Ben Widawsky <bwidawsk@kernel.org> |
cxl/region: Add interleave geometry attributes Add ABI to allow the number of devices that comprise a region to be set as well as the interleave granularity for the region. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> [djbw: reword changelog] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-11-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
dd5ba0eb |
|
27-May-2021 |
Ben Widawsky <bwidawsk@kernel.org> |
cxl/region: Add a 'uuid' attribute The process of provisioning a region involves triggering the creation of a new region object, pouring in the configuration, and then binding that configured object to the region driver to start its operation. For persistent memory regions the CXL specification mandates that it identified by a uuid. Add an ABI for userspace to specify a region's uuid. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> [djbw: simplify locking] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784334465.1758207.8224025435884752570.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
779dd20c |
|
08-Jun-2021 |
Ben Widawsky <bwidawsk@kernel.org> |
cxl/region: Add region creation support CXL 2.0 allows for dynamic provisioning of new memory regions (system physical address resources like "System RAM" and "Persistent Memory"). Whereas DDR and PMEM resources are conveyed statically at boot, CXL allows for assembling and instantiating new regions from the available capacity of CXL memory expanders in the system. Sysfs with an "echo $region_name > $create_region_attribute" interface is chosen as the mechanism to initiate the provisioning process. This was chosen over ioctl() and netlink() to keep the configuration interface entirely in a pseudo-fs interface, and it was chosen over configfs since, aside from this one creation event, the interface is read-mostly. I.e. configfs supports cases where an object is designed to be provisioned each boot, like an iSCSI storage target, and CXL region creation is mostly for PMEM regions which are created usually once per-lifetime of a server instance. This is an improvement over nvdimm that pre-created "seed" devices that tended to confuse users looking to determine which devices are active and which are idle. Recall that the major change that CXL brings over previous persistent memory architectures is the ability to dynamically define new regions. Compare that to drivers like 'nfit' where the region configuration is statically defined by platform firmware. Regions are created as a child of a root decoder that encompasses an address space with constraints. When created through sysfs, the root decoder is explicit. When created from an LSA's region structure a root decoder will possibly need to be inferred by the driver. Upon region creation through sysfs, a vacant region is created with a unique name. Regions have a number of attributes that must be configured before the region can be bound to the driver where HDM decoder program is completed. An example of creating a new region: - Allocate a new region name: region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) - Create a new region by name: while region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) ! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region do true; done - Region now exists in sysfs: stat -t /sys/bus/cxl/devices/decoder0.0/$region - Delete the region, and name: echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784333909.1758207.794374602146306032.stgit@dwillia2-xfh.jf.intel.com [djbw: simplify locking, reword changelog] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7f8faf96 |
|
07-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Enumerate port targets before adding endpoints The port scanning algorithm in devm_cxl_enumerate_ports() walks up the topology and adds cxl_port objects starting from the root down to the endpoint. When those ports are initially created they know all their dports, but they do not know the downstream cxl_port instance that represents the next descendant in the topology. Rework create_endpoint() into devm_cxl_add_endpoint() that enumerates the downstream cxl_port topology into each port's 'struct cxl_ep' record for each endpoint it that the port is an ancestor. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-7-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
39178585 |
|
27-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Move dport tracking to an xarray Reduce the complexity and the overhead of walking the topology to determine endpoint connectivity to root decoder interleave configurations. Note that cxl_detach_ep(), after it determines that the last @ep has departed and decides to delete the port, now needs to walk the dport array with the device_lock() held to remove entries. Previously list_splice_init() could be used atomically delete all dport entries at once and then perform entry tear down outside the lock. There is no list_splice_init() equivalent for the xarray. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784331647.1758207.6345820282285119339.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
256d0e9e |
|
27-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Move 'cxl_ep' references to an xarray per port In preparation for region provisioning that needs to walk the topology by endpoints, use an xarray to record endpoint interest in a given port. In addition to being more space and time efficient it also reduces the complexity of the implementation by moving locking internal to the xarray implementation. It also allows for a single cxl_ep reference to be recorded in multiple xarrays. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-2-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
1b58b4ca |
|
27-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Record parent dport when adding ports At the time that cxl_port instances are being created, cache the dport from the parent port that points to this new child port. This will be useful for region provisioning when walking the tree to calculate decoder targets, and saves rewalking the dport list after the fact to build this information. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-1-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
de516b40 |
|
27-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Record dport in endpoint references Recall that the primary role of the cxl_mem driver is to probe if the given endpoint is connected to a CXL port topology. In that process it walks its device ancestry to its PCI root port. If that root port is also a CXL root port then the probe process adds cxl_port object instances at switch in the path between to the root and the endpoint. As those cxl_port instances are added, or if a previous enumeration attempt already created the port, a 'struct cxl_ep' instance is registered with that port to track the endpoints interested in that port. At the time the cxl_ep is registered the downstream egress path from the port to the endpoint is known. Take the opportunity to record that information as it will be needed for dynamic programming of decoder targets during region provisioning. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784329944.1758207.15203961796832072116.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
0c33b393 |
|
24-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/hdm: Track next decoder to allocate The CXL specification enforces that endpoint decoders are committed in hw instance id order. In preparation for adding dynamic DPA allocation, record the hw instance id in endpoint decoders, and enforce allocations to occur in hw instance id order. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784328827.1758207.9627538529944559954.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2c866903 |
|
23-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/hdm: Add 'mode' attribute to decoder objects Recall that the Device Physical Address (DPA) space of a CXL Memory Expander is potentially partitioned into a volatile and persistent portion. A decoder maps a Host Physical Address (HPA) range to a DPA range and that translation depends on the value of all previous (lower instance number) decoders before the current one. In preparation for allowing dynamic provisioning of regions, decoders need an ABI to indicate which DPA partition a decoder targets. This ABI needs to be prepared for the possibility that some other agent committed and locked a decoder that spans the partition boundary. Add 'decoderX.Y/mode' to endpoint decoders that indicates which partition 'ram' / 'pmem' the decoder targets, or 'mixed' if the decoder currently spans the partition boundary. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603881967.551046.6007594190951596439.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
9c57cde0 |
|
21-Jul-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/hdm: Enumerate allocated DPA In preparation for provisioning CXL regions, add accounting for the DPA space consumed by existing regions / decoders. Recall, a CXL region is a memory range comprised from one or more endpoint devices contributing a mapping of their DPA into HPA space through a decoder. Record the DPA ranges covered by committed decoders at initial probe of endpoint ports relative to a per-device resource tree of the DPA type (pmem or volatile-ram). The cxl_dpa_rwsem semaphore is introduced to globally synchronize DPA state across all endpoints and their decoders at once. The vast majority of DPA operations are reads as region creation is expected to be as rare as disk partitioning and volume creation. The device_lock() for this synchronization is specifically avoided for concern of entangling with sysfs attribute removal. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784327682.1758207.7914919426043855876.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3bf65915 |
|
21-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Define a 'struct cxl_endpoint_decoder' Previously the target routing specifics of switch decoders and platform CXL window resource tracking of root decoders were factored out of 'struct cxl_decoder'. While switch decoders translate from SPA to downstream ports, endpoint decoders translate from SPA to DPA. This patch, 3 of 3, adds a 'struct cxl_endpoint_decoder' that tracks an endpoint-specific Device Physical Address (DPA) resource. For now this just defines ->dpa_res, a follow-on patch will handle requesting DPA resource ranges from a device-DPA resource tree. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784327088.1758207.15502834501671201192.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
0f157c7f |
|
12-Jul-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Define a 'struct cxl_root_decoder' Previously the target routing specifics of switch decoders were factored out of 'struct cxl_decoder' into 'struct cxl_switch_decoder'. This patch, 2 of 3, adds a 'struct cxl_root_decoder' as a superset of a switch decoder that also track the associated CXL window platform resource. Note that the reason the resource for a given root decoder needs to be looked up after the fact (i.e. after cxl_parse_cfmws() and add_cxl_resource()) is because add_cxl_resource() may have merged CXL windows in order to keep them at the top of the resource tree / decode hierarchy. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784326541.1758207.9915663937394448341.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e636479e |
|
18-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Define a 'struct cxl_switch_decoder' Currently 'struct cxl_decoder' contains the superset of attributes needed for all decoder types. Before more type-specific attributes are added to the common definition, reorganize 'struct cxl_decoder' into type specific objects. This patch, the first of three, factors out a cxl_switch_decoder type. See the new kdoc for what a 'struct cxl_switch_decoder' represents in a CXL topology. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/165784325340.1758207.5064717153608954960.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
c9700604 |
|
19-Jul-2022 |
Ira Weiny <ira.weiny@intel.com> |
cxl/port: Read CDAT table The per-device CDAT data provides performance data that is relevant for mapping which CXL devices can participate in which CXL ranges by QTG (QoS Throttling Group) (per ECN: CXL 2.0 CEDT CFMWS & QTG_DSM) [1]. The QTG association specified in the ECN is advisory. Until the cxl_acpi driver grows support for invoking the QTG _DSM method the CDAT data is only of interest to userspace that may need it for debug purposes. Search the DOE mailboxes available, query CDAT data, cache the data and make it available via a sysfs binary attribute per endpoint at: /sys/bus/cxl/devices/endpointX/CDAT ...similar to other ACPI-structured table data in /sys/firmware/ACPI/tables. The CDAT is relative to 'struct cxl_port' objects since switches in addition to endpoints can host a CDAT instance. Switch CDAT support is not implemented. This does not support table updates at runtime. It will always provide whatever was there when first cached. It is also the case that table updates are not expected outside of explicit DPA address map affecting commands like Set Partition with the immediate flag set. Given that the driver does not support Set Partition with the immediate flag set there is no current need for update support. Link: https://www.computeexpresslink.org/spec-landing [1] Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> [djbw: drop in-kernel parsing infra for now, and other minor fixups] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220719205249.566684-7-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b060edfd |
|
23-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Delete unused nvdimm attribute While there is a need to go from a LIBNVDIMM 'struct nvdimm' to a CXL 'struct cxl_nvdimm', there is no use case to go the other direction. Likely this is a leftover from an early version of the referenced commit before it implemented devm for releasing the created nvdimm. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-19-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
ee800010 |
|
01-Jun-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Cache CXL host bridge data Region creation has need for checking host-bridge connectivity when adding endpoints to regions. Record, at port creation time, the host-bridge to provide a useful shortcut from any location in the topology to the most-significant ancestor. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-4-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
419af595 |
|
22-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl: Introduce cxl_to_{ways,granularity} Interleave granularity and ways have CXL specification defined encodings. Promote the conversion helpers to a common header, and use them to replace other open-coded instances. Force caller to consider the error case of the conversion similarly to other conversion helpers like kstrto*(). Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603875016.551046.17236943065932132355.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
885d3bed |
|
18-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Drop is_cxl_decoder() This helper was only used to identify the object type for lockdep purposes. Now that lockdep support is done with explicit lock classes, this helper can be dropped. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603874340.551046.15491766127759244728.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e50fe01e |
|
18-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Drop ->platform_res attribute for root decoders Root decoders are responsible for hosting the available host address space for endpoints and regions to claim. The tracking of that available capacity can be done in iomem_resource directly. As a result, root decoders no longer need to host their own resource tree. The current ->platform_res attribute was added prematurely. Otherwise, ->hpa_range fills the role of conveying the current decode range of the decoder. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603873619.551046.791596854070136223.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e8b7ea58 |
|
18-May-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Rename ->decoder_range ->hpa_range In preparation for growing a ->dpa_range attribute for endpoint decoders, rename the current ->decoder_range to the more descriptive ->hpa_range. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603872867.551046.2170426227407458814.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8ae3cebc |
|
04-Mar-2022 |
Ben Widawsky <bwidawsk@kernel.org> |
cxl/core: Use is_endpoint_decoder Save some characters and directly check decoder type rather than port type. There's no need to check if the port is an endpoint port since, by this point, cxl_endpoint_decoder_alloc() has a specified type. Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
38a34e10 |
|
21-Apr-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl: Drop cxl_device_lock() Now that all CXL subsystem locking is validated with custom lock classes, there is no need for the custom usage of the lockdep_mutex. Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165055520383.3745911.53447786039115271.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
9b71e1c9 |
|
02-Feb-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/core/port: Add endpoint decoders Recall that a CXL Port is any object that publishes a CXL HDM Decoder Capability structure. That is Host Bridge and Switches that have been enabled so far. Now, add decoder support to the 'endpoint' CXL Ports registered by the cxl_mem driver. They mostly share the same enumeration as Bridges and Switches, but witout a target list. The target of endpoint decode is device-internal DPA space, not another downstream port. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [djbw: clarify changelog, hookup enumeration in the port driver] Link: https://lore.kernel.org/r/164386092069.765089.14895687988217608642.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8dd2bc0f |
|
04-Feb-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/mem: Add the cxl_mem driver At this point the subsystem can enumerate all CXL ports (CXL.mem decode resources in upstream switch ports and host bridges) in a system. The last mile is connecting those ports to endpoints. The cxl_mem driver connects an endpoint device to the platform CXL.mem protoctol decode-topology. At ->probe() time it walks its device-topology-ancestry and adds a CXL Port object at every Upstream Port hop until it gets to CXL root. The CXL root object is only present after a platform firmware driver registers platform CXL resources. For ACPI based platform this is managed by the ACPI0017 device and the cxl_acpi driver. The ports are registered such that disabling a given port automatically unregisters all descendant ports, and the chain can only be registered after the root is established. Given ACPI device scanning may run asynchronously compared to PCI device scanning the root driver is tasked with rescanning the bus after the root successfully probes. Conversely if any ports in a chain between the root and an endpoint becomes disconnected it subsequently triggers the endpoint to unregister. Given lock depenedencies the endpoint unregistration happens in a workqueue asynchronously. If userspace cares about synchronizing delayed work after port events the /sys/bus/cxl/flush attribute is available for that purpose. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [djbw: clarify changelog, rework hotplug support] Link: https://lore.kernel.org/r/164398782997.903003.9725273241627693186.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2703c16c |
|
04-Feb-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core/port: Add switch port enumeration So far the platorm level CXL resources have been enumerated by the cxl_acpi driver, and cxl_pci has gathered all the pre-requisite information it needs to fire up a cxl_mem driver. However, the first thing the cxl_mem driver will be tasked to do is validate that all the PCIe Switches in its ancestry also have CXL capabilities and an CXL.mem link established. Provide a common mechanism for a CXL.mem endpoint driver to enumerate all the ancestor CXL ports in the topology and validate CXL.mem connectivity. Multiple endpoints may end up racing to establish a shared port in the topology. This race is resolved via taking the device-lock on a parent CXL Port before establishing a new child. The winner of the race establishes the port, the loser simply registers its interest in the port via 'struct cxl_ep' place-holder reference. At endpoint teardown the same parent port lock is taken as 'struct cxl_ep' references are deleted. Last endpoint to drop its reference unregisters the port. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164398731146.902644.1029761300481366248.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
664bf115 |
|
01-Feb-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core/port: Remove @host argument for dport + decoder enumeration Now that dport and decoder enumeration is centralized in the port driver, the @host argument for these helpers can be made implicit. For the root port the host is the port's uport device (ACPI0017 for cxl_acpi), and for all other descendant ports the devm context is the parent of @port. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164375043390.484143.17617734732003230076.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
54cdbf84 |
|
01-Feb-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/port: Add a driver for 'struct cxl_port' objects The need for a CXL port driver and a dedicated cxl_bus_type is driven by a need to simultaneously support 2 independent physical memory decode domains (cache coherent CXL.mem and uncached PCI.mmio) that also intersect at a single PCIe device node. A CXL Port is a device that advertises a CXL Component Register block with an "HDM Decoder Capability Structure". >From Documentation/driver-api/cxl/memory-devices.rst: Similar to how a RAID driver takes disk objects and assembles them into a new logical device, the CXL subsystem is tasked to take PCIe and ACPI objects and assemble them into a CXL.mem decode topology. The need for runtime configuration of the CXL.mem topology is also similar to RAID in that different environments with the same hardware configuration may decide to assemble the topology in contrasting ways. One may choose performance (RAID0) striping memory across multiple Host Bridges and endpoints while another may opt for fault tolerance and disable any striping in the CXL.mem topology. The port driver identifies whether an endpoint Memory Expander is connected to a CXL topology. If an active (bound to the 'cxl_port' driver) CXL Port is not found at every PCIe Switch Upstream port and an active "root" CXL Port then the device is just a plain PCIe endpoint only capable of participating in PCI.mmio and DMA cycles, not CXL.mem coherent interleave sets. The 'cxl_port' driver lets the CXL subsystem leverage driver-core infrastructure for setup and teardown of register resources and communicating device activation status to userspace. The cxl_bus_type can rendezvous the async arrival of platform level CXL resources (via the 'cxl_acpi' driver) with the asynchronous enumeration of Memory Expander endpoints, while also implementing a hierarchical locking model independent of the associated 'struct pci_dev' locking model. The locking for dport and decoder enumeration is now handled in the core rather than callers. For now the port driver only enumerates and registers CXL resources (downstream port metadata and decoder resources) later it will be used to take action on its decoders in response to CXL.mem region provisioning requests. Note1: cxlpci.h has long depended on pci.h, but port.c was the first to not include pci.h. Carry that dependency in cxlpci.h. Note2: cxl port enumeration and probing complicates CXL subsystem init to the point that it helps to have centralized debug logging of probe events in cxl_bus_probe(). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/164374948116.464348.1772618057599155408.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d17d0540 |
|
01-Feb-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core/hdm: Add CXL standard decoder enumeration to the core Unlike the decoder enumeration for "root decoders" described by platform firmware, standard decoders can be enumerated from the component registers space once the base address has been identified (via PCI, ACPI, or another mechanism). Add common infrastructure for HDM (Host-managed-Device-Memory) Decoder enumeration and share it between host-bridge, upstream switch port, and cxl_test defined decoders. The locking model for switch level decoders is to hold the port lock over the enumeration. This facilitates moving the dport and decoder enumeration to a 'port' driver. For now, the only enumerator of decoder resources is the cxl_acpi root driver. Co-developed-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164374688404.395335.9239248252443123526.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
98d2d3a2 |
|
31-Jan-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Generalize dport enumeration in the core The core houses infrastructure for decoder resources. A CXL port's dports are more closely related to decoder infrastructure than topology enumeration. Implement generic PCI based dport enumeration in the core, i.e. arrange for existing root port enumeration from cxl_acpi to share code with switch port enumeration which just amounts to a small difference in a pci_walk_bus() invocation once the appropriate 'struct pci_bus' has been retrieved. Set the convention that decoder objects are registered after all dports are enumerated. This enables userspace to know when the CXL core is finished establishing 'dportX' links underneath the 'portX' object. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164368114191.354031.5270501846455462665.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a46cfc0f |
|
31-Jan-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Introduce a find_cxl_root() helper In preparation for switch port enumeration while also preserving the potential for multi-domain / multi-root CXL topologies. Introduce a 'struct device' generic mechanism for retrieving a root CXL port, if one is registered. Note that the only known multi-domain CXL configurations are running the cxl_test unit test on a system that also publishes an ACPI0017 device. With this in hand the nvdimm-bridge lookup can be with device_find_child() instead of bus_find_device() + custom mocked lookup infrastructure in cxl_test. The mechanism looks for a 2nd level port since the root level topology is platform-firmware specific and the 2nd level down follows standard PCIe topology expectations. The cxl_acpi 2nd level is associated with a PCIe Root Port. Reported-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164367562182.225521.9488555616768096049.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5ff7316f |
|
31-Jan-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Introduce cxl_port_to_pci_bus() Add a helper for converting a PCI enumerated cxl_port into the pci_bus that hosts its dports. For switch ports this is trivial, but for root ports there is no generic way to go from a platform defined host bridge device, like ACPI0016 to its corresponding pci_bus. Rather than spill ACPI goop outside of the cxl_acpi driver, just arrange for it to register an xarray translation from the uport device to the corresponding pci_bus. This is in preparation for centralizing dport enumeration in the core. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164364745633.85488.9744017377155103992.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
86c8ea0f |
|
31-Jan-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core/port: Use dedicated lock for decoder target list Lockdep reports: ====================================================== WARNING: possible circular locking dependency detected 5.16.0-rc1+ #142 Tainted: G OE ------------------------------------------------------ cxl/1220 is trying to acquire lock: ffff979b85475460 (kn->active#144){++++}-{0:0}, at: __kernfs_remove+0x1ab/0x1e0 but task is already holding lock: ffff979b87ab38e8 (&dev->lockdep_mutex#2/4){+.+.}-{3:3}, at: cxl_remove_ep+0x50c/0x5c0 [cxl_core] ...where cxl_remove_ep() is a helper that wants to delete ports while holding a lock on the host device for that port. That sets up a lockdep violation whereby target_list_show() can not rely holding the decoder's device lock while walking the target_list. Switch to a dedicated seqlock for this purpose. Reported-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164367209095.208169.1171673319121271280.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3c5b9039 |
|
31-Jan-2022 |
Dan Williams <dan.j.williams@intel.com> |
cxl: Prove CXL locking When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets an additional mutex that is not clobbered by lockdep_set_novalidate_class() like the typical device_lock(). This allows for local annotation of subsystem locks with mutex_lock_nested() per the subsystem's object/lock hierarchy. For CXL, this primarily needs the ability to lock ports by depth and child objects of ports by their parent parent-port lock. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
53fa1bff |
|
23-Jan-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/core: Track port depth In preparation for proving CXL subsystem usage of the device_lock() order track the depth of ports with the expectation that shallower port locks can be held over deeper port locks. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164298419321.3018233.4469731547378993606.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d54c1bbe |
|
31-Jan-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/core/port: Clarify decoder creation Add wrappers for the creation of decoder objects at the root level and switch level, and keep the core helper private to cxl/core/port.c. Root decoders are static descriptors conveyed from platform firmware (e.g. ACPI CFMWS). Switch decoders are CXL standard decoders enumerated via the HDM decoder capability structure. The base address for the HDM decoder capability structure may be conveyed either by PCIe or platform firmware (ACPI CEDT.CHBS). Additionally, the kdoc descriptions for these helpers and their dependencies is updated. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> [djbw: fixup changelog, clarify kdoc] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164366463014.111117.9714595404002687111.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
608135db |
|
23-Jan-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/core: Convert decoder range to resource CXL decoders manage address ranges in a hierarchical fashion whereby a leaf is a unique subregion of its parent decoder (midlevel or root). It therefore makes sense to use the resource API for handling this. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v1) Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164298417191.3018233.5201055578165414714.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
c57cae78 |
|
23-Jan-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl: Introduce module_cxl_driver Many CXL drivers simply want to register and unregister themselves. module_driver already supported this. A simple wrapper around that reduces a decent amount of boilerplate in upcoming patches. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164298415591.3018233.13608495220547681412.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
303ebc1b |
|
23-Jan-2022 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/acpi: Map component registers for Root Ports This implements the TODO in cxl_acpi for mapping component registers. cxl_acpi becomes the second consumer of CXL register block enumeration (cxl_pci being the first). Moving the functionality to cxl_core allows both of these drivers to use the functionality. Equally importantly it allows cxl_core to use the functionality in the future. CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main distinction being they're a part of the CXL 2.0 host bridge. While mapping their component registers is not immediately useful for the CXL drivers, the movement of register block enumeration into core is a vital step towards HDM decoder programming. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [djbw: fix cxl_regmap_to_base() failure cases] Link: https://lore.kernel.org/r/164298415080.3018233.14694957480228676592.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
be185c29 |
|
10-Dec-2021 |
Nathan Chancellor <nathan@kernel.org> |
cxl/core: Remove cxld_const_init in cxl_decoder_alloc() Commit 48667f676189 ("cxl/core: Split decoder setup into alloc + add") aimed to fix a large stack frame warning but from v5 to v6, it introduced a new instance of the warning due to allocating cxld_const_init on the stack, which was done due to the use of const on the nr_target member of the cxl_decoder struct. With ARCH=arm allmodconfig minus CONFIG_KASAN: GCC 11.2.0: drivers/cxl/core/bus.c: In function ‘cxl_decoder_alloc’: drivers/cxl/core/bus.c:523:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] 523 | } | ^ cc1: all warnings being treated as errors Clang 12.0.1: drivers/cxl/core/bus.c:486:21: error: stack frame size of 1056 bytes in function 'cxl_decoder_alloc' [-Werror,-Wframe-larger-than=] struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets) ^ 1 error generated. Revert that part of the change, which makes the stack frame of cxl_decoder_alloc() much more reasonable. Fixes: 48667f676189 ("cxl/core: Split decoder setup into alloc + add") Link: https://github.com/ClangBuiltLinux/linux/issues/1539 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211210213627.2477370-1-nathan@kernel.org Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
53989fad |
|
11-Nov-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Fix module reload vs workqueue state A test of the form: while true; do modprobe -r cxl_pmem; modprobe cxl_pmem; done May lead to a crash signature of the form: BUG: unable to handle page fault for address: ffffffffc0660030 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page [..] Workqueue: cxl_pmem 0xffffffffc0660030 RIP: 0010:0xffffffffc0660030 Code: Unable to access opcode bytes at RIP 0xffffffffc0660006. [..] Call Trace: ? process_one_work+0x4ec/0x9c0 ? pwq_dec_nr_in_flight+0x100/0x100 ? rwlock_bug.part.0+0x60/0x60 ? worker_thread+0x2eb/0x700 In that report the 0xffffffffc0660030 address corresponds to the former function address of cxl_nvb_update_state() from a previous load of the module, not the current address. Fix that by arranging for ->state_work in the 'struct cxl_nvdimm_bridge' object to be reinitialized on cxl_pmem module reload. Details: Recall that CXL subsystem wants to link a CXL memory expander device to an NVDIMM sub-hierarchy when both a persistent memory range has been registered by the CXL platform driver (cxl_acpi) *and* when that CXL memory expander has published persistent memory capacity (Get Partition Info). To this end the cxl_nvdimm_bridge driver arranges to rescan the CXL bus when either of those conditions change. The helper bus_rescan_devices() can not be called underneath the device_lock() for any device on that bus, so the cxl_nvdimm_bridge driver uses a workqueue for the rescan. Typically a driver allocates driver data to hold a 'struct work_struct' for a driven device, but for a workqueue that may run after ->remove() returns, driver data will have been freed. The 'struct cxl_nvdimm_bridge' object holds the state and work_struct directly. Unfortunately it was only arranging for that infrastructure to be initialized once per device creation rather than the necessary once per workqueue (cxl_pmem_wq) creation. Introduce is_cxl_nvdimm_bridge() and cxl_nvdimm_bridge_reset() in support of invalidating stale references to a recently destroyed cxl_pmem_wq. Cc: <stable@vger.kernel.org> Fixes: 8fdcb1704f61 ("cxl/pmem: Add initial infrastructure for pmem support") Reported-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/163665474585.3505991.8397182770066720755.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a261e9a1 |
|
15-Oct-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pci: Add @base to cxl_register_map In addition to carrying @barno, @block_offset, and @reg_type, add @base to keep all map/unmap parameters in one object. The helpers cxl_{map,unmap}_regblock() handle adjusting @base to the @block_offset at map and unmap time. Document that @base incorporates @block_offset so that downstream consumers of a mapped cxl_register_map instance do not need perform any fixups / can use @base directly. Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/163433497228.889435.11271988238496181536.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
48667f67 |
|
21-Sep-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Split decoder setup into alloc + add The kbuild robot reports: drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds limit (1024) in function 'devm_cxl_add_decoder' It is also the case the devm_cxl_add_decoder() is unwieldy to use for all the different decoder types. Fix the stack usage by splitting the creation into alloc and add steps. This also allows for context specific construction before adding. With the split the caller is responsible for registering a devm callback to trigger device_unregister() for the decoder rather than it being implicit in the decoder registration. I.e. the routine that calls alloc is responsible for calling put_device() if the "add" operation fails. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/163225205828.3038145.6831131648369404859.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7d3eb23c |
|
08-Sep-2021 |
Dan Williams <dan.j.williams@intel.com> |
tools/testing/cxl: Introduce a mock memory device + driver Introduce an emulated device-set plus driver to register CXL memory devices, 'struct cxl_memdev' instances, in the mock cxl_test topology. This enables the development of HDM Decoder (Host-managed Device Memory Decoder) programming flow (region provisioning) in an environment that can be updated alongside the kernel as it gains more functionality. Whereas the cxl_pci module looks for CXL memory expanders on the 'pci' bus, the cxl_mock_mem module attaches to CXL expanders on the platform bus emitted by cxl_test. Acked-by: Ben Widawsky <ben.widawsky@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/163116440099.2460985.10692549614409346604.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
a5c25802 |
|
08-Sep-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/bus: Populate the target list at decoder create As found by cxl_test, the implementation populated the target_list for the single dport exceptional case, it missed populating the target_list for the typical multi-dport case. Root decoders always know their target list at the beginning of time, and even switch-level decoders should have a target list of one or more zeros by default, depending on the interleave-ways setting. Walk the hosting port's dport list and populate based on the passed in map. Move devm_cxl_add_passthrough_decoder() out of line now that it does the work of generating a target_map. Before: $ cat /sys/bus/cxl/devices/root2/decoder*/target_list 0 0 After: $ cat /sys/bus/cxl/devices/root2/decoder*/target_list 0 0,1,2,3 0 0,1,2,3 Where root2 is a CXL topology root object generated by 'cxl_test'. Acked-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/163116439000.2460985.11713777051267946018.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
67dcdd4d |
|
14-Sep-2021 |
Dan Williams <dan.j.williams@intel.com> |
tools/testing/cxl: Introduce a mocked-up CXL port hierarchy Create an environment for CXL plumbing unit tests. Especially when it comes to an algorithm for HDM Decoder (Host-managed Device Memory Decoder) programming, the availability of an in-kernel-tree emulation environment for CXL configuration complexity and corner cases speeds development and deters regressions. The approach taken mirrors what was done for tools/testing/nvdimm/. I.e. an external module, cxl_test.ko built out of the tools/testing/cxl/ directory, provides mock implementations of kernel APIs and kernel objects to simulate a real world device hierarchy. One feedback for the tools/testing/nvdimm/ proposal was "why not do this in QEMU?". In fact, the CXL development community has developed a QEMU model for CXL [1]. However, there are a few blocking issues that keep QEMU from being a tight fit for topology + provisioning unit tests: 1/ The QEMU community has yet to show interest in merging any of this support that has had patches on the list since November 2020. So, testing CXL to date involves building custom QEMU with out-of-tree patches. 2/ CXL mechanisms like cross-host-bridge interleave do not have a clear path to be emulated by QEMU without major infrastructure work. This is easier to achieve with the alloc_mock_res() approach taken in this patch to shortcut-define emulated system physical address ranges with interleave behavior. The QEMU enabling has been critical to get the driver off the ground, and may still move forward, but it does not address the ongoing needs of a regression testing environment and test driven development. This patch adds an ACPI CXL Platform definition with emulated CXL multi-ported host-bridges. A follow on patch adds emulated memory expander devices. Acked-by: Ben Widawsky <ben.widawsky@intel.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1] Link: https://lore.kernel.org/r/163164680798.2831381.838684634806668012.stgit@dwillia2-desk3.amr.corp.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2e52b625 |
|
14-Sep-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Add support for multiple nvdimm-bridge objects In preparation for a mocked unit test environment for CXL objects, allow for multiple unique nvdimm-bridge objects. For now, just allow multiple bridges to be registered. Later, when there are multiple present, further updates are needed to cxl_find_nvdimm_bridge() to identify which bridge is associated with which CXL hierarchy for nvdimm registration. Note that this does change the kernel device-name for the bridge object. User space should not have any attachment to the device name at this point as it is still early days in the CXL driver development. Acked-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/163164647007.2831228.2150246954620721526.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
301e68dd |
|
30-Jul-2021 |
Kees Cook <keescook@chromium.org> |
cxl/core: Replace unions with struct_group() Use the newly introduced struct_group_typed() macro to clean up the declaration of struct cxl_regs. Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: linux-cxl@vger.kernel.org Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
5b68705d |
|
16-Jul-2021 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/pci: Simplify register setup It is desirable to retain the mappings from the calling function. By simplifying this code, it will be much more straightforward to do that. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20210716231548.174778-3-ben.widawsky@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
21083f51 |
|
15-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Register 'pmem' / cxl_nvdimm devices While a memX device on /sys/bus/cxl represents a CXL memory expander control interface, a pmemX device represents the persistent memory sub-functionality. It bridges the CXL subystem to the libnvdimm nmemX control interface. With this skeleton ndctl can now see persistent memory devices on a "CXL" bus. Later patches add support for translating libnvdimm native commands to CXL commands. # ndctl list -BDiu -b CXL { "provider":"CXL", "dev":"ndbus1", "dimms":[ { "dev":"nmem1", "state":"disabled" }, { "dev":"nmem0", "state":"disabled" } ] } Given nvdimm_bus_unregister() removes all devices on an ndbus0 the cxl_pmem infrastructure needs to arrange ->remove() to be triggered on cxl_nvdimm devices to keep their enabled state synchronized with the registration state of their corresponding device on the nvdimm_bus. In other words, always arrange for cxl_nvdimm_driver.remove() to unregister nvdimms from an nvdimm_bus ahead of the bus being unregistered. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162380012696.3039556.4293801691038740850.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8fdcb170 |
|
15-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/pmem: Add initial infrastructure for pmem support Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm bus hierarchy. Also, flesh out the cxl_bus definition to allow a cxl_nvdimm_bridge_driver to attach to the bridge and trigger the nvdimm-bus registration. The creation of the bridge is gated on the detection of a PMEM capable address space registered to the root. The bridge indirection allows the libnvdimm module to remain unloaded on platforms without PMEM support. Given that the probing of ACPI0017 is asynchronous to CXL endpoint devices, and the expectation that CXL endpoint devices register other PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The workqueue is needed to run bus_rescan_devices() outside of the device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources as they arrive. For now only the bus is taken online/offline in the workqueue. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162379909706.2993820.14051258608641140169.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6af7139c |
|
15-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Add cxl-bus driver infrastructure Enable devices on the 'cxl' bus to be attached to drivers. The initial user of this functionality is a driver for an 'nvdimm-bridge' device that anchors a libnvdimm hierarchy attached to CXL persistent memory resources. Other device types that will leverage this include: cxl_port: map and use component register functionality (HDM Decoders) cxl_nvdimm: translate CXL memory expander endpoints to libnvdimm 'nvdimm' objects cxl_region: translate CXL interleave sets to libnvdimm 'region' objects The pairing of devices to drivers is handled through the cxl_device_id() matching to cxl_driver.id values. A cxl_device_id() of '0' indicates no driver support. In addition to ->match(), ->probe(), and ->remove() support for the 'cxl' bus introduce MODULE_ALIAS_CXL() to autoload modules containing cxl-drivers. Drivers are added in follow-on changes. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162379909190.2993820.6134168109678004186.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6423035f |
|
11-Jun-2021 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/hdm: Fix decoder count calculation The decoder count in the HDM decoder capability structure is an encoded field. As defined in the spec: Decoder Count: Reports the number of memory address decoders implemented by the component. 0 – 1 Decoder 1 – 2 Decoders 2 – 4 Decoders 3 – 6 Decoders 4 – 8 Decoders 5 – 10 Decoders All other values are reserved Nothing is actually fixed by this as nothing actually used this mapping yet. Cc: Ira Weiny <ira.weiny@intel.com> Fixes: 08422378c4ad ("cxl/pci: Add HDM decoder capabilities") Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/20210611190111.121295-1-ben.widawsky@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
40ba17af |
|
09-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/acpi: Introduce cxl_decoder objects A cxl_decoder is a child of a cxl_port. It represents a hardware decoder configuration of an upstream port to one or more of its downstream ports. The decoder is either represented in CXL standard HDM decoder registers (see CXL 2.0 section 8.2.5.12 CXL HDM Decoder Capability Structure), or it is a static decode configuration communicated by platform firmware (see the CXL Early Discovery Table: Fixed Memory Window Structure). The firmware described and hardware described decoders differ slightly leading to 2 different sub-types of decoders, cxl_decoder_root and cxl_decoder_switch. At the root level the decode capabilities restrict what can be mapped beneath them. Mid-level switch decoders are configured for either acclerator (type-2) or memory-expander (type-3) operation, but they are otherwise agnostic to the type of memory (volatile vs persistent) being mapped. Here is an example topology from a single-ported host-bridge environment without CFMWS decodes enumerated. /sys/bus/cxl/devices/root0 ├── devtype ├── dport0 -> ../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00 ├── port1 │ ├── decoder1.0 │ │ ├── devtype │ │ ├── locked │ │ ├── size │ │ ├── start │ │ ├── subsystem -> ../../../../../../bus/cxl │ │ ├── target_list │ │ ├── target_type │ │ └── uevent │ ├── devtype │ ├── dport0 -> ../../../../pci0000:34/0000:34:00.0 │ ├── subsystem -> ../../../../../bus/cxl │ ├── uevent │ └── uport -> ../../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00 ├── subsystem -> ../../../../bus/cxl ├── uevent └── uport -> ../../ACPI0017:00 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162325695128.2293823.17519927266014762694.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7d4b5ca2 |
|
09-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/acpi: Add downstream port data to cxl_port instances In preparation for infrastructure that enumerates and configures the CXL decode mechanism of an upstream port to its downstream ports, add a representation of a CXL downstream port. On ACPI systems the top-most logical downstream ports in the hierarchy are the host bridges (ACPI0016 devices) that decode the memory windows described by the CXL Early Discovery Table Fixed Memory Window Structures (CEDT.CFMWS). Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/162325450624.2293126.3533006409920271718.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4812be97 |
|
09-Jun-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/acpi: Introduce the root of a cxl_port topology While CXL builds upon the PCI software model for enumeration and endpoint control, a static platform component is required to bootstrap the CXL memory layout. Similar to how ACPI identifies root-level PCI memory resources, ACPI data enumerates the address space and interleave configuration for CXL Memory. In addition to identifying host bridges, ACPI is responsible for enumerating the CXL memory space that can be addressed by downstream decoders. This is similar to the requirement for ACPI to publish resources via the _CRS method for PCI host bridges. Specifically, ACPI publishes a table, CXL Early Discovery Table (CEDT), which includes a list of CXL Memory resources, CXL Fixed Memory Window Structures (CFMWS). For now, introduce the core infrastructure for a cxl_port hierarchy starting with a root level anchor represented by the ACPI0017 device. Follow on changes model support for the configurable decode capabilities of cxl_port instances, i.e. CXL switch support. Co-developed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162325449515.2293126.15303270193010154608.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
08422378 |
|
27-May-2021 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/pci: Add HDM decoder capabilities An HDM decoder is defined in the CXL 2.0 specification as a mechanism that allow devices and upstream ports to claim memory address ranges and participate in interleave sets. HDM decoder registers are within the component register block defined in CXL 2.0 8.2.3 CXL 2.0 Component Registers as part of the CXL.cache and CXL.mem subregion. The Component Register Block is found via the Register Locator DVSEC in a similar fashion to how the CXL Device Register Block is found. The primary difference is the capability id size of the Component Register Block is a single DWORD instead of 4 DWORDS. It's now possible to configure a CXL type 3 device's HDM decoder. Such programming is expected for CXL devices with persistent memory, and hot plugged CXL devices that participate in CXL.mem with volatile memory. Add probe and mapping functions for the component register blocks. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Co-developed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/20210528004922.3980613-6-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
30af9729 |
|
03-Jun-2021 |
Ira Weiny <ira.weiny@intel.com> |
cxl/pci: Map registers based on capabilities The information required to map registers based on capabilities is contained within the bars themselves. This means the bar must be mapped to read the information needed and then unmapped to map the individual parts of the BAR based on capabilities. Change cxl_setup_device_regs() to return a new cxl_register_map, change the name to cxl_probe_device_regs(). Allocate and place cxl_register_maps on a list while processing all of the specified register blocks. After probing all the register blocks go back and map smaller registers blocks based on their capabilities and dispose of the cxl_register_maps. NOTE: pci_iomap() is not managed automatically via pcim_enable_device() so be careful to call pci_iounmap() correctly. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20210604005036.4187184-1-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
399d34eb |
|
13-May-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/core: Refactor CXL register lookup for bridge reuse While CXL Memory Device endpoints locate the CXL MMIO registers in a PCI BAR, CXL root bridges have their MMIO base address described by platform firmware. Refactor the existing register lookup into a generic facility for endpoints and bridges to share. Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162096972534.1865304.3218686216153688039.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8ac75dd6 |
|
13-May-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Introduce 'struct cxl_regs' for "composable" CXL devices CXL MMIO register blocks are organized by device type and capabilities. There are Component registers, Device registers (yes, an ambiguous name), and Memory Device registers (a specific extension of Device registers). It is possible for a given device instance (endpoint or port) to implement register sets from multiple of the above categories. The driver code that enumerates and maps the registers is type specific so it is useful to have a dedicated type and helpers for each block type. At the same time, once the registers are mapped the origin type does not matter. It is overly pedantic to reference the register block type in code that is using the registers. In preparation for the endpoint driver to incorporate Component registers into its MMIO operations reorganize the registers to allow typed enumeration + mapping, but anonymous usage. With the end state of 'struct cxl_regs' to be: struct cxl_regs { union { struct { CXL_DEVICE_REGS(); }; struct cxl_device_regs device_regs; }; union { struct { CXL_COMPONENT_REGS(); }; struct cxl_component_regs component_regs; }; }; With this arrangement the driver can share component init code with ports, but when using the registers it can directly reference the component register block type by name without the 'component_regs' prefix. So, map + enumerate can be shared across drivers of different CXL classes e.g.: void cxl_setup_device_regs(struct device *dev, void __iomem *base, struct cxl_device_regs *regs); void cxl_setup_component_regs(struct device *dev, void __iomem *base, struct cxl_component_regs *regs); ...while inline usage in the driver need not indicate where the registers came from: readl(cxlm->regs.mbox + MBOX_OFFSET); readl(cxlm->regs.hdm + HDM_OFFSET); ...instead of: readl(cxlm->regs.device_regs.mbox + MBOX_OFFSET); readl(cxlm->regs.component_regs.hdm + HDM_OFFSET); This complexity of the definition in .h yields improvement in code readability in .c while maintaining type-safety for organization of setup code. It prepares the implementation to maintain organization in the face of CXL devices that compose register interfaces consisting of multiple types. Given that this new container is named 'regs' rename the common register base pointer @base, and fixup the kernel-doc for the missing @cxlmd description. Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/162096971451.1865304.13540251513463515153.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5f50d6b2 |
|
13-May-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Move some definitions to mem.h In preparation for sharing cxl.h with other generic CXL consumers, move / consolidate some of the memory device specifics to mem.h. The motivation for moving out of cxl.h is to maintain least privilege access to memory-device details since cxl.h is used in multiple files. The motivation for moving definitions into a new mem.h header is for code readability and organization. I.e. minimize implementation details when reading data structures and other definitions. Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/162096970932.1865304.14510894426562947262.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
472b1ce6 |
|
16-Feb-2021 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/mem: Enable commands via CEL CXL devices identified by the memory-device class code must implement the Device Command Interface (described in 8.2.9 of the CXL 2.0 spec). While the driver already maintains a list of commands it supports, there is still a need to be able to distinguish between commands that the driver knows about from commands that are optionally supported by the hardware. The Command Effects Log (CEL) is specified in the CXL 2.0 specification. The CEL is one of two types of logs, the other being vendor specific. They are distinguished in hardware/spec via UUID. The CEL is useful for 2 things: 1. Determine which optional commands are supported by the CXL device. 2. Enumerate any vendor specific commands The CEL is used by the driver to determine which commands are available in the hardware and therefore which commands userspace is allowed to execute. The set of enabled commands might be a subset of commands which are advertised in UAPI via CXL_MEM_SEND_COMMAND IOCTL. With the CEL enabling comes a internal flag to indicate a base set of commands that are enabled regardless of CEL. Such commands are required for basic interaction with the hardware and thus can be useful in debug cases, for example if the CEL is corrupted. The implementation leaves the statically defined table of commands and supplements it with a bitmap to determine commands that are enabled. This organization was chosen for the following reasons: - Smaller memory footprint. Doesn't need a table per device. - Reduce memory allocation complexity. - Fixed command IDs to opcode mapping for all devices makes development and debugging easier. - Certain helpers are easily achievable, like cxl_for_each_cmd(). Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2) Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v3) Link: https://lore.kernel.org/r/20210217040958.1354670-7-ben.widawsky@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b39cb105 |
|
16-Feb-2021 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Register CXL memX devices Create the /sys/bus/cxl hierarchy to enumerate: * Memory Devices (per-endpoint control devices) * Memory Address Space Devices (platform address ranges with interleaving, performance, and persistence attributes) * Memory Regions (active provisioned memory from an address space device that is in use as System RAM or delegated to libnvdimm as Persistent Memory regions). For now, only the per-endpoint control devices are registered on the 'cxl' bus. However, going forward it will provide a mechanism to coordinate cross-device interleave. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v2) Link: https://lore.kernel.org/r/20210217040958.1354670-4-ben.widawsky@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8adaf747 |
|
16-Feb-2021 |
Ben Widawsky <ben.widawsky@intel.com> |
cxl/mem: Find device capabilities Provide enough functionality to utilize the mailbox of a memory device. The mailbox is used to interact with the firmware running on the memory device. The flow is proven with one implemented command, "identify". Because the class code has already told the driver this is a memory device and the identify command is mandatory. CXL devices contain an array of capabilities that describe the interactions software can have with the device or firmware running on the device. A CXL compliant device must implement the device status and the mailbox capability. Additionally, a CXL compliant memory device must implement the memory device capability. Each of the capabilities can [will] provide an offset within the MMIO region for interacting with the CXL device. The capabilities tell the driver how to find and map the register space for CXL Memory Devices. The registers are required to utilize the CXL spec defined mailbox interface. The spec outlines two mailboxes, primary and secondary. The secondary mailbox is earmarked for system firmware, and not handled in this driver. Primary mailboxes are capable of generating an interrupt when submitting a background command. That implementation is saved for a later time. Reported-by: Colin Ian King <colin.king@canonical.com> (coverity) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> (smatch) Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2) Link: https://www.computeexpresslink.org/download-the-specification Link: https://lore.kernel.org/r/20210217040958.1354670-3-ben.widawsky@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|