#
7dab174e |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
dax/hmem: Move hmem device registration to dax_hmem.ko In preparation for the CXL region driver to take over the responsibility of registering device-dax instances for CXL regions, move the registration of "hmem" devices to dax_hmem.ko. Previously the builtin component of this enabling (drivers/dax/hmem/device.o) would register platform devices for each address range and trigger the dax_hmem.ko module to load and attach device-dax instances to those devices. Now, the ranges are collected from the HMAT and EFI memory map walking, but the device creation is deferred. A new "hmem_platform" device is created which triggers dax_hmem.ko to load and register the platform devices. Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167602002771.1924368.5653558226424530127.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
fe098574 |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
dax/hmem: Convey the dax range via memregion_info() In preparation for hmem platform devices to be unregistered, stop using platform_device_add_resources() to convey the address range. The platform_device_add_resources() API causes an existing "Soft Reserved" iomem resource to be re-parented under an inserted platform device resource. When that platform device is deleted it removes the platform device resource and all children. Instead, it is sufficient to convey just the address range and let request_mem_region() insert resources to indicate the devices active in the range. This allows the "Soft Reserved" resource to be re-enumerated upon the next probe event. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167602002217.1924368.7036275892522551624.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
df2798bc |
|
10-Feb-2023 |
Dan Williams <dan.j.williams@intel.com> |
dax/hmem: Move HMAT and Soft reservation probe initcall level In preparation for moving more filtering of "hmem" ranges into the dax_hmem.ko module, update the initcall levels. HMAT range registration moves to subsys_initcall() to be done before Soft Reservation probing, and Soft Reservation probing is moved to device_initcall() to be done before dax_hmem.ko initialization if it is built-in. Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167602001107.1924368.11562316181038595611.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
472faf72 |
|
19-Nov-2022 |
Dan Williams <dan.j.williams@intel.com> |
device-dax: Fix duplicate 'hmem' device registration So called "soft-reserved" memory is an EFI conventional memory range with the EFI_MEMORY_SP attribute set. That attribute indicates that the memory is not part of the platform general purpose memory pool and may want some consideration from the system administrator about whether to keep that memory set aside for dedicated access through device-dax (map a device file), or assigned to the page allocator as another general purpose memory node target. Absent an ACPI HMAT table the default device-dax registration creates coarse grained devices that are delineated by EFI Memory Map entries. With the HMAT the devices are delineated by the finer grained ranges associated with the proximity domain of the memory target. I.e. the HMAT describes the properties of performance differentiated memory and each unique performance description results in a unique target proximity domain where each memory proximity domain has an associated SRAT entry that delineates the address range. The intent was that SRAT-defined device-dax instances are registered first. Then any left-over address range with the EFI_MEMORY_SP attribute, but not covered by the SRAT, would have a coarse grained device-dax instance established. However, the scheme to detect what ranges are left to be assigned to a device was buggy and resulted in multiple overlapping device-dax instances. Fix this by using explicit tracking for which ranges have been handled. Now, this new approach may leave memory stranded in the presence of broken platform firmware that fails to fully describe all EFI_MEMORY_SP ranges in the HMAT. That requires a deeper fix if it becomes a problem in practice. Reported-by: "Tallam Mahendra Kumar" <tallam.mahendra.kumar@intel.com> Reported-by: Mustafa Hajeer <mustafa.hajeer@intel.com> Debugged-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166890823379.4183293.15333502171004313377.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6a02124c |
|
13-Sep-2022 |
Lin Yujun <linyujun809@huawei.com> |
ACPI: HMAT: Release platform device in case of platform_device_add_data() fails The platform device is not released when platform_device_add_data() fails. And platform_device_put() perfom one more pointer check than put_device() to check for errors in the 'pdev' pointer. Use platform_device_put() to release platform device in platform_device_add()/platform_device_add_data()/ platform_device_add_resources() error case. Fixes: c01044cc8191 ("ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device") Signed-off-by: Lin Yujun <linyujun809@huawei.com> Link: https://lore.kernel.org/r/20220914033755.99924-1-linyujun809@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
67feaba4 |
|
23-Sep-2022 |
Dan Williams <dan.j.williams@intel.com> |
devdax: Fix soft-reservation memory description The "hmem" platform-devices that are created to represent the platform-advertised "Soft Reserved" memory ranges end up inserting a resource that causes the iomem_resource tree to look like this: 340000000-43fffffff : hmem.0 340000000-43fffffff : Soft Reserved 340000000-43fffffff : dax0.0 This is because insert_resource() reparents ranges when they completely intersect an existing range. This matters because code that uses region_intersects() to scan for a given IORES_DESC will only check that top-level 'hmem.0' resource and not the 'Soft Reserved' descendant. So, to support EINJ (via einj_error_inject()) to inject errors into memory hosted by a dax-device, be sure to describe the memory as IORES_DESC_SOFT_RESERVED. This is a follow-on to: commit b13a3e5fd40b ("ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP") ...that fixed EINJ support for "Soft Reserved" ranges in the first instance. Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration") Reported-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com> Tested-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com> Cc: <stable@vger.kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Omar Avelar <omar.avelar@intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mark Gross <markgross@kernel.org> Link: https://lore.kernel.org/r/166397075670.389916.7435722208896316387.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5ccac54f |
|
13-Oct-2020 |
Dan Williams <dan.j.williams@intel.com> |
ACPI: HMAT: attach a device for each soft-reserved range The hmem enabling in commit cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") only registered ranges to the hmem driver for each soft-reservation that also appeared in the HMAT. While this is meant to encourage platform firmware to "do the right thing" and publish an HMAT, the corollary is that platforms that fail to publish an accurate HMAT will strand memory from Linux usage. Additionally, the "efi_fake_mem" kernel command line option enabling will strand memory by default without an HMAT. Arrange for "soft reserved" memory that goes unclaimed by HMAT entries to be published as raw resource ranges for the hmem driver to consume. Include a module parameter to disable either this fallback behavior, or the hmat enabling from creating hmem devices. The module parameter requires the hmem device enabling to have unique name in the module namespace: "device_hmem". The driver depends on the architecture providing phys_to_target_node() which is only x86 via numa_meminfo() and arm64 via a generic memblock implementation. [joao.m.martins@oracle.com: require NUMA_KEEP_MEMINFO for phys_to_target_node()] Link: https://lkml.kernel.org/r/aaae71a7-4846-f5cc-5acf-cf05fdb1f2dc@oracle.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jia He <justin.he@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Link: https://lkml.kernel.org/r/159643098298.4062302.17587338161136144730.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c01044cc |
|
13-Oct-2020 |
Dan Williams <dan.j.williams@intel.com> |
ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device In preparation for exposing "Soft Reserved" memory ranges without an HMAT, move the hmem device registration to its own compilation unit and make the implementation generic. The generic implementation drops usage acpi_map_pxm_to_online_node() that was translating ACPI proximity domain values and instead relies on numa_map_to_online_node() to determine the numa node for the device. [joao.m.martins@oracle.com: CONFIG_DEV_DAX_HMEM_DEVICES should depend on CONFIG_DAX=y] Link: https://lkml.kernel.org/r/8f34727f-ec2d-9395-cb18-969ec8a5d0d4@oracle.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Link: https://lkml.kernel.org/r/159643096584.4062302.5035370788475153738.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/158318761484.2216124.2049322072599482736.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|