History log of /linux-master/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
Revision Date Author Comments
# fb700810 13-Feb-2024 James Morse <james.morse@arm.com>

x86/resctrl: Separate arch and fs resctrl locks

resctrl has one mutex that is taken by the architecture-specific code, and the
filesystem parts. The two interact via cpuhp, where the architecture code
updates the domain list. Filesystem handlers that walk the domains list should
not run concurrently with the cpuhp callback modifying the list.

Exposing a lock from the filesystem code means the interface is not cleanly
defined, and creates the possibility of cross-architecture lock ordering
headaches. The interaction only exists so that certain filesystem paths are
serialised against CPU hotplug. The CPU hotplug code already has a mechanism to
do this using cpus_read_lock().

MPAM's monitors have an overflow interrupt, so it needs to be possible to walk
the domains list in irq context. RCU is ideal for this, but some paths need to
be able to sleep to allocate memory.

Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a cpuhp
callback, cpus_read_lock() must always be taken first.
rdtgroup_schemata_write() already does this.

Most of the filesystem code's domain list walkers are currently protected by
the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are
rdt_bit_usage_show() and the mon_config helpers which take the lock directly.

Make the domain list protected by RCU. An architecture-specific lock prevents
concurrent writers. rdt_bit_usage_show() could walk the domain list using RCU,
but to keep all the filesystem operations the same, this is changed to call
cpus_read_lock(). The mon_config helpers send multiple IPIs, take the
cpus_read_lock() in these cases.

The other filesystem list walkers need to be able to sleep. Add
cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks can't
be invoked when file system operations are occurring.

Add lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live()
call isn't obvious.

Resctrl's domain online/offline calls now need to take the rdtgroup_mutex
themselves.

[ bp: Fold in a build fix: https://lore.kernel.org/r/87zfvwieli.ffs@tglx ]

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-25-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>


# 978fcca9 13-Feb-2024 James Morse <james.morse@arm.com>

x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU

When a CPU is taken offline resctrl may need to move the overflow or limbo
handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler() will be
called while the CPU that is going offline is still present in the CPU mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and
mbm_setup_overflow_handler(). These functions can use a variant of
cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need
excluding.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-22-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>


# e557999f 13-Feb-2024 James Morse <james.morse@arm.com>

x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()

Depending on the number of monitors available, Arm's MPAM may need to
allocate a monitor prior to reading the counter value. Allocating a
contended resource may involve sleeping.

__check_limbo() and mon_event_count() each make multiple calls to
resctrl_arch_rmid_read(), to avoid extra work on contended systems,
the allocation should be valid for multiple invocations of
resctrl_arch_rmid_read().

The memory or hardware allocated is not specific to a domain.

Add arch hooks for this allocation, which need calling before
resctrl_arch_rmid_read(). The allocated monitor is passed to
resctrl_arch_rmid_read(), then freed again afterwards. The helper
can be called on any CPU, and can sleep.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-16-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>


# 09909e09 13-Feb-2024 James Morse <james.morse@arm.com>

x86/resctrl: Queue mon_event_read() instead of sending an IPI

Intel is blessed with an abundance of monitors, one per RMID, that can be
read from any CPU in the domain. MPAMs monitors reside in the MMIO MSC,
the number implemented is up to the manufacturer. This means when there are
fewer monitors than needed, they need to be allocated and freed.

MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The
CSU counter is allowed to return 'not ready' for a small number of
micro-seconds after programming. To allow one CSU hardware monitor to be
used for multiple control or monitor groups, the CPU accessing the
monitor needs to be able to block when configuring and reading the
counter.

Worse, the domain may be broken up into slices, and the MMIO accesses
for each slice may need performing from different CPUs.

These two details mean MPAMs monitor code needs to be able to sleep, and
IPI another CPU in the domain to read from a resource that has been sliced.

mon_event_read() already invokes mon_event_count() via IPI, which means
this isn't possible. On systems using nohz-full, some CPUs need to be
interrupted to run kernel work as they otherwise stay in user-space
running realtime workloads. Interrupting these CPUs should be avoided,
and scheduling work on them may never complete.

Change mon_event_read() to pick a housekeeping CPU, (one that is not using
nohz_full) and schedule mon_event_count() and wait. If all the CPUs
in a domain are using nohz-full, then an IPI is used as the fallback.

This function is only used in response to a user-space filesystem request
(not the timing sensitive overflow code).

This allows MPAM to hide the slice behaviour from resctrl, and to keep
the monitor-allocation in monitor.c. When the IPI fallback is used on
machines where MPAM needs to make an access on multiple CPUs, the counter
read will always fail.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-14-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>


# 0e3cd31f 09-Oct-2023 Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>

x86/resctrl: Enable non-contiguous CBMs in Intel CAT

The setting for non-contiguous 1s support in Intel CAT is
hardcoded to false. On these systems, writing non-contiguous
1s into the schemata file will fail before resctrl passes
the value to the hardware.

In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
being reserved and now carry information about non-contiguous 1s
value support for L3 and L2 cache respectively. The CAT
capacity bitmask (CBM) supports a non-contiguous 1s value if
the bit is set.

The exception are Haswell systems where non-contiguous 1s value
support needs to stay disabled since they can't make use of CPUID
for Cache allocation.

Originally-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Peter Newman <peternewman@google.com>
Link: https://lore.kernel.org/r/1849b487256fe4de40b30f88450cba3d9abc9171.1696934091.git.maciej.wieczor-retman@intel.com


# 39c6eed1 09-Oct-2023 Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>

x86/resctrl: Rename arch_has_sparse_bitmaps

Rename arch_has_sparse_bitmaps to arch_has_sparse_bitmasks to ensure
consistent terminology throughout resctrl.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Peter Newman <peternewman@google.com>
Link: https://lore.kernel.org/r/e330fcdae873ef1a831e707025a4b70fa346666e.1696934091.git.maciej.wieczor-retman@intel.com


# 0424a7df 17-Jan-2023 Shawn Wang <shawnwang@linux.alibaba.com>

x86/resctrl: Clear staged_config[] before and after it is used

As a temporary storage, staged_config[] in rdt_domain should be cleared
before and after it is used. The stale value in staged_config[] could
cause an MSR access error.

Here is a reproducer on a system with 16 usable CLOSIDs for a 15-way L3
Cache (MBA should be disabled if the number of CLOSIDs for MB is less than
16.) :
mount -t resctrl resctrl -o cdp /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..7}
umount /sys/fs/resctrl/
mount -t resctrl resctrl /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..8}

An error occurs when creating resource group named p8:
unchecked MSR access error: WRMSR to 0xca0 (tried to write 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
Call Trace:
<IRQ>
__flush_smp_call_function_queue+0x11d/0x170
__sysvec_call_function+0x24/0xd0
sysvec_call_function+0x89/0xc0
</IRQ>
<TASK>
asm_sysvec_call_function+0x16/0x20

When creating a new resource control group, hardware will be configured
by the following process:
rdtgroup_mkdir()
rdtgroup_mkdir_ctrl_mon()
rdtgroup_init_alloc()
resctrl_arch_update_domains()

resctrl_arch_update_domains() iterates and updates all resctrl_conf_type
whose have_new_ctrl is true. Since staged_config[] holds the same values as
when CDP was enabled, it will continue to update the CDP_CODE and CDP_DATA
configurations. When group p8 is created, get_config_index() called in
resctrl_arch_update_domains() will return 16 and 17 as the CLOSIDs for
CDP_CODE and CDP_DATA, which will be translated to an invalid register -
0xca0 in this scenario.

Fix it by clearing staged_config[] before and after it is used.

[reinette: re-order commit tags]

Fixes: 75408e43509e ("x86/resctrl: Allow different CODE/DATA configurations to be staged")
Suggested-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/2fad13f49fbe89687fc40e9a5a61f23a28d1507a.1673988935.git.reinette.chatre%40intel.com


# 5b6fac3f 13-Jan-2023 Babu Moger <babu.moger@amd.com>

x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation

The QoS slow memory configuration details are available via
CPUID_Fn80000020_EDX_x02. Detect the available details and
initialize the rest to defaults.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-7-babu.moger@amd.com


# fc3b618c 13-Jan-2023 Babu Moger <babu.moger@amd.com>

x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()

on_each_cpu_mask() runs the function on each CPU specified by cpumask,
which may include the local processor.

Replace smp_call_function_many() with on_each_cpu_mask() to simplify
the code.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-2-babu.moger@amd.com


# 2d4daa54 27-Sep-2022 Babu Moger <babu.moger@amd.com>

x86/resctrl: Remove arch_has_empty_bitmaps

The field arch_has_empty_bitmaps is not required anymore. The field
min_cbm_bits is enough to validate the CBM (capacity bit mask) if the
architecture can support the zero CBM or not.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/166430979654.372014.615622285687642644.stgit@bmoger-ubuntu


# f7b1843e 02-Sep-2022 James Morse <james.morse@arm.com>

x86/resctrl: Make resctrl_arch_rmid_read() return values in bytes

resctrl_arch_rmid_read() returns a value in chunks, as read from the
hardware. This needs scaling to bytes by mon_scale, as provided by
the architecture code.

Now that resctrl_arch_rmid_read() performs the overflow and corrections
itself, it may as well return a value in bytes directly. This allows
the accesses to the architecture specific 'hw' structure to be removed.

Move the mon_scale conversion into resctrl_arch_rmid_read().
mbm_bw_count() is updated to calculate bandwidth from bytes.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-22-james.morse@arm.com


# 4d044c52 02-Sep-2022 James Morse <james.morse@arm.com>

x86/resctrl: Abstract __rmid_read()

__rmid_read() selects the specified eventid and returns the counter
value from the MSR. The error handling is architecture specific, and
handled by the callers, rdtgroup_mondata_show() and __mon_event_count().

Error handling should be handled by architecture specific code, as
a different architecture may have different requirements. MPAM's
counters can report that they are 'not ready', requiring a second
read after a short delay. This should be hidden from resctrl.

Make __rmid_read() the architecture specific function for reading
a counter. Rename it resctrl_arch_rmid_read() and move the error
handling into it.

A read from a counter that hardware supports but resctrl does not
now returns -EINVAL instead of -EIO from the default case in
__mon_event_count(). It isn't possible for user-space to see this
change as resctrl doesn't expose counters it doesn't support.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-16-james.morse@arm.com


# ff6357bb 02-Sep-2022 James Morse <james.morse@arm.com>

x86/resctrl: Allow update_mba_bw() to update controls directly

update_mba_bw() calculates a new control value for the MBA resource
based on the user provided mbps_val and the current measured
bandwidth. Some control values need remapping by delay_bw_map().

It does this by calling wrmsrl() directly. This needs splitting
up to be done by an architecture specific helper, so that the
remainder can eventually be moved to /fs/.

Add resctrl_arch_update_one() to apply one configuration value
to the provided resource and domain. This avoids the staging
and cross-calling that is only needed with changes made by
user-space. delay_bw_map() moves to be part of the arch code,
to maintain the 'percentage control' view of MBA resources
in resctrl.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-12-james.morse@arm.com


# 6ce1560d 02-Sep-2022 James Morse <james.morse@arm.com>

x86/resctrl: Switch over to the resctrl mbps_val list

Updates to resctrl's software controller follow the same path as
other configuration updates, but they don't modify the hardware state.
rdtgroup_schemata_write() uses parse_line() and the resource's
parse_ctrlval() function to stage the configuration.
resctrl_arch_update_domains() then updates the mbps_val[] array
instead, and resctrl_arch_update_domains() skips the rdt_ctrl_update()
call that would update hardware.

This complicates the interface between resctrl's filesystem parts
and architecture specific code. It should be possible for mba_sc
to be completely implemented by the filesystem parts of resctrl. This
would allow it to work on a second architecture with no additional code.
resctrl_arch_update_domains() using the mbps_val[] array prevents this.

Change parse_bw() to write the configuration value directly to the
mbps_val[] array in the domain structure. Change rdtgroup_schemata_write()
to skip the call to resctrl_arch_update_domains(), meaning all the
mba_sc specific code in resctrl_arch_update_domains() can be removed.
On the read-side, show_doms() and update_mba_bw() are changed to read
the mbps_val[] array from the domain structure. With this,
resctrl_arch_get_config() no longer needs to consider mba_sc resources.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-10-james.morse@arm.com


# 111136e6 11-Aug-2021 James Morse <james.morse@arm.com>

x86/resctrl: Make resctrl_arch_get_config() return its value

resctrl_arch_get_config() has no return, but does pass a single value
back via one of its arguments.

Return the value instead.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210811163831.14917-1-james.morse@arm.com


# 327364d5 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range

resctrl_arch_update_domains() specifies the one closid that has been
modified and needs copying to the hardware.

resctrl_arch_update_domains() takes a struct rdt_resource and a closid
as arguments, but copies all the staged configurations for that closid
into the ctrl_val[] array.

resctrl_arch_update_domains() is called once per schema, but once the
resources and domains are merged, the second call of a L2CODE/L2DATA
pair will find no staged configurations, as they were previously
applied. The msr_param of the first call only has one index, so would
only have update the hardware for the last staged configuration.

To avoid a second round of IPIs when changing L2CODE and L2DATA in one
go, expand the range of the msr_param if multiple staged configurations
are found.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-24-james.morse@arm.com


# 2b8dd4ab 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Calculate the index from the configuration type

resctrl uses cbm_idx() to map a closid to an index in the configuration
array. This is based on a multiplier and offset that are held in the
resource.

To merge the resources, the resctrl arch code needs to calculate the
index from something else, as there will only be one resource.

Decide based on the staged configuration type. This makes the static
mult and offset parameters redundant.

[ bp: Remove superfluous brackets in get_config_index() ]

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-21-james.morse@arm.com


# 2e7df368 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Apply offset correction when config is staged

When resctrl comes to copy the CAT MSR values from the ctrl_val[] array
into hardware, it applies an offset adjustment based on the type of
the resource. CODE and DATA resources have their closid mapped into an
odd/even range. This mapping is based on a property of the resource.

This happens once the new control value has been written to the ctrl_val[]
array. Once the CDP resources are merged, there will only be a single
property that needs to cover both odd/even mappings to the single
ctrl_val[] array. The offset adjustment must be applied before the new
value is written to the array.

Move the logic from cat_wrmsr() to resctrl_arch_update_domains(). The
value provided to apply_config() is now an index in the array, not the
closid. The parameters provided via struct msr_param are now indexes
too. As resctrl's use of closid is a u32, struct msr_param's type is
changed to match.

With this, the CODE and DATA resources only use the odd or even
indexes in the array. This allows the temporary num_closid/2 fixes in
domain_setup_ctrlval() and reset_all_ctrls() to be removed.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-20-james.morse@arm.com


# fa8f711d 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Pass configuration type to resctrl_arch_get_config()

The ctrl_val[] array for a struct rdt_hw_resource only holds
configurations of one type. The type is implicit.

Once the CDP resources are merged, the ctrl_val[] array will hold all
the configurations for the hardware resource. When a particular type of
configuration is needed, it must be specified explicitly.

Pass the expected type from the schema into resctrl_arch_get_config().
Nothing uses this yet, but once a single ctrl_val[] array is used for
the three struct rdt_hw_resources that share hardware, the type will be
used to return the correct configuration value from the shared array.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-18-james.morse@arm.com


# f07e9d02 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Add a helper to read a closid's configuration

Functions like show_doms() reach into the architecture's private
structure to retrieve the configuration from the struct rdt_hw_resource.

The hardware configuration may look completely different to the
values resctrl gets from user-space. The staged configuration and
resctrl_arch_update_domains() allow the architecture to convert or
translate these values.

Resctrl shouldn't read or write the ctrl_val[] values directly. Add
a helper to read the current configuration. This will allow another
architecture to scale the bitmaps if necessary, and possibly use
controls that don't take the user-space control format at all.

Of the remaining functions that access ctrl_val[] directly,
apply_config() is part of the architecture-specific code, and is
called via resctrl_arch_update_domains(). reset_all_ctrls() will be an
architecture specific helper.

update_mba_bw() manipulates both ctrl_val[], mbps_val[] and the
hardware. The mbps_val[] that matches the mba_sc state of the resource
is changed, but the other is left unchanged. Abstracting this is the
subject of later patches that affect set_mba_sc() too.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-17-james.morse@arm.com


# 2e667819 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Rename update_domains() to resctrl_arch_update_domains()

update_domains() merges the staged configuration changes into the arch
codes configuration array. Rename to make it clear it is part of the
arch code interface to resctrl.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-16-james.morse@arm.com


# 75408e43 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Allow different CODE/DATA configurations to be staged

Before the CDP resources can be merged, struct rdt_domain will need an
array of struct resctrl_staged_config, one per type of configuration.

Use the type as an index to the array to ensure that a schema
configuration string can't specify the same domain twice. This will
allow two schemata to apply configuration changes to one resource.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-15-james.morse@arm.com


# e8f72825 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Group staged configuration into a separate struct

When configuration changes are made, the new value is written to struct
rdt_domain's new_ctrl field and the have_new_ctrl flag is set. Later
new_ctrl is copied to hardware by a call to update_domains().

Once the CDP resources are merged, there will be one new_ctrl field in
use by two struct resctrl_schema requiring a per-schema IPI to copy the
value to hardware.

Move new_ctrl and have_new_ctrl into a new struct resctrl_staged_config.
Before the CDP resources can be merged, struct rdt_domain will need an
array of these, one per type of configuration. Using the type as an
index to the array will ensure that a schema configuration string can't
specify the same domain twice.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-14-james.morse@arm.com


# e198fde3 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Move the schemata names into struct resctrl_schema

resctrl 'info' directories and schema parsing use the schema name.
This lives in the struct rdt_resource, and is specified by the
architecture code.

Once the CDP resources are merged, there will only be one resource (and
one name) in use by two schemata. To allow the CDP CODE/DATA property to
be the type of configuration the schema uses, the name should also be
per-schema.

Add a name field to struct resctrl_schema, and use this wherever
the schema name is exposed (or read from) user-space. Calculating
max_name_width for padding the schemata file also moves as this is
visible to user-space. As the names in struct rdt_resource already
include the CDP information, schemata_list_create() copies them.

schemata_list_create() includes the length of the CDP suffix when
calculating max_name_width in preparation for CDP resources being
merged.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-13-james.morse@arm.com


# 32150edd 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Swizzle rdt_resource and resctrl_schema in pseudo_lock_region

struct pseudo_lock_region points to the rdt_resource.

Once the resources are merged, this won't be unique. The resource name
is moving into the schema, so that the filesystem portions of resctrl can
generate it.

Swap pseudo_lock_region's rdt_resource pointer for a schema pointer.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-11-james.morse@arm.com


# 1c290682 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Pass the schema to resctrl filesystem functions

Once the CDP resources are merged, there will be two struct
resctrl_schema for one struct rdt_resource. CDP becomes a type of
configuration that belongs to the schema.

Helpers like rdtgroup_cbm_overlaps() need access to the schema to query
the configuration (or configurations) based on schema properties.

Change these functions to take a struct schema instead of the struct
rdt_resource. All the modified functions are part of the filesystem code
that will move to /fs/resctrl once it is possible to support a second
architecture.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-10-james.morse@arm.com


# 3183e87c 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Store the effective num_closid in the schema

Struct resctrl_schema holds properties that vary with the style of
configuration that resctrl applies to a resource. There are already
two values for the hardware's num_closid, depending on whether the
architecture presents the L3 or L3CODE/L3DATA resources.

As the way CDP changes the number of control groups that resctrl can
create is part of the user-space interface, it should be managed by the
filesystem parts of resctrl. This allows the architecture code to only
describe the value the hardware supports.

Add num_closid to resctrl_schema. This is the value seen by the
filesystem, which may be different to the maximum value described by the
arch code when CDP is enabled.

These functions operate on the num_closid value that is exposed to
user-space:

* rdtgroup_parse_resource()
* rdtgroup_schemata_show()
* rdt_num_closids_show()
* closid_init()

Change them to use the schema value instead. schemata_list_create() sets
this value, and reaches into the architecture-specific structure to get
the value. This will eventually be replaced with a helper.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-8-james.morse@arm.com


# 331ebe4c 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Walk the resctrl schema list instead of an arch list

When parsing a schema configuration value from user-space, resctrl walks
the architectures rdt_resources_all[] array to find a matching struct
rdt_resource.

Once the CDP resources are merged there will be one resource in use
by two schemata. Anything walking rdt_resources_all[] on behalf of a
user-space request should walk the list of struct resctrl_schema
instead.

Change the users of for_each_alloc_enabled_rdt_resource() to walk the
schema instead. Schemata were only created for alloc_enabled resources
so these two lists are currently equivalent.

schemata_list_create() and rdt_kill_sb() are ignored. The first
creates the schema list, and will eventually loop over the resource
indexes using an arch helper to retrieve the resource. rdt_kill_sb()
will eventually make use of an arch 'reset everything' helper.

After the filesystem code is moved, rdtgroup_pseudo_locked_in_hierarchy()
remains part of the x86 specific hooks to support pseudo lock. This
code walks each domain, and still does this after the separate resources
are merged.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-7-james.morse@arm.com


# 792e0f6f 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Split struct rdt_domain

resctrl is the defacto Linux ABI for SoC resource partitioning features.

To support it on another architecture, it needs to be abstracted from
the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
struct rdt_domain contains a mix of architecture private details and
properties of the filesystem interface user-space uses.

Continue by splitting struct rdt_domain, into an architecture private
'hw' struct, which contains the common resctrl structure that would be
used by any architecture. The hardware values in ctrl_val and mbps_val
need to be accessed via helpers to allow another architecture to convert
these into a different format if necessary. After this split, filesystem
code paths touching a 'hw' struct indicates where an abstraction is
needed.

Splitting this structure only moves types around, and should not lead
to any change in behaviour.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-3-james.morse@arm.com


# 63c8b123 28-Jul-2021 James Morse <james.morse@arm.com>

x86/resctrl: Split struct rdt_resource

resctrl is the defacto Linux ABI for SoC resource partitioning features.

To support it on another architecture, it needs to be abstracted from
the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
struct rdt_resource contains a mix of architecture private details
and properties of the filesystem interface user-space uses.

Start by splitting struct rdt_resource, into an architecture private
'hw' struct, which contains the common resctrl structure that would be
used by any architecture. The foreach helpers are most commonly used by
the filesystem code, and should return the common resctrl structure.
for_each_rdt_resource() is changed to walk the common structure in its
parent arch private structure.

Move as much of the structure as possible into the common structure
in the core code's header file. The x86 hardware accessors remain
part of the architecture private code, as do num_closid, mon_scale
and mbm_width.

mon_scale and mbm_width are used to detect overflow of the hardware
counters, and convert them from their native size to bytes. Any
cross-architecture abstraction should be in terms of bytes, making
these properties private.

The hardware's num_closid is kept in the private structure to force the
filesystem code to use a helper to access it. MPAM would return a single
value for the system, regardless of the resource. Using the helper
prevents this field from being confused with the version of num_closid
that is being exposed to user-space (added in a later patch).

After this split, filesystem code touching a 'hw' struct indicates
where an abstraction is needed.

Splitting this structure only moves types around, and should not lead
to any change in behaviour.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20210728170637.25610-2-james.morse@arm.com


# 316e7f90 08-Jul-2020 James Morse <james.morse@arm.com>

x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps

Intel CPUs expect the cache bitmap provided by user-space to have on a
single span of 1s, whereas AMD can support bitmaps like 0xf00f. Arm's
MPAM support also allows sparse bitmaps.

Similarly, Intel CPUs check at least one bit set, whereas AMD CPUs are
quite happy with an empty bitmap. Arm's MPAM allows an empty bitmap.

To move resctrl out to /fs/, platform differences like this need to be
explained.

Add two resource properties arch_has_{empty,sparse}_bitmaps. Test these
around the relevant parts of cbm_validate().

Merging the validate calls causes AMD to gain the min_cbm_bits test
needed for Haswell, but as it always sets this value to 1, it will never
match.

[ bp: Massage commit message. ]

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-10-james.morse@arm.com


# 5df3ca93 08-Jul-2020 James Morse <james.morse@arm.com>

x86/resctrl: Merge AMD/Intel parse_bw() calls

Now after arch_needs_linear has been added, the parse_bw() calls are
almost the same between AMD and Intel.

The difference is '!is_mba_sc()', which is not checked on AMD. This
will always be true on AMD CPUs as mba_sc cannot be enabled as
is_mba_linear() is false.

Removing this duplication means user-space visible behaviour and
error messages are not validated or generated in different places.

Reviewed-by : Babu Moger <babu.moger@amd.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-9-james.morse@arm.com


# 41215b79 08-Jul-2020 James Morse <james.morse@arm.com>

x86/resctrl: Add struct rdt_membw::arch_needs_linear to explain AMD/Intel MBA difference

The configuration values user-space provides to the resctrl filesystem
are ABI. To make this work on another architecture, all the ABI bits
should be moved out of /arch/x86 and under /fs.

To do this, the differences between AMD and Intel CPUs needs to be
explained to resctrl via resource properties, instead of function
pointers that let the arch code accept subtly different values on
different platforms/architectures.

For MBA, Intel CPUs reject configuration attempts for non-linear
resources, whereas AMD ignore this field as its MBA resource is never
linear. To merge the parse/validate functions, this difference needs to
be explained.

Add struct rdt_membw::arch_needs_linear to indicate the arch code needs
the linear property to be true to configure this resource. AMD can set
this and delay_linear to false. Intel can set arch_needs_linear to
true to keep the existing "No support for non-linear MB domains" error
message for affected platforms.

[ bp: convert "we" etc to passive voice. ]

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-8-james.morse@arm.com


# 46637d45 05-May-2020 Reinette Chatre <reinette.chatre@intel.com>

x86/resctrl: Maintain MBM counter width per resource

The original Memory Bandwidth Monitoring (MBM) architectural
definition defines counters of up to 62 bits in the IA32_QM_CTR MSR,
and the first-generation MBM implementation uses 24 bit counters.
Software is required to poll at 1 second or faster to ensure that
data is retrieved before a counter rollover occurs more than once
under worst conditions.

As system bandwidths scale the software requirement is maintained with
the introduction of a per-resource enumerable MBM counter width.

In preparation for supporting hardware with an enumerable MBM counter
width the current globally static MBM counter width is moved to a
per-resource MBM counter width. Currently initialized to 24 always
to result in no functional change.

In essence there is one function, mbm_overflow_count() that needs to
know the counter width to handle rollovers. The static value
used within mbm_overflow_count() will be replaced with a value
discovered from the hardware. Support for learning the MBM counter
width from hardware is added in the change that follows.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/e36743b9800f16ce600f86b89127391f61261f23.1588715690.git.reinette.chatre@intel.com


# 26467b0f 28-Oct-2019 Xiaochen Shen <xiaochen.shen@intel.com>

x86/resctrl: Prevent NULL pointer dereference when reading mondata

When a mon group is being deleted, rdtgrp->flags is set to RDT_DELETED
in rdtgroup_rmdir_mon() firstly. The structure of rdtgrp will be freed
until rdtgrp->waitcount is dropped to 0 in rdtgroup_kn_unlock() later.

During the window of deleting a mon group, if an application calls
rdtgroup_mondata_show() to read mondata under this mon group,
'rdtgrp' returned from rdtgroup_kn_lock_live() is a NULL pointer when
rdtgrp->flags is RDT_DELETED. And then 'rdtgrp' is passed in this path:
rdtgroup_mondata_show() --> mon_event_read() --> mon_event_count().
Thus it results in NULL pointer dereference in mon_event_count().

Check 'rdtgrp' in rdtgroup_mondata_show(), and return -ENOENT
immediately when reading mondata during the window of deleting a mon
group.

Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: pei.p.jia@intel.com
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1572326702-27577-1-git-send-email-xiaochen.shen@intel.com


# 2025cf9e 29-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 47820e73 17-Apr-2019 Xiaochen Shen <xiaochen.shen@intel.com>

x86/resctrl: Initialize a new resource group with default MBA values

Currently, when a new resource group is created, the allocation values
of the MBA resource are not initialized and remain meaningless data.

For example:

mkdir /sys/fs/resctrl/p1
cat /sys/fs/resctrl/p1/schemata
MB:0=100;1=100

echo "MB:0=10;1=20" > /sys/fs/resctrl/p1/schemata
cat /sys/fs/resctrl/p1/schemata
MB:0= 10;1= 20

rmdir /sys/fs/resctrl/p1
mkdir /sys/fs/resctrl/p2
cat /sys/fs/resctrl/p2/schemata
MB:0= 10;1= 20

Therefore, when the new group is created, it is reasonable to initialize
MBA resource with default values.

Initialize the MBA resource and cache resources in separate functions.

[ bp: Add newlines between code blocks for better readability. ]

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: pei.p.jia@intel.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1555499329-1170-3-git-send-email-xiaochen.shen@intel.com


# 52eb7433 10-Dec-2018 Reinette Chatre <reinette.chatre@intel.com>

x86/resctrl: Fix rdt_find_domain() return value and checks

rdt_find_domain() returns an ERR_PTR() that is generated from a provided
domain id when the value is negative.

Care needs to be taken when creating an ERR_PTR() from this value
because a subsequent check using IS_ERR() expects the error to
be within the MAX_ERRNO range. Using an invalid domain id as an
ERR_PTR() does work at this time since this is currently always -1.
Using this undocumented assumption is fragile since future users of
rdt_find_domain() may not be aware of thus assumption.

Two related issues are addressed:

- Ensure that rdt_find_domain() always returns a valid error value by
forcing the error to be -ENODEV when a negative domain id is provided.

- In a few instances the return value of rdt_find_domain() is just
checked for NULL - fix these to include a check of ERR_PTR.

Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
Fixes: 521348b011d6 ("x86/intel_rdt: Introduce utility to obtain CDP peer")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: fenghua.yu@intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/b88cd4ff6a75995bf8db9b0ea546908fe50f69f3.1544479852.git.reinette.chatre@intel.com


# cb74635f 28-Nov-2018 Babu Moger <Babu.Moger@amd.com>

x86/resctrl: Remove unnecessary check for cbm_validate()

The Smatch static checker reports the following error after commit:

a36c5ff560fb ("x86/resctrl: Bring cbm_validate() into the resource structure"):

arch/x86/kernel/cpu/resctrl/ctrlmondata.c:227 parse_cbm()
error: uninitialized symbol 'cbm_val'.
arch/x86/kernel/cpu/resctrl/ctrlmondata.c:236 parse_cbm()
error: uninitialized symbol 'cbm_val'.

This could happen if ->cbm_validate() is NULL which could leave cbm_val
uninitialized.

However, there is no case where ->cbm_validate() can be NULL as it is
initialized based on a vendor check. So it is either an Intel or an AMD
version it points to. And in both the cases it is initialized properly.
Thus, remove the first check.

Verified the fix running Smatch.

[ bp: massage commit message. ]

Fixes: a36c5ff560fb ("x86/resctrl: Bring cbm_validate() into the resource structure")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181128224234.22998-1-babu.moger@amd.com


# 45682489 27-Nov-2018 Reinette Chatre <reinette.chatre@intel.com>

x86/resctrl: Use rdt_last_cmd_puts() where possible

The last_cmd_status sequence buffer contains user-visible messages
(accessed via /sys/fs/resctrl/info/last_cmd_status) that detail any
errors encountered while interacting with the resctrl filesystem.

rdt_last_cmd_printf() and rdt_last_cmd_puts() are the two calls
available to respectively print a string with format specifiers or a
simple one (which contains no format specifiers) to the last_cmd_status
buffer.

A few occurrences exist where rdt_last_cmd_printf() is used to print
a simple string. Doing so does not result in incorrect result or
incorrect behavior, but rdt_last_cmd_puts() is the function intended to
be used in these cases, as it is faster and it doesn't need to do the
vsnprintf() formatting.

Fix these occurrences to use rdt_last_cmd_puts() instead. While doing
so, fix two typos that were recently introduced into two of these simple
strings.

[ bp: massage commit message and correct typos. ]

Fixes: 723f1a0dd8e2 ("x86/resctrl: Fixup the user-visible strings")
Fixes: e0bdfe8e36f3 ("x86/intel_rdt: Support creation/removal of pseudo-locked region")
Fixes: 9ab9aa15c309 ("x86/intel_rdt: Ensure requested schemata respects mode")
Fixes: d48d7a57f718 ("x86/intel_rdt: Introduce resource group's mode resctrl file")
Fixes: dfe9674b04ff ("x86/intel_rdt: Enable entering of pseudo-locksetup mode")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: babu.moger@amd.com
Cc: jithu.joseph@intel.com
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/f48e46a016d6a5c79f13de8faeca382052189e2e.1543346009.git.reinette.chatre@intel.com


# 4d05bf71 21-Nov-2018 Babu Moger <Babu.Moger@amd.com>

x86/resctrl: Introduce AMD QOS feature

Enable QOS feature on AMD.

Following QoS sub-features are supported on AMD if the underlying
hardware supports it:

- L3 Cache allocation enforcement
- L3 Cache occupancy monitoring
- L3 Code-Data Prioritization support
- Memory Bandwidth Enforcement (Allocation)

The specification is available at:
https://developer.amd.com/wp-content/resources/56375.pdf

There are differences in the way some of the features are implemented.
Separate those functions and add those as vendor specific functions.

The major difference is in MBA feature:

- AMD uses CPUID leaf 0x80000020 to initialize the MBA features.
- AMD uses direct bandwidth value instead of delay based on bandwidth values.
- MSR register base addresses are different for MBA.
- AMD allows non-contiguous L3 cache bit masks.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: <qianyue.zj@alibaba-inc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Rian Hunter <rian@alum.mit.edu>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <xiaochen.shen@intel.com>
Link: https://lkml.kernel.org/r/20181121202811.4492-12-babu.moger@amd.com


# 723f1a0d 21-Nov-2018 Babu Moger <Babu.Moger@amd.com>

x86/resctrl: Fixup the user-visible strings

Fix the messages in rdt_last_cmd_printf() and rdt_last_cmd_puts() to
make them more meaningful and consistent.

[ bp: s/cpu/CPU/; s/mem\W/memory ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: <qianyue.zj@alibaba-inc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Rian Hunter <rian@alum.mit.edu>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <xiaochen.shen@intel.com>
Link: https://lkml.kernel.org/r/20181121202811.4492-11-babu.moger@amd.com


# a36c5ff5 21-Nov-2018 Babu Moger <Babu.Moger@amd.com>

x86/resctrl: Bring cbm_validate() into the resource structure

Bring all the functions that are different between the vendors into the
resource structure and initialize them dynamically. Add _intel suffix to
the Intel-specific functions.

cbm_validate() which does cache bitmask validation, differs between the
vendors as AMD allows non-contiguous masks. So, use separate functions
for Intel and AMD.

[ bp: Massage commit message and fixup rdt_resource members' vertical
alignment. ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: <qianyue.zj@alibaba-inc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Rian Hunter <rian@alum.mit.edu>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <xiaochen.shen@intel.com>
Link: https://lkml.kernel.org/r/20181121202811.4492-7-babu.moger@amd.com


# 1ad4fa41 21-Nov-2018 Babu Moger <Babu.Moger@amd.com>

x86/resctrl: Initialize the vendor-specific resource functions

Initialize the resource functions that are different between the
vendors. Some features are initialized differently between the vendors.
Add _intel suffix to Intel-specific functions.

For example, the MBA feature varies significantly between Intel and AMD.
Separate the initialization of these resource functions. That way we can
easily add AMD's functions later.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: <qianyue.zj@alibaba-inc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Rian Hunter <rian@alum.mit.edu>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <xiaochen.shen@intel.com>
Link: https://lkml.kernel.org/r/20181121202811.4492-6-babu.moger@amd.com


# fa7d9493 21-Nov-2018 Babu Moger <Babu.Moger@amd.com>

x86/resctrl: Rename and move rdt files to a separate directory

New generation of AMD processors add support for RDT (or QOS) features.
Together, these features will be called RESCTRL. With more than one
vendors supporting these features, it seems more appropriate to rename
these files.

Create a new directory with the name 'resctrl' and move all the
intel_rdt files to the new directory. This way all the resctrl related
code resides inside one directory.

[ bp: Add SPDX identifier to the Makefile ]

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: <qianyue.zj@alibaba-inc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Rian Hunter <rian@alum.mit.edu>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <xiaochen.shen@intel.com>
Link: https://lkml.kernel.org/r/20181121202811.4492-2-babu.moger@amd.com