#
be482ff9 |
|
11-Mar-2024 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/bhi: Enumerate Branch History Injection (BHI) bug Mitigation for BHI is selected based on the bug enumeration. Add bits needed to enumerate BHI bug. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
0f4a8376 |
|
13-Mar-2024 |
Daniel Sneddon <daniel.sneddon@linux.intel.com> |
x86/bhi: Define SPEC_CTRL_BHI_DIS_S Newer processors supports a hardware control BHI_DIS_S to mitigate Branch History Injection (BHI). Setting BHI_DIS_S protects the kernel from userspace BHI attacks without having to manually overwrite the branch history. Define MSR_SPEC_CTRL bit BHI_DIS_S and its enumeration CPUID.BHI_CTRL. Mitigation is enabled later. Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
8076fcde |
|
11-Mar-2024 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/rfds: Mitigate Register File Data Sampling (RFDS) RFDS is a CPU vulnerability that may allow userspace to infer kernel stale data previously used in floating point registers, vector registers and integer registers. RFDS only affects certain Intel Atom processors. Intel released a microcode update that uses VERW instruction to clear the affected CPU buffers. Unlike MDS, none of the affected cores support SMT. Add RFDS bug infrastructure and enable the VERW based mitigation by default, that clears the affected buffers just before exiting to userspace. Also add sysfs reporting and cmdline parameter "reg_file_data_sampling" to control the mitigation. For details see: Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
d7b69b59 |
|
19-Feb-2024 |
Borislav Petkov (AMD) <bp@alien8.de> |
x86/sev: Dump SEV_STATUS It is, and will be even more useful in the future, to dump the SEV features enabled according to SEV_STATUS. Do so: [ 0.542753] Memory Encryption Features active: AMD SEV SEV-ES SEV-SNP [ 0.544425] SEV: Status: SEV SEV-ES SEV-SNP DebugSwap Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Link: https://lore.kernel.org/r/20240219094216.GAZdMieDHKiI8aaP3n@fat_crate.local
|
#
cd6df3f3 |
|
05-Dec-2023 |
H. Peter Anvin (Intel) <hpa@zytor.com> |
x86/cpu: Add MSR numbers for FRED configuration Add MSR numbers for the FRED configuration registers per FRED spec 5.0. Originally-by: Megha Dey <megha.dey@intel.com> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Shan Kang <shan.kang@intel.com> Link: https://lore.kernel.org/r/20231205105030.8698-13-xin3.li@intel.com
|
#
216d106c |
|
25-Jan-2024 |
Brijesh Singh <brijesh.singh@amd.com> |
x86/sev: Add SEV-SNP host initialization support The memory integrity guarantees of SEV-SNP are enforced through a new structure called the Reverse Map Table (RMP). The RMP is a single data structure shared across the system that contains one entry for every 4K page of DRAM that may be used by SEV-SNP VMs. The APM Volume 2 section on Secure Nested Paging (SEV-SNP) details a number of steps needed to detect/enable SEV-SNP and RMP table support on the host: - Detect SEV-SNP support based on CPUID bit - Initialize the RMP table memory reported by the RMP base/end MSR registers and configure IOMMU to be compatible with RMP access restrictions - Set the MtrrFixDramModEn bit in SYSCFG MSR - Set the SecureNestedPagingEn and VMPLEn bits in the SYSCFG MSR - Configure IOMMU RMP table entry format is non-architectural and it can vary by processor. It is defined by the PPR document for each respective CPU family. Restrict SNP support to CPU models/families which are compatible with the current RMP table entry format to guard against any undefined behavior when running on other system types. Future models/support will handle this through an architectural mechanism to allow for broader compatibility. SNP host code depends on CONFIG_KVM_AMD_SEV config flag which may be enabled even when CONFIG_AMD_MEM_ENCRYPT isn't set, so update the SNP-specific IOMMU helpers used here to rely on CONFIG_KVM_AMD_SEV instead of CONFIG_AMD_MEM_ENCRYPT. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Co-developed-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Co-developed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Link: https://lore.kernel.org/r/20240126041126.1927228-5-michael.roth@amd.com
|
#
765a0542 |
|
08-Dec-2023 |
Kai Huang <kai.huang@intel.com> |
x86/virt/tdx: Detect TDX during kernel boot Intel Trust Domain Extensions (TDX) protects guest VMs from malicious host and certain physical attacks. A CPU-attested software module called 'the TDX module' runs inside a new isolated memory range as a trusted hypervisor to manage and run protected VMs. Pre-TDX Intel hardware has support for a memory encryption architecture called MKTME. The memory encryption hardware underpinning MKTME is also used for Intel TDX. TDX ends up "stealing" some of the physical address space from the MKTME architecture for crypto-protection to VMs. The BIOS is responsible for partitioning the "KeyID" space between legacy MKTME and TDX. The KeyIDs reserved for TDX are called 'TDX private KeyIDs' or 'TDX KeyIDs' for short. During machine boot, TDX microcode verifies that the BIOS programmed TDX private KeyIDs consistently and correctly programmed across all CPU packages. The MSRs are locked in this state after verification. This is why MSR_IA32_MKTME_KEYID_PARTITIONING gets used for TDX enumeration: it indicates not just that the hardware supports TDX, but that all the boot-time security checks passed. The TDX module is expected to be loaded by the BIOS when it enables TDX, but the kernel needs to properly initialize it before it can be used to create and run any TDX guests. The TDX module will be initialized by the KVM subsystem when KVM wants to use TDX. Detect platform TDX support by detecting TDX private KeyIDs. The TDX module itself requires one TDX KeyID as the 'TDX global KeyID' to protect its metadata. Each TDX guest also needs a TDX KeyID for its own protection. Just use the first TDX KeyID as the global KeyID and leave the rest for TDX guests. If no TDX KeyID is left for TDX guests, disable TDX as initializing the TDX module alone is useless. [ dhansen: add X86_FEATURE, replace helper function ] Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/all/20231208170740.53979-1-dave.hansen%40intel.com
|
#
33744916 |
|
25-Oct-2023 |
Kan Liang <kan.liang@linux.intel.com> |
perf/x86/intel: Support branch counters logging The branch counters logging (A.K.A LBR event logging) introduces a per-counter indication of precise event occurrences in LBRs. It can provide a means to attribute exposed retirement latency to combinations of events across a block of instructions. It also provides a means of attributing Timed LBR latencies to events. The feature is first introduced on SRF/GRR. It is an enhancement of the ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences of events on the GP counters. The information is displayed by the order of counters. The design proposed in this patch requires that the events which are logged must be in a group with the event that has LBR. If there are more than one LBR group, the counters logging information only from the current group (overflowed) are stored for the perf tool, otherwise the perf tool cannot know which and when other groups are scheduled especially when multiplexing is triggered. The user can ensure it uses the maximum number of counters that support LBR info (4 by now) by making the group large enough. The HW only logs events by the order of counters. The order may be different from the order of enabling which the perf tool can understand. When parsing the information of each branch entry, convert the counter order to the enabled order, and store the enabled order in the extension space. Unconditionally reset LBRs for an LBR event group when it's deleted. The logged counter information is only valid for the current LBR group. If another LBR group is scheduled later, the information from the stale LBRs would be otherwise wrongly interpreted. Add a sanity check in intel_pmu_hw_config(). Disable the feature if other counter filters (inv, cmask, edge, in_tx) are set or LBR call stack mode is enabled. (For the LBR call stack mode, we cannot simply flush the LBR, since it will break the call stack. Also, there is no obvious usage with the call stack mode for now.) Only applying the PERF_SAMPLE_BRANCH_COUNTERS doesn't require any branch stack setup. Expose the maximum number of supported counters and the width of the counters into the sysfs. The perf tool can use the information to parse the logged counters in each branch. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231025201626.3000228-5-kan.liang@linux.intel.com
|
#
2770d472 |
|
19-Oct-2023 |
Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
KVM: x86: Ignore MSR_AMD64_TW_CFG access Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED + STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP in the guest kernel). This is because Windows tries to set bit 8 in MSR_AMD64_TW_CFG and can't handle receiving a #GP when doing so. Give this MSR the same treatment that commit 2e32b7190641 ("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant only. Although apparently it was then needed for Linux guests, not Windows as in this case. With this change, the aforementioned guest setup is able to finish booting successfully. This issue can be reproduced either on a Summit Ridge Ryzen (with just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since EPYC is ordinarily stepping 2). Alternatively, userspace could solve the problem by using MSR filters, but forcing every userspace to define a filter isn't very friendly and doesn't add much, if any, value. The only potential hiccup is if one of these "baremetal-only" MSRs ever requires actual emulation and/or has F/M/S specific behavior. But if that happens, then KVM can still punt *that* handling to userspace since userspace MSR filters "win" over KVM's default handling. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1ce85d9c7c9e9632393816cf19c902e0a3f411f1.1697731406.git.maciej.szmigiero@oracle.com [sean: call out MSR filtering alternative] Signed-off-by: Sean Christopherson <seanjc@google.com>
|
#
deedec0a |
|
12-Oct-2023 |
Borislav Petkov <bp@alien8.de> |
x86/cpu: Fix the AMD Fam 17h, Fam 19h, Zen2 and Zen4 MSR enumerations The comments introduced in <asm/msr-index.h> in the merge conflict fixup in: 8f4156d58713 ("Merge branch 'x86/urgent' into perf/core, to resolve conflict") ... aren't right: AMD naming schemes are more complex than implied, family 0x17 is Zen1 and 2, family 0x19 is spread around Zen 3 and 4. So there's indeed four separate MSR namespaces for: MSR_F17H_ MSR_F19H_ MSR_ZEN2_ MSR_ZEN4_ ... and the namespaces cannot be merged. Fix it up. No change in functionality. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com
|
#
25e56847 |
|
04-Oct-2023 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/uncore: Add memory controller support Unified Memory Controller (UMC) events were introduced with Zen 4 as a part of the Performance Monitoring Version 2 (PerfMonV2) enhancements. An event is specified using the EventSelect bits and the RdWrMask bits can be used for additional filtering of read and write requests. As of now, a maximum of 12 channels of DDR5 are available on each socket and each channel is controlled by a dedicated UMC. Each UMC, in turn, has its own set of performance monitoring counters. Since the MSR address space for the UMC PERF_CTL and PERF_CTR registers are reused across sockets, uncore groups are created on the basis of socket IDs. Hence, group exclusivity is mandatory while opening events so that events for an UMC can only be opened on CPUs which are on the same socket as the corresponding memory channel. For each socket, the total number of available UMC counters and active memory channels are determined from CPUID leaf 0x80000022 EBX and ECX respectively. Usually, on Zen 4, each UMC has four counters. MSR assignments are determined on the basis of active UMCs. E.g. if UMCs 1, 4 and 9 are active for a given socket, then * UMC 1 gets MSRs 0xc0010800 to 0xc0010807 as PERF_CTLs and PERF_CTRs * UMC 4 gets MSRs 0xc0010808 to 0xc001080f as PERF_CTLs and PERF_CTRs * UMC 9 gets MSRs 0xc0010810 to 0xc0010817 as PERF_CTLs and PERF_CTRs If there are sockets without any online CPUs when the amd_uncore driver is loaded, UMCs for such sockets will not be discoverable since the mechanism relies on executing the CPUID instruction on an online CPU from the socket. Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/b25f391205c22733493abec1ed850b71784edc5f.1696425185.git.sandipan.das@amd.com
|
#
97a5e801 |
|
05-Oct-2023 |
Jithu Joseph <jithu.joseph@intel.com> |
platform/x86/intel/ifs: Store IFS generation number IFS generation number is reported via MSR_INTEGRITY_CAPS. As IFS support gets added to newer CPUs, some differences are expected during IFS image loading and test flows. Define MSR bitmasks to extract and store the generation in driver data, so that driver can modify its MSR interaction appropriately. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Link: https://lore.kernel.org/r/20231005195137.3117166-2-jithu.joseph@intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
#
7deda2ce |
|
21-Sep-2023 |
Paolo Bonzini <pbonzini@redhat.com> |
x86/cpu: Clear SVM feature if disabled by BIOS When SVM is disabled by BIOS, one cannot use KVM but the SVM feature is still shown in the output of /proc/cpuinfo. On Intel machines, VMX is cleared by init_ia32_feat_ctl(), so do the same on AMD and Hygon processors. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230921114940.957141-1-pbonzini@redhat.com
|
#
f454b18e |
|
06-Oct-2023 |
Borislav Petkov (AMD) <bp@alien8.de> |
x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs Fix erratum #1485 on Zen4 parts where running with STIBP disabled can cause an #UD exception. The performance impact of the fix is negligible. Reported-by: René Rebe <rene@exactcode.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: René Rebe <rene@exactcode.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com
|
#
1b5277c0 |
|
29-Jun-2023 |
Borislav Petkov (AMD) <bp@alien8.de> |
x86/srso: Add SRSO_NO support Add support for the CPUID flag which denotes that the CPU is not affected by SRSO. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
|
#
8974eb58 |
|
12-Jul-2023 |
Daniel Sneddon <daniel.sneddon@linux.intel.com> |
x86/speculation: Add Gather Data Sampling mitigation Gather Data Sampling (GDS) is a hardware vulnerability which allows unprivileged speculative access to data which was previously stored in vector registers. Intel processors that support AVX2 and AVX512 have gather instructions that fetch non-contiguous data elements from memory. On vulnerable hardware, when a gather instruction is transiently executed and encounters a fault, stale data from architectural or internal vector registers may get transiently stored to the destination vector register allowing an attacker to infer the stale data using typical side channel techniques like cache timing attacks. This mitigation is different from many earlier ones for two reasons. First, it is enabled by default and a bit must be set to *DISABLE* it. This is the opposite of normal mitigation polarity. This means GDS can be mitigated simply by updating microcode and leaving the new control bit alone. Second, GDS has a "lock" bit. This lock bit is there because the mitigation affects the hardware security features KeyLocker and SGX. It needs to be enabled and *STAY* enabled for these features to be mitigated against GDS. The mitigation is enabled in the microcode by default. Disable it by setting gather_data_sampling=off or by disabling all mitigations with mitigations=off. The mitigation status can be checked by reading: /sys/devices/system/cpu/vulnerabilities/gather_data_sampling Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
522b1d69 |
|
15-Jul-2023 |
Borislav Petkov (AMD) <bp@alien8.de> |
x86/cpu/amd: Add a Zenbleed fix Add a fix for the Zen2 VZEROUPPER data corruption bug where under certain circumstances executing VZEROUPPER can cause register corruption or leak data. The optimal fix is through microcode but in the case the proper microcode revision has not been applied, enable a fallback fix using a chicken bit. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
|
#
c68e3d47 |
|
21-Mar-2023 |
Jithu Joseph <jithu.joseph@intel.com> |
x86/include/asm/msr-index.h: Add IFS Array test bits Define MSR bitfields for enumerating support for Array BIST test. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230322003359.213046-5-jithu.joseph@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
#
e7862eda |
|
24-Jan-2023 |
Kim Phillips <kim.phillips@amd.com> |
x86/cpu: Support AMD Automatic IBRS The AMD Zen4 core supports a new feature called Automatic IBRS. It is a "set-and-forget" feature that means that, like Intel's Enhanced IBRS, h/w manages its IBRS mitigation resources automatically across CPL transitions. The feature is advertised by CPUID_Fn80000021_EAX bit 8 and is enabled by setting MSR C000_0080 (EFER) bit 21. Enable Automatic IBRS by default if the CPU feature is present. It typically provides greater performance over the incumbent generic retpolines mitigation. Reuse the SPECTRE_V2_EIBRS spectre_v2_mitigation enum. AMD Automatic IBRS and Intel Enhanced IBRS have similar enablement. Add NO_EIBRS_PBRSB to cpu_vuln_whitelist, since AMD Automatic IBRS isn't affected by PBRSB-eIBRS. The kernel command line option spectre_v2=eibrs is used to select AMD Automatic IBRS, if available. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20230124163319.2277355-8-kim.phillips@amd.com
|
#
dc2a3e85 |
|
13-Jan-2023 |
Babu Moger <babu.moger@amd.com> |
x86/resctrl: Add interface to read mbm_total_bytes_config The event configuration can be viewed by the user by reading the configuration file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config. The event configuration settings are domain specific and will affect all the CPUs in the domain. Following are the types of events supported: ==== =========================================================== Bits Description ==== =========================================================== 6 Dirty Victims from the QOS domain to all types of memory 5 Reads to slow memory in the non-local NUMA domain 4 Reads to slow memory in the local NUMA domain 3 Non-temporal writes to non-local NUMA domain 2 Non-temporal writes to local NUMA domain 1 Reads to memory in the non-local NUMA domain 0 Reads to memory in the local NUMA domain ==== =========================================================== By default, the mbm_total_bytes_config is set to 0x7f to count all the event types. For example: $cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config 0=0x7f;1=0x7f;2=0x7f;3=0x7f In this case, the event mbm_total_bytes is configured with 0x7f on domains 0 to 3. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20230113152039.770054-10-babu.moger@amd.com
|
#
5b6fac3f |
|
13-Jan-2023 |
Babu Moger <babu.moger@amd.com> |
x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation The QoS slow memory configuration details are available via CPUID_Fn80000020_EDX_x02. Detect the available details and initialize the rest to defaults. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20230113152039.770054-7-babu.moger@amd.com
|
#
0125acda |
|
28-Nov-2022 |
Breno Leitao <leitao@debian.org> |
x86/bugs: Reset speculation control settings on init Currently, x86_spec_ctrl_base is read at boot time and speculative bits are set if Kconfig items are enabled. For example, IBRS is enabled if CONFIG_CPU_IBRS_ENTRY is configured, etc. These MSR bits are not cleared if the mitigations are disabled. This is a problem when kexec-ing a kernel that has the mitigation disabled from a kernel that has the mitigation enabled. In this case, the MSR bits are not cleared during the new kernel boot. As a result, this might have some performance degradation that is hard to pinpoint. This problem does not happen if the machine is (hard) rebooted because the bit will be cleared by default. [ bp: Massage. ] Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20221128153148.1129350-1-leitao@debian.org
|
#
38aaf921 |
|
04-Jan-2023 |
Kan Liang <kan.liang@linux.intel.com> |
perf/x86: Add Meteor Lake support From PMU's perspective, Meteor Lake is similar to Alder Lake. Both are hybrid platforms, with e-core and p-core. The key differences include: - The e-core supports 2 PDIST GP counters (GP0 & GP1) - New MSRs for the Module Snoop Response Events on the e-core. - New Data Source fields are introduced for the e-core. - There are 8 GP counters for the e-core. - The load latency AUX event is not required for the p-core anymore. - Retire Latency (Support in a separate patch) for both cores. Since most of the code in the intel_pmu_init() should be the same as Alder Lake, to avoid code duplication, share the path with Alder Lake. Add new specific functions of extra_regs, and get_event_constraints to support the OCR events, Module Snoop Response Events and 2 PDIST GP counters on e-core. Add new MTL specific mem_attrs which drops the load latency AUX event. The Data Source field is extended to 4:0, which can contains max 32 sources. The Retire Latency is implemented with a separate patch. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230104201349.1451191-2-kan.liang@linux.intel.com
|
#
8c29f016 |
|
17-Jan-2023 |
Nikunj A Dadhania <nikunj@amd.com> |
x86/sev: Add SEV-SNP guest feature negotiation support The hypervisor can enable various new features (SEV_FEATURES[1:63]) and start a SNP guest. Some of these features need guest side implementation. If any of these features are enabled without it, the behavior of the SNP guest will be undefined. It may fail booting in a non-obvious way making it difficult to debug. Instead of allowing the guest to continue and have it fail randomly later, detect this early and fail gracefully. The SEV_STATUS MSR indicates features which the hypervisor has enabled. While booting, SNP guests should ascertain that all the enabled features have guest side implementation. In case a feature is not implemented in the guest, the guest terminates booting with GHCB protocol Non-Automatic Exit(NAE) termination request event, see "SEV-ES Guest-Hypervisor Communication Block Standardization" document (currently at https://developer.amd.com/wp-content/resources/56421.pdf), section "Termination Request". Populate SW_EXITINFO2 with mask of unsupported features that the hypervisor can easily report to the user. More details in the AMD64 APM Vol 2, Section "SEV_STATUS MSR". [ bp: - Massage. - Move snp_check_features() call to C code. Note: the CC:stable@ aspect here is to be able to protect older, stable kernels when running on newer hypervisors. Or not "running" but fail reliably and in a well-defined manner instead of randomly. ] Fixes: cbd3d4f7c4e5 ("x86/sev: Check SEV-SNP features support") Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20230118061943.534309-1-nikunj@amd.com
|
#
97fa21f6 |
|
06-Nov-2022 |
Borislav Petkov <bp@suse.de> |
x86/resctrl: Move MSR defines into msr-index.h msr-index.h should contain all MSRs for easier grepping for MSR numbers when dealing with unchecked MSR access warnings, for example. Move the resctrl ones. Prefix IA32_PQR_ASSOC with "MSR_" while at it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221106212923.20699-1-bp@alien8.de
|
#
7420ae3b |
|
27-Oct-2022 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
x86/intel_epb: Set Alder Lake N and Raptor Lake P normal EPB Intel processors support additional software hint called EPB ("Energy Performance Bias") to guide the hardware heuristic of power management features to favor increasing dynamic performance or conserve energy consumption. Since this EPB hint is processor specific, the same value of hint can result in different behavior across generations of processors. commit 4ecc933b7d1f ("x86: intel_epb: Allow model specific normal EPB value")' introduced capability to update the default power up EPB based on the CPU model and updated the default EPB to 7 for Alder Lake mobile CPUs. The same change is required for other Alder Lake-N and Raptor Lake-P mobile CPUs as the current default of 6 results in higher uncore power consumption. This increase in power is related to memory clock frequency setting based on the EPB value. Depending on the EPB the minimum memory frequency is set by the firmware. At EPB = 7, the minimum memory frequency is 1/4th compared to EPB = 6. This results in significant power saving for idle and semi-idle workload on a Chrome platform. For example Change in power and performance from EPB change from 6 to 7 on Alder Lake-N: Workload Performance diff (%) power diff ---------------------------------------------------- VP9 FHD30 0 (FPS) -218 mw Google meet 0 (FPS) -385 mw This 200+ mw power saving is very significant for mobile platform for battery life and thermal reasons. But as the workload demands more memory bandwidth, the memory frequency will be increased very fast. There is no power savings for such busy workloads. For example: Workload Performance diff (%) from EPB 6 to 7 ------------------------------------------------------- Speedometer 2.0 -0.8 WebGL Aquarium 10K Fish -0.5 Unity 3D 2018 0.2 WebXPRT3 -0.5 There are run to run variations for performance scores for such busy workloads. So the difference is not significant. Add a new define ENERGY_PERF_BIAS_NORMAL_POWERSAVE for EPB 7 and use it for Alder Lake-N and Raptor Lake-P mobile CPUs. This modification is done originally by Jeremy Compostella <jeremy.compostella@intel.com>. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/all/20221027220056.1534264-1-srinivas.pandruvada%40linux.intel.com
|
#
2632daeb |
|
13-Nov-2022 |
Borislav Petkov <bp@suse.de> |
x86/cpu: Restore AMD's DE_CFG MSR after resume DE_CFG contains the LFENCE serializing bit, restore it on resume too. This is relevant to older families due to the way how they do S3. Unify and correct naming while at it. Fixes: e4d0e84e4907 ("x86/cpu/AMD: Make LFENCE a serializing instruction") Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Reported-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
ca5b7c0d |
|
11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/lbr: Add LbrExtV2 branch record support If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected, enable it alongside LBR Freeze on PMI when an event requests branch stack i.e. PERF_SAMPLE_BRANCH_STACK. Each branch record is represented by a pair of registers, LBR From and LBR To. The freeze feature prevents any updates to these registers once a PMC overflows. The contents remain unchanged until the freeze bit is cleared by the PMI handler. The branch records are read and copied to sample data before unfreezing. However, only valid entries are copied. There is no additional register to denote which of the register pairs represent the top of the stack (TOS) since internal register renaming always ensures that the first pair (i.e. index 0) is the one representing the most recent branch and so on. The LBR registers are per-thread resources and are cleared explicitly whenever a new task is scheduled in. There are no special implications on the contents of these registers when transitioning to deep C-states. Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/d3b8500a3627a0d4d0259b005891ee248f248d91.1660211399.git.sandipan.das@amd.com
|
#
b8d1d163 |
|
16-Aug-2022 |
Daniel Sneddon <daniel.sneddon@linux.intel.com> |
x86/apic: Don't disable x2APIC if locked The APIC supports two modes, legacy APIC (or xAPIC), and Extended APIC (or x2APIC). X2APIC mode is mostly compatible with legacy APIC, but it disables the memory-mapped APIC interface in favor of one that uses MSRs. The APIC mode is controlled by the EXT bit in the APIC MSR. The MMIO/xAPIC interface has some problems, most notably the APIC LEAK [1]. This bug allows an attacker to use the APIC MMIO interface to extract data from the SGX enclave. Introduce support for a new feature that will allow the BIOS to lock the APIC in x2APIC mode. If the APIC is locked in x2APIC mode and the kernel tries to disable the APIC or revert to legacy APIC mode a GP fault will occur. Introduce support for a new MSR (IA32_XAPIC_DISABLE_STATUS) and handle the new locked mode when the LEGACY_XAPIC_DISABLED bit is set by preventing the kernel from trying to disable the x2APIC. On platforms with the IA32_XAPIC_DISABLE_STATUS MSR, if SGX or TDX are enabled the LEGACY_XAPIC_DISABLED will be set by the BIOS. If legacy APIC is required, then it SGX and TDX need to be disabled in the BIOS. [1]: https://aepicleak.com/aepicleak.pdf Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Link: https://lkml.kernel.org/r/20220816231943.1152579-1-daniel.sneddon@linux.intel.com
|
#
2b129932 |
|
02-Aug-2022 |
Daniel Sneddon <daniel.sneddon@linux.intel.com> |
x86/speculation: Add RSB VM Exit protections tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as documented for RET instructions after VM exits. Mitigate it with a new one-entry RSB stuffing mechanism and a new LFENCE. == Background == Indirect Branch Restricted Speculation (IBRS) was designed to help mitigate Branch Target Injection and Speculative Store Bypass, i.e. Spectre, attacks. IBRS prevents software run in less privileged modes from affecting branch prediction in more privileged modes. IBRS requires the MSR to be written on every privilege level change. To overcome some of the performance issues of IBRS, Enhanced IBRS was introduced. eIBRS is an "always on" IBRS, in other words, just turn it on once instead of writing the MSR on every privilege level change. When eIBRS is enabled, more privileged modes should be protected from less privileged modes, including protecting VMMs from guests. == Problem == Here's a simplification of how guests are run on Linux' KVM: void run_kvm_guest(void) { // Prepare to run guest VMRESUME(); // Clean up after guest runs } The execution flow for that would look something like this to the processor: 1. Host-side: call run_kvm_guest() 2. Host-side: VMRESUME 3. Guest runs, does "CALL guest_function" 4. VM exit, host runs again 5. Host might make some "cleanup" function calls 6. Host-side: RET from run_kvm_guest() Now, when back on the host, there are a couple of possible scenarios of post-guest activity the host needs to do before executing host code: * on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not touched and Linux has to do a 32-entry stuffing. * on eIBRS hardware, VM exit with IBRS enabled, or restoring the host IBRS=1 shortly after VM exit, has a documented side effect of flushing the RSB except in this PBRSB situation where the software needs to stuff the last RSB entry "by hand". IOW, with eIBRS supported, host RET instructions should no longer be influenced by guest behavior after the host retires a single CALL instruction. However, if the RET instructions are "unbalanced" with CALLs after a VM exit as is the RET in #6, it might speculatively use the address for the instruction after the CALL in #3 as an RSB prediction. This is a problem since the (untrusted) guest controls this address. Balanced CALL/RET instruction pairs such as in step #5 are not affected. == Solution == The PBRSB issue affects a wide variety of Intel processors which support eIBRS. But not all of them need mitigation. Today, X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e., eIBRS systems which enable legacy IBRS explicitly. However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT and most of them need a new mitigation. Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT. The lighter-weight mitigation performs a CALL instruction which is immediately followed by a speculative execution barrier (INT3). This steers speculative execution to the barrier -- just like a retpoline -- which ensures that speculation can never reach an unbalanced RET. Then, ensure this CALL is retired before continuing execution with an LFENCE. In other words, the window of exposure is opened at VM exit where RET behavior is troublesome. While the window is open, force RSB predictions sampling for RET targets to a dead end at the INT3. Close the window with the LFENCE. There is a subset of eIBRS systems which are not vulnerable to PBRSB. Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB. Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO. [ bp: Massage, incorporate review comments from Andy Cooper. ] Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
|
#
c59a1f10 |
|
11-Apr-2022 |
Like Xu <like.xu@linux.intel.com> |
KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the IA32_PEBS_ENABLE MSR exists and all architecturally enumerated fixed and general-purpose counters have corresponding bits in IA32_PEBS_ENABLE that enable generation of PEBS records. The general-purpose counter bits start at bit IA32_PEBS_ENABLE[0], and the fixed counter bits start at bit IA32_PEBS_ENABLE[32]. When guest PEBS is enabled, the IA32_PEBS_ENABLE MSR will be added to the perf_guest_switch_msr() and atomically switched during the VMX transitions just like CORE_PERF_GLOBAL_CTRL MSR. Based on whether the platform supports x86_pmu.pebs_ept, it has also refactored the way to add more msrs to arr[] in intel_guest_get_msrs() for extensibility. Originally-by: Andi Kleen <ak@linux.intel.com> Co-developed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Co-developed-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Message-Id: <20220411101946.20262-8-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
465932db |
|
19-Apr-2022 |
Robert Hoo <robert.hu@linux.intel.com> |
x86/cpu: Add new VMX feature, Tertiary VM-Execution control A new 64-bit control field "tertiary processor-based VM-execution controls", is defined [1]. It's controlled by bit 17 of the primary processor-based VM-execution controls. Different from its brother VM-execution fields, this tertiary VM- execution controls field is 64 bit. So it occupies 2 vmx_feature_leafs, TERTIARY_CTLS_LOW and TERTIARY_CTLS_HIGH. Its companion VMX capability reporting MSR,MSR_IA32_VMX_PROCBASED_CTLS3 (0x492), is also semantically different from its brothers, whose 64 bits consist of all allow-1, rather than 32-bit allow-0 and 32-bit allow-1 [1][2]. Therefore, its init_vmx_capabilities() is a little different from others. [1] ISE 6.2 "VMCS Changes" https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html [2] SDM Vol3. Appendix A.3 Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Message-Id: <20220419153240.11549-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
4af184ee |
|
31-May-2022 |
Len Brown <len.brown@intel.com> |
tools/power turbostat: dump secondary Turbo-Ratio-Limit Intel Performance Hybrid processors have a 2nd MSR describing the turbo limits enforced on the Ecores. Note, TRL and Secondary-TRL are usually R/O information, but on overclock-capable parts, they can be written. Signed-off-by: Len Brown <len.brown@intel.com>
|
#
4ad3278d |
|
08-Jul-2022 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/speculation: Disable RRSBA behavior Some Intel processors may use alternate predictors for RETs on RSB-underflow. This condition may be vulnerable to Branch History Injection (BHI) and intramode-BTI. Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines, eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against such attacks. However, on RSB-underflow, RET target prediction may fallback to alternate predictors. As a result, RET's predicted target may get influenced by branch history. A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback behavior when in kernel mode. When set, RETs will not take predictions from alternate predictors, hence mitigating RETs as well. Support for this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2). For spectre v2 mitigation, when a user selects a mitigation that protects indirect CALLs and JMPs against BHI and intramode-BTI, set RRSBA_DIS_S also to protect RETs for RSB-underflow case. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
|
#
d7caac99 |
|
14-Jun-2022 |
Peter Zijlstra <peterz@infradead.org> |
x86/cpu/amd: Add Spectral Chicken Zen2 uarchs have an undocumented, unnamed, MSR that contains a chicken bit for some speculation behaviour. It needs setting. Note: very belatedly AMD released naming; it's now officially called MSR_AMD64_DE_CFG2 and MSR_AMD64_DE_CFG2_SUPPRESS_NOBR_PRED_BIT but shall remain the SPECTRAL CHICKEN. Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
|
#
6ad0ad2b |
|
24-Jun-2022 |
Peter Zijlstra <peterz@infradead.org> |
x86/bugs: Report Intel retbleed vulnerability Skylake suffers from RSB underflow speculation issues; report this vulnerability and it's mitigation (spectre_v2=ibrs). [jpoimboe: cleanups, eibrs] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
|
#
027bbb88 |
|
19-May-2022 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
KVM: x86/speculation: Disable Fill buffer clear within guests The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an accurate indicator on all CPUs of whether the VERW instruction will overwrite fill buffers. FB_CLEAR enumeration in IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not vulnerable to MDS/TAA, indicating that microcode does overwrite fill buffers. Guests running in VMM environments may not be aware of all the capabilities/vulnerabilities of the host CPU. Specifically, a guest may apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable to MDS/TAA even when the physical CPU is not. On CPUs that enumerate FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS during VMENTER and resetting on VMEXIT. For guests that enumerate FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM will not use FB_CLEAR_DIS. Irrespective of guest state, host overwrites CPU buffers before VMENTER to protect itself from an MMIO capable guest, as part of mitigation for MMIO Stale Data vulnerabilities. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
|
#
51802186 |
|
19-May-2022 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug Processor MMIO Stale Data is a class of vulnerabilities that may expose data after an MMIO operation. For more details please refer to Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst Add the Processor MMIO Stale Data bug enumeration. A microcode update adds new bits to the MSR IA32_ARCH_CAPABILITIES, define them. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
|
#
089be16d |
|
20-Apr-2022 |
Sandipan Das <sandipan.das@amd.com> |
x86/msr: Add PerfCntrGlobal* registers Add MSR definitions that will be used to enable the new AMD Performance Monitoring Version 2 (PerfMonV2) features. These include: * Performance Counter Global Control (PerfCntrGlobalCtl) * Performance Counter Global Status (PerfCntrGlobalStatus) * Performance Counter Global Status Clear (PerfCntrGlobalStatusClr) The new Performance Counter Global Control and Status MSRs provide an interface for enabling or disabling multiple counters at the same time and for testing overflow without probing the individual registers for each PMC. The availability of these registers is indicated through the PerfMonV2 feature bit of CPUID leaf 0x80000022 EAX. Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/cdc0d8f75bd519848731b5c64d924f5a0619a573.1650515382.git.sandipan.das@amd.com
|
#
f52ba931 |
|
20-Aug-2021 |
Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> |
tools/power turbostat: Add Power Limit4 support Add Power Limit4 support. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
|
#
ada54345 |
|
22-Mar-2022 |
Stephane Eranian <eranian@google.com> |
perf/x86/amd: Add AMD Fam19h Branch Sampling support Add support for the AMD Fam19h 16-deep branch sampling feature as described in the AMD PPR Fam19h Model 01h Revision B1. This is a model specific extension. It is not an architected AMD feature. The Branch Sampling (BRS) operates with a 16-deep saturating buffer in MSR registers. There is no branch type filtering. All control flow changes are captured. BRS relies on specific programming of the core PMU of Fam19h. In particular, the following requirements must be met: - the sampling period be greater than 16 (BRS depth) - the sampling period must use a fixed and not frequency mode BRS interacts with the NMI interrupt as well. Because enabling BRS is expensive, it is only activated after P event occurrences, where P is the desired sampling period. At P occurrences of the event, the counter overflows, the CPU catches the interrupt, activates BRS for 16 branches until it saturates, and then delivers the NMI to the kernel. Between the overflow and the time BRS activates more branches may be executed skewing the period. All along, the sampling event keeps counting. The skid may be attenuated by reducing the sampling period by 16 (subsequent patch). BRS is integrated into perf_events seamlessly via the same PERF_RECORD_BRANCH_STACK sample format. BRS generates perf_branch_entry records in the sampling buffer. No prediction information is supported. The branches are stored in reverse order of execution. The most recent branch is the first entry in each record. No modification to the perf tool is necessary. BRS can be used with any sampling event. However, it is recommended to use the RETIRED_BRANCH_INSTRUCTIONS event because it matches what the BRS captures. $ perf record -b -c 1000037 -e cpu/event=0xc2,name=ret_br_instructions/ test $ perf report -D 56531696056126 0x193c000 [0x1a8]: PERF_RECORD_SAMPLE(IP, 0x2): 18122/18230: 0x401d24 period: 1000037 addr: 0 ... branch stack: nr:16 ..... 0: 0000000000401d24 -> 0000000000401d5a 0 cycles 0 ..... 1: 0000000000401d5c -> 0000000000401d24 0 cycles 0 ..... 2: 0000000000401d22 -> 0000000000401d5c 0 cycles 0 ..... 3: 0000000000401d5e -> 0000000000401d22 0 cycles 0 ..... 4: 0000000000401d20 -> 0000000000401d5e 0 cycles 0 ..... 5: 0000000000401d3e -> 0000000000401d20 0 cycles 0 ..... 6: 0000000000401d42 -> 0000000000401d3e 0 cycles 0 ..... 7: 0000000000401d3c -> 0000000000401d42 0 cycles 0 ..... 8: 0000000000401d44 -> 0000000000401d3c 0 cycles 0 ..... 9: 0000000000401d3a -> 0000000000401d44 0 cycles 0 ..... 10: 0000000000401d46 -> 0000000000401d3a 0 cycles 0 ..... 11: 0000000000401d38 -> 0000000000401d46 0 cycles 0 ..... 12: 0000000000401d48 -> 0000000000401d38 0 cycles 0 ..... 13: 0000000000401d36 -> 0000000000401d48 0 cycles 0 ..... 14: 0000000000401d4a -> 0000000000401d36 0 cycles 0 ..... 15: 0000000000401d34 -> 0000000000401d4a 0 cycles 0 ... thread: test:18230 ...... dso: test Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220322221517.2510440-4-eranian@google.com
|
#
db1af129 |
|
06-May-2022 |
Tony Luck <tony.luck@intel.com> |
x86/msr-index: Define INTEGRITY_CAPABILITIES MSR The INTEGRITY_CAPABILITIES MSR is enumerated by bit 2 of the CORE_CAPABILITIES MSR. Add defines for the CORE_CAPS enumeration as well as for the integrity MSR. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220506225410.1652287-3-tony.luck@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
#
f742b90e |
|
24-Feb-2022 |
Brijesh Singh <brijesh.singh@amd.com> |
x86/mm: Extend cc_attr to include AMD SEV-SNP The CC_ATTR_GUEST_SEV_SNP can be used by the guest to query whether the SNP (Secure Nested Paging) feature is active. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220307213356.2797205-10-brijesh.singh@amd.com
|
#
400331f8 |
|
10-Mar-2022 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/tsx: Disable TSX development mode at boot A microcode update on some Intel processors causes all TSX transactions to always abort by default[*]. Microcode also added functionality to re-enable TSX for development purposes. With this microcode loaded, if tsx=on was passed on the cmdline, and TSX development mode was already enabled before the kernel boot, it may make the system vulnerable to TSX Asynchronous Abort (TAA). To be on safer side, unconditionally disable TSX development mode during boot. If a viable use case appears, this can be revisited later. [*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557 [ bp: Drop unstable web link, massage heavily. ] Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com
|
#
991625f3 |
|
08-Mar-2022 |
Peter Zijlstra <peterz@infradead.org> |
x86/ibt: Add IBT feature, MSR and #CP handling The bits required to make the hardware go.. Of note is that, provided the syscall entry points are covered with ENDBR, #CP doesn't need to be an IST because we'll never hit the syscall gap. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.582331711@infradead.org
|
#
161a9a33 |
|
25-Jan-2022 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf/x86/intel/pt: Add a capability and config bit for disabling TNTs As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new Intel PT feature called TNT-Disable which is enabled config bit 55. TNT-Disable disables Taken-Not-Taken packets to reduce the tracing overhead, but with the result that exact control flow information is lost. Add a capability and config bit for TNT-Disable. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20220126104815.2807416-3-adrian.hunter@intel.com
|
#
28c24ded |
|
25-Jan-2022 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf/x86/intel/pt: Add a capability and config bit for event tracing As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new Intel PT feature called Event Trace which is enabled config bit 31. Event Trace exposes details about asynchronous events such as interrupts and VM-Entry/Exit. Add a capability and config bit for Event Trace. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20220126104815.2807416-2-adrian.hunter@intel.com
|
#
7b8f40b3 |
|
27-Jan-2022 |
Ricardo Neri <ricardo.neri-calderon@linux.intel.com> |
x86/cpu: Add definitions for the Intel Hardware Feedback Interface Add the CPUID feature bit and the model-specific registers needed to identify and configure the Intel Hardware Feedback Interface. Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
39150352 |
|
07-Feb-2022 |
Maxim Levitsky <mlevitsk@redhat.com> |
KVM: x86: SVM: move avic definitions from AMD's spec to svm.h asm/svm.h is the correct place for all values that are defined in the SVM spec, and that includes AVIC. Also add some values from the spec that were not defined before and will be soon useful. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220207155447.840194-10-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
89aa94b4 |
|
23-Dec-2021 |
Huang Rui <ray.huang@amd.com> |
x86/msr: Add AMD CPPC MSR definitions AMD CPPC (Collaborative Processor Performance Control) function uses MSR registers to manage the performance hints. So add the MSR register macro here. Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
dae1bd58 |
|
21-Oct-2021 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/msr-index: Add MSRs for XFD XFD introduces two MSRs: - IA32_XFD to enable/disable a feature controlled by XFD - IA32_XFD_ERR to expose to the #NM trap handler which feature was tried to be used for the first time. Both use the same xstate-component bitmap format, used by XCR0. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-14-chang.seok.bae@intel.com
|
#
1348924b |
|
14-Jun-2021 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/msr: Define new bits in TSX_FORCE_ABORT MSR Intel client processors that support the IA32_TSX_FORCE_ABORT MSR related to perf counter interaction [1] received a microcode update that deprecates the Transactional Synchronization Extension (TSX) feature. The bit FORCE_ABORT_RTM now defaults to 1, writes to this bit are ignored. A new bit TSX_CPUID_CLEAR clears the TSX related CPUID bits. The summary of changes to the IA32_TSX_FORCE_ABORT MSR are: Bit 0: FORCE_ABORT_RTM (legacy bit, new default=1) Status bit that indicates if RTM transactions are always aborted. This bit is essentially !SDV_ENABLE_RTM(Bit 2). Writes to this bit are ignored. Bit 1: TSX_CPUID_CLEAR (new bit, default=0) When set, CPUID.HLE = 0 and CPUID.RTM = 0. Bit 2: SDV_ENABLE_RTM (new bit, default=0) When clear, XBEGIN will always abort with EAX code 0. When set, XBEGIN will not be forced to abort (but will always abort in SGX enclaves). This bit is intended to be used on developer systems. If this bit is set, transactional atomicity correctness is not certain. SDV = Software Development Vehicle (SDV), i.e. developer systems. Performance monitoring counter 3 is usable in all cases, regardless of the value of above bits. Add support for a new CPUID bit - CPUID.RTM_ALWAYS_ABORT (CPUID 7.EDX[11]) - to indicate the status of always abort behavior. [1] [ bp: Look for document ID 604224, "Performance Monitoring Impact of Intel Transactional Synchronization Extension Memory". Since there's no way for us to have stable links to documents... ] [ bp: Massage and extend commit message. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Link: https://lkml.kernel.org/r/9add61915b4a4eedad74fbd869107863a28b428e.1623704845.git-series.pawan.kumar.gupta@linux.intel.com
|
#
059e5c32 |
|
27-Apr-2021 |
Brijesh Singh <brijesh.singh@amd.com> |
x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG The SYSCFG MSR continued being updated beyond the K8 family; drop the K8 name from it. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20210427111636.1207-4-brijesh.singh@amd.com
|
#
d0946a88 |
|
12-Apr-2021 |
Kan Liang <kan.liang@linux.intel.com> |
perf/x86/intel: Hybrid PMU support for perf capabilities Some platforms, e.g. Alder Lake, have hybrid architecture. Although most PMU capabilities are the same, there are still some unique PMU capabilities for different hybrid PMUs. Perf should register a dedicated pmu for each hybrid PMU. Add a new struct x86_hybrid_pmu, which saves the dedicated pmu and capabilities for each hybrid PMU. The architecture MSR, MSR_IA32_PERF_CAPABILITIES, only indicates the architecture features which are available on all hybrid PMUs. The architecture features are stored in the global x86_pmu.intel_cap. For Alder Lake, the model-specific features are perf metrics and PEBS-via-PT. The corresponding bits of the global x86_pmu.intel_cap should be 0 for these two features. Perf should not use the global intel_cap to check the features on a hybrid system. Add a dedicated intel_cap in the x86_hybrid_pmu to store the model-specific capabilities. Use the dedicated intel_cap to replace the global intel_cap for thse two features. The dedicated intel_cap will be set in the following "Add Alder Lake Hybrid support" patch. Add is_hybrid() to distinguish a hybrid system. ADL may have an alternative configuration. With that configuration, the X86_FEATURE_HYBRID_CPU is not set. Perf cannot rely on the feature bit. Add a new static_key_false, perf_is_hybrid, to indicate a hybrid system. It will be assigned in the following "Add Alder Lake Hybrid support" patch as well. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-5-git-send-email-kan.liang@linux.intel.com
|
#
ebb1064e |
|
22-Mar-2021 |
Fenghua Yu <fenghua.yu@intel.com> |
x86/traps: Handle #DB for bus lock Bus locks degrade performance for the whole system, not just for the CPU that requested the bus lock. Two CPU features "#AC for split lock" and "#DB for bus lock" provide hooks so that the operating system may choose one of several mitigation strategies. #AC for split lock is already implemented. Add code to use the #DB for bus lock feature to cover additional situations with new options to mitigate. split_lock_detect= #AC for split lock #DB for bus lock off Do nothing Do nothing warn Kernel OOPs Warn once per task and Warn once per task and and continues to run. disable future checking When both features are supported, warn in #AC fatal Kernel OOPs Send SIGBUS to user. Send SIGBUS to user When both features are supported, fatal in #AC ratelimit:N Do nothing Limit bus lock rate to N per second in the current non-root user. Default option is "warn". Hardware only generates #DB for bus lock detect when CPL>0 to avoid nested #DB from multiple bus locks while the first #DB is being handled. So no need to handle #DB for bus lock detected in the kernel. #DB for bus lock is enabled by bus lock detection bit 2 in DEBUGCTL MSR while #AC for split lock is enabled by split lock detection bit 29 in TEST_CTRL MSR. Both breakpoint and bus lock in the same instruction can trigger one #DB. The bus lock is handled before the breakpoint in the #DB handler. Delivery of #DB for bus lock in userspace clears DR6[11], which is set by the #DB handler right after reading DR6. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210322135325.682257-3-fenghua.yu@intel.com
|
#
09141ec0 |
|
05-Mar-2020 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86: Remove duplicate TSC DEADLINE MSR definitions There are two definitions for the TSC deadline MSR in msr-index.h, one with an underscore and one without. Axe one of them and move all the references over to the other one. [ bp: Fixup the MSR define in handle_fastpath_set_msr_irqoff() too. ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200305174706.0D6B8EE4@viggo.jf.intel.com
|
#
69372cf0 |
|
10-Dec-2020 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/cpu: Add VM page flush MSR availablility as a CPUID feature On systems that do not have hardware enforced cache coherency between encrypted and unencrypted mappings of the same physical page, the hypervisor can use the VM page flush MSR (0xc001011e) to flush the cache contents of an SEV guest page. When a small number of pages are being flushed, this can be used in place of issuing a WBINVD across all CPUs. CPUID 0x8000001f_eax[2] is used to determine if the VM page flush MSR is available. Add a CPUID feature to indicate it is supported and define the MSR. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <f1966379e31f9b208db5257509c4a089a87d33d0.1607620209.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
d205e0f1 |
|
12-Nov-2020 |
Sean Christopherson <seanjc@google.com> |
x86/{cpufeatures,msr}: Add Intel SGX Launch Control hardware bits The SGX Launch Control hardware helps restrict which enclaves the hardware will run. Launch control is intended to restrict what software can run with enclave protections, which helps protect the overall system from bad enclaves. For the kernel's purposes, there are effectively two modes in which the launch control hardware can operate: rigid and flexible. In its rigid mode, an entity other than the kernel has ultimate authority over which enclaves can be run (firmware, Intel, etc...). In its flexible mode, the kernel has ultimate authority over which enclaves can run. Enable X86_FEATURE_SGX_LC to enumerate when the CPU supports SGX Launch Control in general. Add MSR_IA32_SGXLEPUBKEYHASH{0, 1, 2, 3}, which when combined contain a SHA256 hash of a 3072-bit RSA public key. The hardware allows SGX enclaves signed with this public key to initialize and run [*]. Enclaves not signed with this key can not initialize and run. Add FEAT_CTL_SGX_LC_ENABLED, which informs whether the SGXLEPUBKEYHASH MSRs can be written by the kernel. If the MSRs do not exist or are read-only, the launch control hardware is operating in rigid mode. Linux does not and will not support creating enclaves when hardware is configured in rigid mode because it takes away the authority for launch decisions from the kernel. Note, this does not preclude KVM from virtualizing/exposing SGX to a KVM guest when launch control hardware is operating in rigid mode. [*] Intel SDM: 38.1.4 Intel SGX Launch Control Configuration Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jethro Beekman <jethro@fortanix.com> Link: https://lkml.kernel.org/r/20201112220135.165028-5-jarkko@kernel.org
|
#
e7b6385b |
|
12-Nov-2020 |
Sean Christopherson <seanjc@google.com> |
x86/cpufeatures: Add Intel SGX hardware bits Populate X86_FEATURE_SGX feature from CPUID and tie it to the Kconfig option with disabled-features.h. IA32_FEATURE_CONTROL.SGX_ENABLE must be examined in addition to the CPUID bits to enable full SGX support. The BIOS must both set this bit and lock IA32_FEATURE_CONTROL for SGX to be supported (Intel SDM section 36.7.1). The setting or clearing of this bit has no impact on the CPUID bits above, which is why it needs to be detected separately. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jethro Beekman <jethro@fortanix.com> Link: https://lkml.kernel.org/r/20201112220135.165028-4-jarkko@kernel.org
|
#
43756a29 |
|
27-Oct-2020 |
Victor Ding <victording@google.com> |
powercap: Add AMD Fam17h RAPL support Enable AMD Fam17h RAPL support for the power capping framework. The support is as per AMD Fam17h Model31h (Zen2) and model 00-ffh (Zen1) PPR. Tested by comparing the results of following two sysfs entries and the values directly read from corresponding MSRs via /dev/cpu/[x]/msr: /sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj /sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:0/energy_uj Signed-off-by: Victor Ding <victording@google.com> Acked-by: Kim Phillips <kim.phillips@amd.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
298ed2b3 |
|
27-Oct-2020 |
Victor Ding <victording@google.com> |
x86/msr-index: sort AMD RAPL MSRs by address MSRs in the rest of this file are sorted by their addresses; fixing the two outliers. No functional changes. Signed-off-by: Victor Ding <victording@google.com> Acked-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
68299a42 |
|
30-Oct-2020 |
Tony Luck <tony.luck@intel.com> |
x86/mce: Enable additional error logging on certain Intel CPUs The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an optional additional error logging mode which is enabled by an MSR. Previously, this mode was enabled from the mcelog(8) tool via /dev/cpu, but userspace should not be poking at MSRs. So move the enabling into the kernel. [ bp: Correct the explanation why this is done. ] Suggested-by: Boris Petkov <bp@alien8.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201030190807.GA13884@agluck-desk2.amr.corp.intel.com
|
#
f0f2f9fe |
|
15-Sep-2020 |
Fenghua Yu <fenghua.yu@intel.com> |
x86/msr-index: Define an IA32_PASID MSR The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier (PASID), a 20-bit value. Bit 31 must be set to indicate the value programmed in the MSR is valid. Hardware uses the PASID to identify a process address space and direct responses to the right address space. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-7-git-send-email-fenghua.yu@intel.com
|
#
36e1be8a |
|
08-Sep-2020 |
Kim Phillips <kim.phillips@amd.com> |
perf/x86/amd/ibs: Fix raw sample data accumulation Neither IbsBrTarget nor OPDATA4 are populated in IBS Fetch mode. Don't accumulate them into raw sample user data in that case. Also, in Fetch mode, add saving the IBS Fetch Control Extended MSR. Technically, there is an ABI change here with respect to the IBS raw sample data format, but I don't see any perf driver version information being included in perf.data file headers, but, existing users can detect whether the size of the sample record has reduced by 8 bytes to determine whether the IBS driver has this fix. Fixes: 904cb3677f3a ("perf/x86/amd/ibs: Update IBS MSRs and feature definitions") Reported-by: Stephane Eranian <stephane.eranian@google.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200908214740.18097-6-kim.phillips@amd.com
|
#
b57de6cd |
|
07-Sep-2020 |
Joerg Roedel <jroedel@suse.de> |
x86/sev-es: Add SEV-ES Feature Detection Add a sev_es_active() function for checking whether SEV-ES is enabled. Also cache the value of MSR_AMD64_SEV at boot to speed up the feature checking in the running code. [ bp: Remove "!!" in sev_active() too. ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-37-joro@8bytes.org
|
#
29dcc60f |
|
07-Sep-2020 |
Joerg Roedel <jroedel@suse.de> |
x86/boot/compressed/64: Add stage1 #VC handler Add the first handler for #VC exceptions. At stage 1 there is no GHCB yet because the kernel might still be running on the EFI page table. The stage 1 handler is limited to the MSR-based protocol to talk to the hypervisor and can only support CPUID exit-codes, but that is enough to get to stage 2. [ bp: Zap superfluous newlines after rd/wrmsr instruction mnemonics. ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-20-joro@8bytes.org
|
#
59a854e2 |
|
23-Jul-2020 |
Kan Liang <kan.liang@linux.intel.com> |
perf/x86/intel: Support TopDown metrics on Ice Lake Ice Lake supports the hardware TopDown metrics feature, which can free up the scarce GP counters. Update the event constraints for the metrics events. The metric counters do not exist, which are mapped to a dummy offset. The sharing between multiple users of the same metric without multiplexing is not allowed. Implement set_topdown_event_period for Ice Lake. The values in PERF_METRICS MSR are derived from the fixed counter 3. Both registers should start from zero. Implement update_topdown_event for Ice Lake. The metric is reported by multiplying the metric (fraction) with slots. To maintain accurate measurements, both registers are cleared for each update. The fixed counter 3 should always be cleared before the PERF_METRICS. Implement td_attr for the new metrics events and the new slots fixed counter. Make them visible to the perf user tools. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200723171117.9918-11-kan.liang@linux.intel.com
|
#
7b2c05a1 |
|
23-Jul-2020 |
Kan Liang <kan.liang@linux.intel.com> |
perf/x86/intel: Generic support for hardware TopDown metrics Intro ===== The TopDown Microarchitecture Analysis (TMA) Method is a structured analysis methodology to identify critical performance bottlenecks in out-of-order processors. Current perf has supported the method. The method works well, but there is one problem. To collect the TopDown events, several GP counters have to be used. If a user wants to collect other events at the same time, the multiplexing probably be triggered, which impacts the accuracy. To free up the scarce GP counters, the hardware TopDown metrics feature is introduced from Ice Lake. The hardware implements an additional "metrics" register and a new Fixed Counter 3 that measures pipeline "slots". The TopDown events can be calculated from them instead. Events ====== The level 1 TopDown has four metrics. There is no event-code assigned to the TopDown metrics. Four metric events are exported as separate perf events, which map to the internal "metrics" counter register. Those events do not exist in hardware, but can be allocated by the scheduler. For the event mapping, a special 0x00 event code is used, which is reserved for fake events. The metric events start from umask 0x10. When setting up the metric events, they point to the Fixed Counter 3. They have to be specially handled. - Add the update_topdown_event() callback to read the additional metrics MSR and generate the metrics. - Add the set_topdown_event_period() callback to initialize metrics MSR and the fixed counter 3. - Add a variable n_metric_event to track the number of the accepted metrics events. The sharing between multiple users of the same metric without multiplexing is not allowed. - Only enable/disable the fixed counter 3 when there are no other active TopDown events, which avoid the unnecessary writing of the fixed control register. - Disable the PMU when reading the metrics event. The metrics MSR and the fixed counter 3 are read separately. The values may be modified by an NMI. All four metric events don't support sampling. Since they will be handled specially for event update, a flag PERF_X86_EVENT_TOPDOWN is introduced to indicate this case. The slots event can support both sampling and counting. For counting, the flag is also applied. For sampling, it will be handled normally as other normal events. Groups ====== The slots event is required in a Topdown group. To avoid reading the METRICS register multiple times, the metrics and slots value can only be updated by slots event in a group. All active slots and metrics events will be updated one time. Therefore, the slots event must be before any metric events in a Topdown group. NMI ====== The METRICS related register may be overflow. The bit 48 of the STATUS register will be set. If so, PERF_METRICS and Fixed counter 3 are required to be reset. The patch also update all active slots and metrics events in the NMI handler. The update_topdown_event() has to read two registers separately. The values may be modified by an NMI. PMU has to be disabled before calling the function. RDPMC ====== RDPMC is temporarily disabled. A later patch will enable it. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200723171117.9918-9-kan.liang@linux.intel.com
|
#
d6a162a4 |
|
03-Jul-2020 |
Kan Liang <kan.liang@linux.intel.com> |
x86/msr-index: Add bunch of MSRs for Arch LBR Add Arch LBR related MSRs and the new LBR INFO bits in MSR-index. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-8-git-send-email-kan.liang@linux.intel.com
|
#
ed7bde7a |
|
26-Jun-2020 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
cpufreq: intel_pstate: Allow enable/disable energy efficiency By default intel_pstate the driver disables energy efficiency by setting MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode. This CPU model is also shared by Coffee Lake desktop CPUs. This allows these systems to reach maximum possible frequency. But this adds power penalty, which some customers don't want. They want some way to enable/ disable dynamically. So, add an additional attribute "energy_efficiency" under /sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows to read and write bit 19 ("Disable Energy Efficiency Optimization") in the MSR IA32_POWER_CTL. This attribute is present in both HWP and non-HWP mode as this has an effect in both modes. Refer to Intel Software Developer's manual for details. The scope of this bit is package wide. Also these systems are single package systems. So read/write MSR on the current CPU is enough. The energy efficiency (EE) bit setting needs to be preserved during suspend/resume and CPU offline/online operation. To do this: - Restoring the EE setting from the cpufreq resume() callback, if there is change from the system default. - By default, don't disable EE from cpufreq init() callback for matching CPU models. Since the scope is package wide and is a single package system, move the disable EE calls from init() callback to intel_pstate_init() function, which is called only once. Suggested-by: Len Brown <lenb@kernel.org> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
99e40204 |
|
20-Jun-2020 |
Borislav Petkov <bp@suse.de> |
x86/msr: Move the F15h MSRs where they belong 1068ed4547ad ("x86/msr: Lift AMD family 0x15 power-specific MSRs") moved the three F15h power MSRs to the architectural list but that was wrong as they belong in the family 0x15 list. That also caused: In file included from trace/beauty/tracepoints/x86_msr.c:10: perf/trace/beauty/generated/x86_arch_MSRs_array.c:292:45: error: initialized field overwritten [-Werror=override-init] 292 | [0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC", | ^~~~~~~~~~~ perf/trace/beauty/generated/x86_arch_MSRs_array.c:292:45: note: (near initialization for 'x86_AMD_V_KVM_MSRs[640]') due to MSR_F15H_PTSC ending up being defined twice. Move them where they belong and drop the duplicate. Also, drop the respective tools/ changes of the msr-index.h copy the above commit added because perf tool developers prefer to go through those changes themselves in order to figure out whether changes to the kernel headers would need additional handling in perf. Fixes: 1068ed4547ad ("x86/msr: Lift AMD family 0x15 power-specific MSRs") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lkml.kernel.org/r/20200621163323.14e8533f@canb.auug.org.au
|
#
1068ed45 |
|
08-Jun-2020 |
Borislav Petkov <bp@suse.de> |
x86/msr: Lift AMD family 0x15 power-specific MSRs ... into the global msr-index.h header because they're used in multiple compilation units. Sort the MSR list a bit. Update the msr-index.h copy in tools. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lkml.kernel.org/r/20200608164847.14232-1-bp@alien8.de
|
#
5cde2653 |
|
27-May-2020 |
Stephane Eranian <eranian@google.com> |
perf/x86/rapl: Add AMD Fam17h RAPL support This patch enables AMD Fam17h RAPL support for the Package level metric. The support is as per AMD Fam17h Model31h (Zen2) and model 00-ffh (Zen1) PPR. The same output is available via the energy-pkg pseudo event: $ perf stat -a -I 1000 --per-socket -e power/energy-pkg/ Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200527224659.206129-6-eranian@google.com
|
#
7e5b3c26 |
|
16-Apr-2020 |
Mark Gross <mgross@linux.intel.com> |
x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation SRBDS is an MDS-like speculative side channel that can leak bits from the random number generator (RNG) across cores and threads. New microcode serializes the processor access during the execution of RDRAND and RDSEED. This ensures that the shared buffer is overwritten before it is released for reuse. While it is present on all affected CPU models, the microcode mitigation is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the cases where TSX is not supported or has been disabled with TSX_CTRL. The mitigation is activated by default on affected processors and it increases latency for RDRAND and RDSEED instructions. Among other effects this will reduce throughput from /dev/urandom. * Enable administrator to configure the mitigation off when desired using either mitigations=off or srbds=off. * Export vulnerability status via sysfs * Rename file-scoped macros to apply for non-whitelist table initializations. [ bp: Massage, - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g, - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in, - flip check in cpu_set_bug_bits() to save an indentation level, - reflow comments. jpoimboe: s/Mitigated/Mitigation/ in user-visible strings tglx: Dropped the fused off magic for now ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
|
#
6650cdd9 |
|
26-Jan-2020 |
Peter Zijlstra (Intel) <peterz@infradead.org> |
x86/split_lock: Enable split lock detection by kernel A split-lock occurs when an atomic instruction operates on data that spans two cache lines. In order to maintain atomicity the core takes a global bus lock. This is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores (which must wait for the bus lock to be released before their memory operations can complete). For real-time systems this may mean missing deadlines. For other systems it may just be very annoying. Some CPUs have the capability to raise an #AC trap when a split lock is attempted. Provide a command line option to give the user choices on how to handle this: split_lock_detect= off - not enabled (no traps for split locks) warn - warn once when an application does a split lock, but allow it to continue running. fatal - Send SIGBUS to applications that cause split lock On systems that support split lock detection the default is "warn". Note that if the kernel hits a split lock in any mode other than "off" it will OOPs. One implementation wrinkle is that the MSR to control the split lock detection is per-core, not per thread. This might result in some short lived races on HT systems in "warn" mode if Linux tries to enable on one thread while disabling on the other. Race analysis by Sean Christopherson: - Toggling of split-lock is only done in "warn" mode. Worst case scenario of a race is that a misbehaving task will generate multiple #AC exceptions on the same instruction. And this race will only occur if both siblings are running tasks that generate split-lock #ACs, e.g. a race where sibling threads are writing different values will only occur if CPUx is disabling split-lock after an #AC and CPUy is re-enabling split-lock after *its* previous task generated an #AC. - Transitioning between off/warn/fatal modes at runtime isn't supported and disabling is tracked per task, so hardware will always reach a steady state that matches the configured mode. I.e. split-lock is guaranteed to be enabled in hardware once all _TIF_SLD threads have been scheduled out. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200126200535.GB30377@agluck-desk2.amr.corp.intel.com
|
#
21b5ee59 |
|
19-Feb-2020 |
Kim Phillips <kim.phillips@amd.com> |
x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF Commit aaf248848db50 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter") added support for access to the free-running counter via 'perf -e msr/irperf/', but when exercised, it always returns a 0 count: BEFORE: $ perf stat -e instructions,msr/irperf/ true Performance counter stats for 'true': 624,833 instructions 0 msr/irperf/ Simply set its enable bit - HWCR bit 30 - to make it start counting. Enablement is restricted to all machines advertising IRPERF capability, except those susceptible to an erratum that makes the IRPERF return bad values. That erratum occurs in Family 17h models 00-1fh [1], but not in F17h models 20h and above [2]. AFTER (on a family 17h model 31h machine): $ perf stat -e instructions,msr/irperf/ true Performance counter stats for 'true': 621,690 instructions 622,490 msr/irperf/ [1] Revision Guide for AMD Family 17h Models 00h-0Fh Processors [2] Revision Guide for AMD Family 17h Models 30h-3Fh Processors The revision guides are available from the bugzilla Link below. [ bp: Massage commit message. ] Fixes: aaf248848db50 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Link: http://lkml.kernel.org/r/20200214201805.13830-1-kim.phillips@amd.com
|
#
32ad73db |
|
20-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
x86/msr-index: Clean up bit defines for IA32_FEATURE_CONTROL MSR As pointed out by Boris, the defines for bits in IA32_FEATURE_CONTROL are quite a mouthful, especially the VMX bits which must differentiate between enabling VMX inside and outside SMX (TXT) operation. Rename the MSR and its bit defines to abbreviate FEATURE_CONTROL as FEAT_CTL to make them a little friendlier on the eyes. Arguably, the MSR itself should keep the full IA32_FEATURE_CONTROL name to match Intel's SDM, but a future patch will add a dedicated Kconfig, file and functions for the MSR. Using the full name for those assets is rather unwieldy, so bite the bullet and use IA32_FEAT_CTL so that its nomenclature is consistent throughout the kernel. Opportunistically, fix a few other annoyances with the defines: - Relocate the bit defines so that they immediately follow the MSR define, e.g. aren't mistaken as belonging to MISC_FEATURE_CONTROL. - Add whitespace around the block of feature control defines to make it clear they're all related. - Use BIT() instead of manually encoding the bit shift. - Use "VMX" instead of "VMXON" to match the SDM. - Append "_ENABLED" to the LMCE (Local Machine Check Exception) bit to be consistent with the kernel's verbiage used for all other feature control bits. Note, the SDM refers to the LMCE bit as LMCE_ON, likely to differentiate it from IA32_MCG_EXT_CTL.LMCE_EN. Ignore the (literal) one-off usage of _ON, the SDM is simply "wrong". Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20191221044513.21680-2-sean.j.christopherson@intel.com
|
#
4e3f77d8 |
|
11-Nov-2019 |
Jan Beulich <jbeulich@suse.com> |
xen/mcelog: add PPIN to record when available This is to augment commit 3f5a7896a5 ("x86/mce: Include the PPIN in MCE records when available"). I'm also adding "synd" and "ipid" fields to struct xen_mce, in an attempt to keep field offsets in sync with struct mce. These two fields won't get populated for now, though. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
|
#
db4d30fb |
|
03-Nov-2019 |
Vineela Tummalapalli <vineela.tummalapalli@intel.com> |
x86/bugs: Add ITLB_MULTIHIT bug infrastructure Some processors may incur a machine check error possibly resulting in an unrecoverable CPU lockup when an instruction fetch encounters a TLB multi-hit in the instruction TLB. This can occur when the page size is changed along with either the physical address or cache type. The relevant erratum can be found here: https://bugzilla.kernel.org/show_bug.cgi?id=205195 There are other processors affected for which the erratum does not fully disclose the impact. This issue affects both bare-metal x86 page tables and EPT. It can be mitigated by either eliminating the use of large pages or by using careful TLB invalidations when changing the page size in the page tables. Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which are mitigated against this issue. Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com> Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
1b42f017 |
|
23-Oct-2019 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/speculation/taa: Add mitigation for TSX Async Abort TSX Async Abort (TAA) is a side channel vulnerability to the internal buffers in some Intel processors similar to Microachitectural Data Sampling (MDS). In this case, certain loads may speculatively pass invalid data to dependent operations when an asynchronous abort condition is pending in a TSX transaction. This includes loads with no fault or assist condition. Such loads may speculatively expose stale data from the uarch data structures as in MDS. Scope of exposure is within the same-thread and cross-thread. This issue affects all current processors that support TSX, but do not have ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES. On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0, CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers using VERW or L1D_FLUSH, there is no additional mitigation needed for TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by disabling the Transactional Synchronization Extensions (TSX) feature. A new MSR IA32_TSX_CTRL in future and current processors after a microcode update can be used to control the TSX feature. There are two bits in that MSR: * TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted Transactional Memory (RTM). * TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled with updated microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}. The second mitigation approach is similar to MDS which is clearing the affected CPU buffers on return to user space and when entering a guest. Relevant microcode update is required for the mitigation to work. More details on this approach can be found here: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html The TSX feature can be controlled by the "tsx" command line parameter. If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is deployed. The effective mitigation state can be read from sysfs. [ bp: - massage + comments cleanup - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh. - remove partial TAA mitigation in update_mds_branch_idle() - Josh. - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
c2955f27 |
|
23-Oct-2019 |
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> |
x86/msr: Add the IA32_TSX_CTRL MSR Transactional Synchronization Extensions (TSX) may be used on certain processors as part of a speculative side channel attack. A microcode update for existing processors that are vulnerable to this attack will add a new MSR - IA32_TSX_CTRL to allow the system administrator the option to disable TSX as one of the possible mitigations. The CPUs which get this new MSR after a microcode upgrade are the ones which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all CPU buffers takes care of the TAA case as well. [ Note that future processors that are not vulnerable will also support the IA32_TSX_CTRL MSR. ] Add defines for the new IA32_TSX_CTRL MSR and its bits. TSX has two sub-features: 1. Restricted Transactional Memory (RTM) is an explicitly-used feature where new instructions begin and end TSX transactions. 2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of "old" style locks are used by software. Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the IA32_TSX_CTRL MSR. There are two control bits in IA32_TSX_CTRL MSR: Bit 0: When set, it disables the Restricted Transactional Memory (RTM) sub-feature of TSX (will force all transactions to abort on the XBEGIN instruction). Bit 1: When set, it disables the enumeration of the RTM and HLE feature (i.e. it will make CPUID(EAX=7).EBX{bit4} and CPUID(EAX=7).EBX{bit11} read as 0). The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled by the new microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}, unless disabled by IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Reviewed-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
42880f72 |
|
06-Aug-2019 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf/x86/intel: Support PEBS output to PT If PEBS declares ability to output its data to Intel PT stream, use the aux_output attribute bit to enable PEBS data output to PT. This requires a PT event to be present and scheduled in the same context. Unlike the DS area, the kernel does not extract PEBS records from the PT stream to generate corresponding records in the perf stream, because that would require real time in-kernel PT decoding, which is not feasible. The PMI, however, can still be used. The output setting is per-CPU, so all PEBS events must be either writing to PT or to the DS area, therefore, in case of conflict, the conflicting event will fail to schedule, allowing the rotation logic to alternate between the PEBS->PT and PEBS->DS events. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: kan.liang@linux.intel.com Link: https://lkml.kernel.org/r/20190806084606.4021-3-alexander.shishkin@linux.intel.com
|
#
c49a0a80 |
|
19-Aug-2019 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "rdrand=force" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Additionally, update the suspend and resume path to save and restore the MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in place after resuming from suspend. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chen Yu <yu.c.chen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org> Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "x86@kernel.org" <x86@kernel.org> Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
|
#
342061c5 |
|
19-Aug-2019 |
Borislav Petkov <bp@suse.de> |
x86/msr-index: Move AMD MSRs where they belong ... sort them in and fixup comment, while at it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20190819070140.23708-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
bd688c69 |
|
19-Jun-2019 |
Fenghua Yu <fenghua.yu@intel.com> |
x86/umwait: Initialize umwait control values umwait or tpause allows the processor to enter a light-weight power/performance optimized state (C0.1 state) or an improved power/performance optimized state (C0.2 state) for a period specified by the instruction or until the system time limit or until a store to the monitored address range in umwait. IA32_UMWAIT_CONTROL MSR register allows the OS to enable/disable C0.2 on the processor and to set the maximum time the processor can reside in C0.1 or C0.2. By default C0.2 is enabled so the user wait instructions can enter the C0.2 state to save more power with slower wakeup time. Andy Lutomirski proposed to set the maximum umwait time to 100000 cycles by default. A quote from Andy: "What I want to avoid is the case where it works dramatically differently on NO_HZ_FULL systems as compared to everything else. Also, UMWAIT may behave a bit differently if the max timeout is hit, and I'd like that path to get exercised widely by making it happen even on default configs." A sysfs interface to adjust the time and the C0.2 enablement is provided in a follow up change. [ tglx: Renamed MSR_IA32_UMWAIT_CONTROL_MAX_TIME to MSR_IA32_UMWAIT_CONTROL_TIME_MASK because the constant is used as mask throughout the code. Massaged comments and changelog ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: "Borislav Petkov" <bp@alien8.de> Cc: "H Peter Anvin" <hpa@zytor.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Tony Luck" <tony.luck@intel.com> Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/1560994438-235698-3-git-send-email-fenghua.yu@intel.com
|
#
c715eb9f |
|
18-Feb-2019 |
Luwei Kang <luwei.kang@intel.com> |
KVM: x86: Add support of clear Trace_ToPA_PMI status Let guests clear the Intel PT ToPA PMI status (bit 55 of MSR_CORE_PERF_GLOBAL_OVF_CTRL). Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
8479e04e |
|
18-Feb-2019 |
Luwei Kang <luwei.kang@intel.com> |
KVM: x86: Inject PMI for KVM guest Inject a PMI for KVM guest when Intel PT working in Host-Guest mode and Guest ToPA entry memory buffer was completely filled. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
c22497f5 |
|
02-Apr-2019 |
Kan Liang <kan.liang@linux.intel.com> |
perf/x86/intel: Support adaptive PEBS v4 Adaptive PEBS is a new way to report PEBS sampling information. Instead of a fixed size record for all PEBS events it allows to configure the PEBS record to only include the information needed. Events can then opt in to use such an extended record, or stay with a basic record which only contains the IP. The major new feature is to support LBRs in PEBS record. Besides normal LBR, this allows (much faster) large PEBS, while still supporting callstacks through callstack LBR. So essentially a lot of profiling can now be done without frequent interrupts, dropping the overhead significantly. The main requirement still is to use a period, and not use frequency mode, because frequency mode requires reevaluating the frequency on each overflow. The floating point state (XMM) is also supported, which allows efficient profiling of FP function arguments. Introduce specific drain function to handle variable length records. Use a new callback to parse the new record format, and also handle the STATUS field now being at a different offset. Add code to set up the configuration register. Since there is only a single register, all events either get the full super set of all events, or only the basic record. Originally-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-6-kan.liang@linux.intel.com [ Renamed GPRS => GP. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
ed5194c2 |
|
18-Jan-2019 |
Andi Kleen <ak@linux.intel.com> |
x86/speculation/mds: Add basic bug infrastructure for MDS Microarchitectural Data Sampling (MDS), is a class of side channel attacks on internal buffers in Intel CPUs. The variants are: - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126) - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130) - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127) MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a dependent load (store-to-load forwarding) as an optimization. The forward can also happen to a faulting or assisting load operation for a different memory address, which can be exploited under certain conditions. Store buffers are partitioned between Hyper-Threads so cross thread forwarding is not possible. But if a thread enters or exits a sleep state the store buffer is repartitioned which can expose data from one thread to the other. MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage L1 miss situations and to hold data which is returned or sent in response to a memory or I/O operation. Fill buffers can forward data to a load operation and also write data to the cache. When the fill buffer is deallocated it can retain the stale data of the preceding operations which can then be forwarded to a faulting or assisting load operation, which can be exploited under certain conditions. Fill buffers are shared between Hyper-Threads so cross thread leakage is possible. MLDPS leaks Load Port Data. Load ports are used to perform load operations from memory or I/O. The received data is then forwarded to the register file or a subsequent operation. In some implementations the Load Port can contain stale data from a previous operation which can be forwarded to faulting or assisting loads under certain conditions, which again can be exploited eventually. Load ports are shared between Hyper-Threads so cross thread leakage is possible. All variants have the same mitigation for single CPU thread case (SMT off), so the kernel can treat them as one MDS issue. Add the basic infrastructure to detect if the current CPU is affected by MDS. [ tglx: Rewrote changelog ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
|
#
d8eabc37 |
|
20-Feb-2019 |
Thomas Gleixner <tglx@linutronix.de> |
x86/msr-index: Cleanup bit defines Greg pointed out that speculation related bit defines are using (1 << N) format instead of BIT(N). Aside of that (1 << N) is wrong as it should use 1UL at least. Clean it up. [ Josh Poimboeuf: Fix tools build ] Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
|
#
52f64909 |
|
05-Mar-2019 |
Peter Zijlstra (Intel) <peterz@infradead.org> |
x86: Add TSX Force Abort CPUID/MSR Skylake systems will receive a microcode update to address a TSX errata. This microcode will (by default) clobber PMC3 when TSX instructions are (speculatively or not) executed. It also provides an MSR to cause all TSX transaction to abort and preserve PMC3. Add the CPUID enumeration and MSR definition. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
f99e3daf |
|
24-Oct-2018 |
Chao Peng <chao.p.peng@linux.intel.com> |
KVM: x86: Add Intel PT virtualization work mode Intel Processor Trace virtualization can be work in one of 2 possible modes: a. System-Wide mode (default): When the host configures Intel PT to collect trace packets of the entire system, it can leave the relevant VMX controls clear to allow VMX-specific packets to provide information across VMX transitions. KVM guest will not aware this feature in this mode and both host and KVM guest trace will output to host buffer. b. Host-Guest mode: Host can configure trace-packet generation while in VMX non-root operation for guests and root operation for native executing normally. Intel PT will be exposed to KVM guest in this mode, and the trace output to respective buffer of host and guest. In this mode, tht status of PT will be saved and disabled before VM-entry and restored after VM-exit if trace a virtual machine. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
69843a91 |
|
24-Oct-2018 |
Luwei Kang <luwei.kang@intel.com> |
perf/x86/intel/pt: Add new bit definitions for PT MSRs Add bit definitions for Intel PT MSRs to support trace output directed to the memeory subsystem and holds a count if packet bytes that have been sent out. These are required by the upcoming PT support in KVM guests for MSRs read/write emulation. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
887eda13 |
|
24-Oct-2018 |
Chao Peng <chao.p.peng@linux.intel.com> |
perf/x86/intel/pt: Move Intel PT MSRs bit defines to global header The Intel Processor Trace (PT) MSR bit defines are in a private header. The upcoming support for PT virtualization requires these defines to be accessible from KVM code. Move them to the global MSR header file. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
0e1b869f |
|
17-Dec-2018 |
Eduardo Habkost <ehabkost@redhat.com> |
kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs Some guests OSes (including Windows 10) write to MSR 0xc001102c on some cases (possibly while trying to apply a CPU errata). Make KVM ignore reads and writes to that MSR, so the guest won't crash. The MSR is documented as "Execution Unit Configuration (EX_CFG)", at AMD's "BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h Models 00h-0Fh Processors". Cc: stable@vger.kernel.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
5bfbe3ad |
|
25-Nov-2018 |
Tim Chen <tim.c.chen@linux.intel.com> |
x86/speculation: Prepare for per task indirect branch speculation control To avoid the overhead of STIBP always on, it's necessary to allow per task control of STIBP. Add a new task flag TIF_SPEC_IB and evaluate it during context switch if SMT is active and flag evaluation is enabled by the speculation control code. Add the conditional evaluation to x86_virt_spec_ctrl() as well so the guest/host switch works properly. This has no effect because TIF_SPEC_IB cannot be set yet and the static key which controls evaluation is off. Preparatory patch for adding the control code. [ tglx: Simplify the context switch logic and make the TIF evaluation depend on SMP=y and on the static key controlling the conditional update. Rename it to TIF_SPEC_IB because it controls both STIBP and IBPB ] Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.176917199@linutronix.de
|
#
af3bdb99 |
|
08-Aug-2018 |
Andi Kleen <ak@linux.intel.com> |
perf/x86/intel: Add a separate Arch Perfmon v4 PMI handler Implements counter freezing for Arch Perfmon v4 (Skylake and newer). This allows to speed up the PMI handler by avoiding unnecessary MSR writes and make it more accurate. The Arch Perfmon v4 PMI handler is substantially different than the older PMI handler. Differences to the old handler: - It relies on counter freezing, which eliminates several MSR writes from the PMI handler and lowers the overhead significantly. It makes the PMI handler more accurate, as all counters get frozen atomically as soon as any counter overflows. So there is much less counting of the PMI handler itself. With the freezing we don't need to disable or enable counters or PEBS. Only BTS which does not support auto-freezing still needs to be explicitly managed. - The PMU acking is done at the end, not the beginning. This makes it possible to avoid manual enabling/disabling of the PMU, instead we just rely on the freezing/acking. - The APIC is acked before reenabling the PMU, which avoids problems with LBRs occasionally not getting unfreezed on Skylake. - Looping is only needed to workaround a corner case which several PMIs are very close to each other. For common cases, the counters are freezed during PMI handler. It doesn't need to do re-check. This patch: - Adds code to enable v4 counter freezing - Fork <=v3 and >=v4 PMI handlers into separate functions. - Add kernel parameter to disable counter freezing. It took some time to debug counter freezing, so in case there are new problems we added an option to turn it off. Would not expect this to be used until there are new bugs. - Only for big core. The patch for small core will be posted later separately. Performance: When profiling a kernel build on Kabylake with different perf options, measuring the length of all NMI handlers using the nmi handler trace point: V3 is without counter freezing. V4 is with counter freezing. The value is the average cost of the PMI handler. (lower is better) perf options ` V3(ns) V4(ns) delta -c 100000 1088 894 -18% -g -c 100000 1862 1646 -12% --call-graph lbr -c 100000 3649 3367 -8% --c.g. dwarf -c 100000 2248 1982 -12% Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/1533712328-2834-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
8e0b2b91 |
|
05-Aug-2018 |
Paolo Bonzini <pbonzini@redhat.com> |
x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is not needed. Add a new value to enum vmx_l1d_flush_state, which is used either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
3fa045be |
|
02-Jul-2018 |
Paolo Bonzini <pbonzini@redhat.com> |
x86/KVM/VMX: Add L1D MSR based flush 336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR (IA32_FLUSH_CMD aka 0x10B) which has similar write-only semantics to other MSRs defined in the document. The semantics of this MSR is to allow "finer granularity invalidation of caching structures than existing mechanisms like WBINVD. It will writeback and invalidate the L1 data cache, including all cachelines brought in by preceding instructions, without invalidating all caches (eg. L2 or LLC). Some processors may also invalidate the first level level instruction cache on a L1D_FLUSH command. The L1 data and instruction caches may be shared across the logical processors of a core." Use it instead of the loop based L1 flush algorithm. A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199511 [ tglx: Avoid allocating pages when the MSR is available ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
a00072a2 |
|
13-Feb-2018 |
Matt Turner <mattst88@gmail.com> |
x86: msr-index.h: Correct SNB_C1/C3_AUTO_UNDEMOTE defines According to the Intel Software Developers' Manual, Vol. 4, Order No. 335592, these macros have been reversed since they were added in the initial turbostat commit. The reversed definitions were presumably copied from turbostat.c to this file. Fixes: 9c63a650bb10 ("tools/power/x86/turbostat: share kernel MSR #defines") Signed-off-by: Matt Turner <mattst88@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Len Brown <len.brown@intel.com>
|
#
240da953 |
|
16-May-2018 |
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
x86/bugs: Rename SSBD_NO to SSB_NO The "336996 Speculative Execution Side Channel Mitigations" from May defines this as SSB_NO, hence lets sync-up. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
11fb0683 |
|
17-May-2018 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/speculation: Add virtualized speculative store bypass disable support Some AMD processors only support a non-architectural means of enabling speculative store bypass disable (SSBD). To allow a simplified view of this to a guest, an architectural definition has been created through a new CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a hypervisor can virtualize the existence of this definition and provide an architectural method for using SSBD to a guest. Add the new CPUID feature, the new MSR and update the existing SSBD support to use this MSR when present. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de>
|
#
9f65fb29 |
|
09-May-2018 |
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
x86/bugs: Rename _RDS to _SSBD Intel collateral will reference the SSB mitigation bit in IA32_SPEC_CTL[2] as SSBD (Speculative Store Bypass Disable). Hence changing it. It is unclear yet what the MSR_IA32_ARCH_CAPABILITIES (0x10a) Bit(4) name is going to be. Following the rename it would be SSBD_NO but that rolls out to Speculative Store Bypass Disable No. Also fixed the missing space in X86_FEATURE_AMD_SSBD. [ tglx: Fixup x86_amd_rds_enable() and rds_tif_to_amd_ls_cfg() as well ] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
885f82bf |
|
29-Apr-2018 |
Thomas Gleixner <tglx@linutronix.de> |
x86/process: Allow runtime control of Speculative Store Bypass The Speculative Store Bypass vulnerability can be mitigated with the Reduced Data Speculation (RDS) feature. To allow finer grained control of this eventually expensive mitigation a per task mitigation control is required. Add a new TIF_RDS flag and put it into the group of TIF flags which are evaluated for mismatch in switch_to(). If these bits differ in the previous and the next task, then the slow path function __switch_to_xtra() is invoked. Implement the TIF_RDS dependent mitigation control in the slow path. If the prctl for controlling Speculative Store Bypass is disabled or no task uses the prctl then there is no overhead in the switch_to() fast path. Update the KVM related speculation control functions to take TID_RDS into account as well. Based on a patch from Tim Chen. Completely rewritten. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
#
77243971 |
|
25-Apr-2018 |
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
x86/bugs/intel: Set proper CPU features and setup RDS Intel CPUs expose methods to: - Detect whether RDS capability is available via CPUID.7.0.EDX[31], - The SPEC_CTRL MSR(0x48), bit 2 set to enable RDS. - MSR_IA32_ARCH_CAPABILITIES, Bit(4) no need to enable RRS. With that in mind if spec_store_bypass_disable=[auto,on] is selected set at boot-time the SPEC_CTRL MSR to enable RDS if the platform requires it. Note that this does not fix the KVM case where the SPEC_CTRL is exposed to guests which can muck with it, see patch titled : KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS. And for the firmware (IBRS to be set), see patch titled: x86/spectre_v2: Read SPEC_CTRL MSR during boot and re-use reserved bits [ tglx: Distangled it from the intel implementation and kept the call order ] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org>
|
#
e84b7119 |
|
05-Feb-2018 |
Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> |
x86/msr: Add AMD Core Perf Extension MSRs Add the EventSelect and Counter MSRs for AMD Core Perf Extension. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
#
1e340c60 |
|
25-Jan-2018 |
David Woodhouse <dwmw@amazon.co.uk> |
x86/msr: Add definitions for new speculation control MSRs Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES. See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk
|
#
9c6a73c7 |
|
08-Jan-2018 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC With LFENCE now a serializing instruction, use LFENCE_RDTSC in preference to MFENCE_RDTSC. However, since the kernel could be running under a hypervisor that does not support writing that MSR, read the MSR back and verify that the bit has been set successfully. If the MSR can be read and the bit is set, then set the LFENCE_RDTSC feature, otherwise set the MFENCE_RDTSC feature. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/20180108220932.12580.52458.stgit@tlendack-t1.amdoffice.net
|
#
e4d0e84e |
|
08-Jan-2018 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/cpu/AMD: Make LFENCE a serializing instruction To aid in speculation control, make LFENCE a serializing instruction since it has less overhead than MFENCE. This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families that support LFENCE do not have this MSR. For these families, the LFENCE instruction is already serializing. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/20180108220921.12580.71694.stgit@tlendack-t1.amdoffice.net
|
#
18c71ce9 |
|
04-Dec-2017 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/CPU/AMD: Add the Secure Encrypted Virtualization CPU feature Update the CPU features to include identifying and reporting on the Secure Encrypted Virtualization (SEV) feature. SEV is identified by CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR). Only show the SEV feature as available if reported by CPUID and enabled by BIOS. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Borislav Petkov <bp@suse.de>
|
#
1958b5fc |
|
20-Oct-2017 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/boot: Add early boot support when running with SEV active Early in the boot process, add checks to determine if the kernel is running with Secure Encrypted Virtualization (SEV) active. Checking for SEV requires checking that the kernel is running under a hypervisor (CPUID 0x00000001, bit 31), that the SEV feature is available (CPUID 0x8000001f, bit 1) and then checking a non-interceptable SEV MSR (0xc0010131, bit 0). This check is required so that during early compressed kernel booting the pagetables (both the boot pagetables and KASLR pagetables (if enabled) are updated to include the encryption mask so that when the kernel is decompressed into encrypted memory, it can boot properly. After the kernel is decompressed and continues booting the same logic is used to check if SEV is active and set a flag indicating so. This allows to distinguish between SME and SEV, each of which have unique differences in how certain things are handled: e.g. DMA (always bounce buffered with SEV) or EFI tables (always access decrypted with SME). Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: kvm@vger.kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-13-brijesh.singh@amd.com
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
872cbefd |
|
17-Jul-2017 |
Tom Lendacky <thomas.lendacky@amd.com> |
x86/cpu/AMD: Add the Secure Memory Encryption CPU feature Update the CPU features to include identifying and reporting on the Secure Memory Encryption (SME) feature. SME is identified by CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of MSR_K8_SYSCFG). Only show the SME feature as available if reported by CPUID, enabled by BIOS and not configured as CONFIG_X86_32=y. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: kasan-dev@googlegroups.com Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/85c17ff450721abccddc95e611ae8df3f4d9718b.1500319216.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
4531662d |
|
23-May-2017 |
Jim Mattson <jmattson@google.com> |
kvm: vmx: Check value written to IA32_BNDCFGS Bits 11:2 must be zero and the linear addess in bits 63:12 must be canonical. Otherwise, WRMSR(BNDCFGS) should raise #GP. Fixes: 0dd376e709975779 ("KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save") Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
#
6089327f |
|
12-May-2017 |
Kan Liang <Kan.liang@intel.com> |
perf/x86: Add sysfs entry to freeze counters on SMI Currently, the SMIs are visible to all performance counters, because many users want to measure everything including SMIs. But in some cases, the SMI cycles should not be counted - for example, to calculate the cost of an SMI itself. So a knob is needed. When setting FREEZE_WHILE_SMM bit in IA32_DEBUGCTL, all performance counters will be effected. There is no way to do per-counter freeze on SMI. So it should not use the per-event interface (e.g. ioctl or event attribute) to set FREEZE_WHILE_SMM bit. Adds sysfs entry /sys/device/cpu/freeze_on_smi to set FREEZE_WHILE_SMM bit in IA32_DEBUGCTL. When set, freezes perfmon and trace messages while in SMM. Value has to be 0 or 1. It will be applied to all processors. Also serialize the entire setting so we don't get multiple concurrent threads trying to update to different values. Signed-off-by: Kan Liang <Kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: bp@alien8.de Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1494600673-244667-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
2fc49cb0 |
|
28-Apr-2017 |
Len Brown <len.brown@intel.com> |
x86: msr-index.h: fix shifts to ULL results in HWP macros. x = 1 ulong_long = x << 32; results in: warning: left shift count >= width of type x = 8 ulong_long = x << 24; results in a sign extended ulong_long Cast x to unsigned long long in these macros to prevent these errors. Signed-off-by: Len Brown <len.brown@intel.com>
|
#
8d84e906 |
|
25-Feb-2017 |
Len Brown <len.brown@intel.com> |
x86: msr-index.h: define HWP.EPP values The Hardware Performance State request MSR has a field to express the "Energy Performance Preference" (HWP.EPP). Decode that field so the definition may be shared by by the intel_pstate driver and any utilities that decode the same register. Signed-off-by: Len Brown <len.brown@intel.com>
|
#
e9ea1e7f |
|
20-Mar-2017 |
Kyle Huey <me@kylehuey.com> |
x86/arch_prctl: Add ARCH_[GET|SET]_CPUID Intel supports faulting on the CPUID instruction beginning with Ivy Bridge. When enabled, the processor will fault on attempts to execute the CPUID instruction with CPL>0. Exposing this feature to userspace will allow a ptracer to trap and emulate the CPUID instruction. When supported, this feature is controlled by toggling bit 0 of MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of https://bugzilla.kernel.org/attachment.cgi?id=243991 Implement a new pair of arch_prctls, available on both x86-32 and x86-64. ARCH_GET_CPUID: Returns the current CPUID state, either 0 if CPUID faulting is enabled (and thus the CPUID instruction is not available) or 1 if CPUID faulting is not enabled. ARCH_SET_CPUID: Set the CPUID state to the second argument. If cpuid_enabled is 0 CPUID faulting will be activated, otherwise it will be deactivated. Returns ENODEV if CPUID faulting is not supported on this system. The state of the CPUID faulting flag is propagated across forks, but reset upon exec. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: linux-kselftest@vger.kernel.org Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Robert O'Callahan <robert@ocallahan.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: user-mode-linux-devel@lists.sourceforge.net Cc: Jeff Dike <jdike@addtoit.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: user-mode-linux-user@lists.sourceforge.net Cc: David Matlack <dmatlack@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: linux-fsdevel@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/20170320081628.18952-9-khuey@kylehuey.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
90218ac7 |
|
20-Mar-2017 |
Kyle Huey <me@kylehuey.com> |
x86/cpufeature: Detect CPUID faulting support Intel supports faulting on the CPUID instruction beginning with Ivy Bridge. When enabled, the processor will fault on attempts to execute the CPUID instruction with CPL>0. This will allow a ptracer to emulate the CPUID instruction. Bit 31 of MSR_PLATFORM_INFO advertises support for this feature. It is documented in detail in Section 2.3.2 of https://bugzilla.kernel.org/attachment.cgi?id=243991 Detect support for this feature and expose it as X86_FEATURE_CPUID_FAULT. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: linux-kselftest@vger.kernel.org Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Robert O'Callahan <robert@ocallahan.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: user-mode-linux-devel@lists.sourceforge.net Cc: Jeff Dike <jdike@addtoit.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: user-mode-linux-user@lists.sourceforge.net Cc: David Matlack <dmatlack@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: linux-fsdevel@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/20170320081628.18952-8-khuey@kylehuey.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
ab6d9468 |
|
20-Mar-2017 |
Kyle Huey <me@kylehuey.com> |
x86/msr: Rename MISC_FEATURE_ENABLES to MISC_FEATURES_ENABLES This matches the only public Intel documentation of this MSR, in the "Virtualization Technology FlexMigration Application Note" (preserved at https://bugzilla.kernel.org/attachment.cgi?id=243991) Signed-off-by: Kyle Huey <khuey@kylehuey.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: linux-kselftest@vger.kernel.org Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Robert O'Callahan <robert@ocallahan.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: user-mode-linux-devel@lists.sourceforge.net Cc: Jeff Dike <jdike@addtoit.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: user-mode-linux-user@lists.sourceforge.net Cc: David Matlack <dmatlack@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: linux-fsdevel@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/20170320081628.18952-2-khuey@kylehuey.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
b9894a2f |
|
14-Feb-2017 |
Kyle Huey <me@kylehuey.com> |
x86/process: Correct and optimize TIF_BLOCKSTEP switch The debug control MSR is "highly magical" as the blockstep bit can be cleared by hardware under not well documented circumstances. So a task switch relying on the bit set by the previous task (according to the previous tasks thread flags) can trip over this and not update the flag for the next task. To fix this its required to handle DEBUGCTLMSR_BTF when either the previous or the next or both tasks have the TIF_BLOCKSTEP flag set. While at it avoid branching within the TIF_BLOCKSTEP case and evaluating boot_cpu_data twice in kernels without CONFIG_X86_DEBUGCTLMSR. x86_64: arch/x86/kernel/process.o text data bss dec hex 3024 8577 16 11617 2d61 Before 3008 8577 16 11601 2d51 After i386: No change [ tglx: Made the shift value explicit, use a local variable to make the code readable and massaged changelog] Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kyle Huey <khuey@kylehuey.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Link: http://lkml.kernel.org/r/20170214081104.9244-3-khuey@kylehuey.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
98af7459 |
|
20-Jan-2017 |
Len Brown <len.brown@intel.com> |
x86 msr_index.h: Define MSR_MISC_FEATURE_CONTROL This non-architectural MSR has disable bits for various prefetchers on modern processors. While these bits are generally touched only by the BIOS, say, via BIOS SETUP, it is useful to dump them when examining options that can alter performance. Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
|
#
8a34fd02 |
|
12-Jan-2017 |
Len Brown <len.brown@intel.com> |
x86 msr-index.h: Define Atom specific core ratio MSR locations These MSRs are currently used by the intel_pstate driver, using a local definition. Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
|
#
0539ba11 |
|
09-Feb-2017 |
Len Brown <len.brown@intel.com> |
tools/power turbostat: Baytrail c-state support The Baytrail SOC, with its Silvermont core, has some unique properties: 1. a hardware CC1 residency counter 2. a module-c6 residency counter 3. a package-c6 counter at traditional package-c7 counter address. The SOC does not support c3, pc3, c7 or pc7 counters. Signed-off-by: Len Brown <len.brown@intel.com>
|
#
419c9e98 |
|
07-Jan-2017 |
Len Brown <len.brown@intel.com> |
x86: msr-index.h: Remove unused MSR_NHM_SNB_PKG_CST_CFG_CTL The two users, intel_idle driver and turbostat utility are using the new name, MSR_PKG_CST_CONFIG_CONTROL Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
|
#
40496c8e |
|
07-Jan-2017 |
Len Brown <len.brown@intel.com> |
x86: msr-index.h: Define MSR_PKG_CST_CONFIG_CONTROL define MSR_PKG_CST_CONFIG_CONTROL (0xE2), which is the string used by Intel Documentation. We use this MSR in intel_idle and turbostat by a previous name, to be updated in the next patch. Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
|
#
d0117a0e |
|
25-Feb-2017 |
Len Brown <len.brown@intel.com> |
x86: msr-index.h: define EPB mid-points These are currently open-coded into intel_pstate.c Signed-off-by: Len Brown <len.brown@intel.com>
|
#
ae47eda9 |
|
20-Jan-2017 |
Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> |
x86/msr: Add MSR_MISC_FEATURE_ENABLES and RING3MWAIT bit Define new MSR MISC_FEATURE_ENABLES (0x140). On supported CPUs if bit 1 of this MSR is set, then calling MONITOR and MWAIT instructions outside of ring 0 will not cause invalid-opcode exception. The MSR MISC_FEATURE_ENABLES is not yet documented in the SDM. Here is the relevant documentation: Hex Dec Name Scope 140H 320 MISC_FEATURE_ENABLES Thread 0 Reserved 1 If set to 1, the MONITOR and MWAIT instructions do not cause invalid-opcode exceptions when executed with CPL > 0 or in virtual-8086 mode. If MWAIT is executed when CPL > 0 or in virtual-8086 mode, and if EAX indicates a C-state other than C0 or C1, the instruction operates as if EAX indicated the C-state C1. 63:2 Reserved Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Piotr.Luc@intel.com Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/r/1484918557-15481-2-git-send-email-grzegorz.andrejczuk@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
3f5a7896 |
|
18-Nov-2016 |
Tony Luck <tony.luck@intel.com> |
x86/mce: Include the PPIN in MCE records when available Intel Xeons from Ivy Bridge onwards support a processor identification number set in the factory. To the user this is a handy unique number to identify a particular CPU. Intel can decode this to the fab/production run to track errors. On systems that have it, include it in the machine check record. I'm told that this would be helpful for users that run large data centers with multi-socket servers to keep track of which CPUs are seeing errors. Boris: * Add some clarifying comments and spacing. * Mask out [63:2] in the disabled-but-not-locked case * Call the MSR variable "val" for more readability. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/20161123114855.njguoaygp3qnbkia@pd.tnic Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
c836eeda |
|
13-Oct-2016 |
Longpeng(Mike) <longpeng2@huawei.com> |
x86: Remove duplicate rtit status MSR macro The MSR_IA32_RTIT_STATUS is defined twice, so remove one. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: len.brown@intel.com Cc: peterz@infradead.org Cc: rafael.j.wysocki@intel.com Cc: alexander.shishkin@linux.intel.com Cc: ray.huang@amd.com Cc: Aravind.Gopalakrishnan@amd.com Cc: wu.wubin@huawei.com Cc: srinivas.pandruvada@linux.intel.com Cc: zhaoshenglong@huawei.com Cc: vladimir_zapolskiy@mentor.com Link: http://lkml.kernel.org/r/1476405740-80816-1-git-send-email-longpeng2@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
a0c9b8cc |
|
06-Jul-2016 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
x86: remove duplicate turbo ratio limit MSRs Remove MSR_NHM_TURBO_RATIO_LIMIT and MSR_IVT_TURBO_RATIO_LIMIT as they are duplicate. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
f127fa09 |
|
27-Apr-2016 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf/x86/intel/pt: Add IP filtering register/CPUID bits New versions of Intel PT support address range-based filtering. Add the new registers, bit definitions and relevant CPUID bits. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0dd28e2c |
|
27-Apr-2016 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf/x86/intel/pt: Move PT specific MSR bit definitions to a private header Nothing outside of the Intel PT driver should ever care about its MSR bits, so there is no reason to keep them in msr-index.h. This patch moves them to a pt-local header. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-3-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
dcee75b3 |
|
17-Apr-2016 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
perf/x86/intel/rapl: Support Skylake RAPL domains Add Skylake client support for RAPL domains. In addition to RAPL domains in Broadwell clients, it has support for platform domain (aka PSys). The PSys domain controls the entire SoC instead of just a CPU package. Unlike package domain, PSys support requires more than just processor level implementation. The other parts in the system need additional HW level signaling, which OEMs need to support. When not supported, the energy counter register in PSys domain returns 0. Also corrected error in comment for GPU counter, which previously was DRAM counter. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com [ Cnverted to model_match stuff. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jacob.jun.pan@linux.intel.com Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/1460930581-29748-2-git-send-email-srinivas.pandruvada@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
5dcef694 |
|
06-Apr-2016 |
Len Brown <len.brown@intel.com> |
intel_idle: add BXT support Broxton has all the HSW C-states, except C3. BXT C-state timing is slightly different. Here we trust the IRTL MSRs as authority on maximum C-state latency, and override the driver's tables with the values found in the associated IRTL MSRs. Further we set the target_residency to 1x maximum latency, trusting the hardware demotion logic. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
5a63426e |
|
06-Apr-2016 |
Len Brown <len.brown@intel.com> |
tools/power turbostat: print IRTL MSRs Some processors use the Interrupt Response Time Limit (IRTL) MSR value to describe the maximum IRQ response time latency for deep package C-states. (Though others have the register, but do not use it) Lets print it out to give insight into the cases where it is used. IRTL begain in SNB, with PC3/PC6/PC7, and HSW added PC8/PC9/PC10. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
aaf24884 |
|
29-Jan-2016 |
Huang Rui <ray.huang@amd.com> |
perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter AMD Zeppelin (Family 17h, Model 00h) introduces an instructions retired performance counter which is indicated by CPUID.8000_0008H:EBX[1]. A dedicated Instructions Retired MSR register (MSR 0xC000_000E9) increments once for every instruction retired. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <rric@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454056197-5893-3-git-send-email-ray.huang@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
8a224261 |
|
29-Jan-2016 |
Huang Rui <ray.huang@amd.com> |
perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100 MHz rate in all P-states, and C states, S0, or S1. The frequency is about 100MHz. This counter will be used to calculate processor power and other parts. So add an interface into the MSR PMU to get the PTSC counter value. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <rric@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.huang@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
4a6772f5 |
|
26-Mar-2016 |
Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> |
x86/cpufreq: Remove duplicated TDP MSR macro definitions The list of CPU model specific registers contains two copies of TDP registers, remove the one, which is out of numerical order in the list. Fixes: 6a35fc2d6c22 ("cpufreq: intel_pstate: get P1 from TAR when available") Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Cc: Len Brown <len.brown@intel.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: http://lkml.kernel.org/r/1459018020-24577-1-git-send-email-vladimir_zapolskiy@mentor.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
adc53f2e |
|
07-Mar-2016 |
Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> |
x86/mce: Move MCx_CONFIG MSR definitions Those MSRs are used only by the MCE code so move them there. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1456785179-14378-2-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
053080a9 |
|
16-Feb-2016 |
Borislav Petkov <bp@suse.de> |
x86/msr: Document msr-index.h rule for addition In order to keep this file's size sensible and not cause too much unnecessary churn, make the rule explicit - similar to pci_ids.h - that only MSRs which are used in multiple compilation units, should get added to it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: alex.williamson@redhat.com Cc: gleb@kernel.org Cc: joro@8bytes.org Cc: kvm@vger.kernel.org Cc: sherry.hurwitz@amd.com Cc: wei@redhat.com Link: http://lkml.kernel.org/r/1455612202-14414-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
670e27d8 |
|
30-Nov-2015 |
Len Brown <len.brown@intel.com> |
x86 msr-index: Simplify syntax for HWP fields syntax only, no functional change Signed-off-by: Len Brown <len.brown@intel.com>
|
#
e6c8f187 |
|
25-Jan-2016 |
Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> |
x86/mce/AMD: Set MCAX Enable bit It is required for the OS to acknowledge that it is using the MCAX register set and its associated fields by setting the 'McaXEnable' bit in each bank's MCi_CONFIG register. If it is not set, then all UC errors will cause a system panic. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1453750913-4781-9-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
ae8b7875 |
|
23-Nov-2015 |
Borislav Petkov <bp@suse.de> |
x86/cpu/amd, kvm: Satisfy guest kernel reads of IC_CFG MSR The kernel accesses IC_CFG MSR (0xc0011021) on AMD because it checks whether the way access filter is enabled on some F15h models, and, if so, disables it. kvm doesn't handle that MSR access and complains about it, which can get really noisy in dmesg when one starts kvm guests all the time for testing. And it is useless anyway - guest kernel shouldn't be doing such changes anyway so tell it that that filter is disabled. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1448273546-2567-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
5369a21e |
|
12-Nov-2015 |
Len Brown <len.brown@intel.com> |
x86: remove unused definition of MSR_NHM_PLATFORM_INFO MSR_NHM_PLATFORM_INFO has been replaced by... MSR_PLATFORM_INFO Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
6a35fc2d |
|
14-Oct-2015 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
cpufreq: intel_pstate: get P1 from TAR when available After Ivybridge, the max non turbo ratio obtained from platform info msr is not always guaranteed P1 on client platforms. The max non turbo activation ratio (TAR), determines the max for the current level of TDP. The ratio in platform info is physical max. The TAR MSR can be locked, so updating this value is not possible on all platforms. This change gets this ratio from MSR TURBO_ACTIVATION_RATIO if available, but also do some sanity checking to make sure that this value is correct. The sanity check involves reading the TDP ratio for the current tdp control value when platform has configurable TDP present and matching TAC with this. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
#
3afb1121 |
|
18-Sep-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: x86: trap AMD MSRs for the TSeg base and mask These have roughly the same purpose as the SMRR, which we do not need to implement in KVM. However, Linux accesses MSR_K8_TSEG_ADDR at boot, which causes problems when running a Xen dom0 under KVM. Just return 0, meaning that processor protection of SMRAM is not in effect. Reported-by: M A Young <m.a.young@durham.ac.uk> Cc: stable@vger.kernel.org Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
d0dc8494 |
|
09-Sep-2015 |
Andi Kleen <ak@linux.intel.com> |
perf/x86/intel/pebs: Add PEBS frontend profiling for Skylake Skylake has a new FRONTEND_LATENCY PEBS event to accurately profile frontend problems (like ITLB or decoding issues). The new event is configured through a separate MSR, which selects a range of sub events. Define the extra MSR as a extra reg and export support for it through sysfs. To avoid duplicating the existing tables use a new function to add new entries to existing tables. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1435707205-6676-4-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
b83ff1c8 |
|
10-May-2015 |
Andi Kleen <ak@linux.intel.com> |
x86: Add new MSRs and MSR bits used for Intel Skylake PMU support Add new MSRs (LBR_INFO) and some new MSR bits used by the Intel Skylake PMU driver. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1431285767-27027-4-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
b1bf72d6 |
|
30-Jul-2015 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf/x86/intel/pt: Add new timing packet enables Intel PT chapter in the new Intel Architecture SDM adds several packets corresponding enable bits and registers that control packet generation. Also, additional bits in the Intel PT CPUID leaf were added to enumerate presence and parameters of these new packets and features. The packets and enables are: * CYC: cycle accurate mode, provides the number of cycles elapsed since previous CYC packet; its presence and available threshold values are enumerated via CPUID; * MTC: mini time counter packets, used for tracking TSC time between full TSC packets; its presence and available resolution options are enumerated via CPUID; * PSB packet period is now configurable, available period values are enumerated via CPUID. This patch adds corresponding bit and register definitions, pmu driver capabilities based on CPUID enumeration, new attribute format bits for the new featurens and extends event configuration validation function to take these into account. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: adrian.hunter@intel.com Cc: hpa@zytor.com Link: http://lkml.kernel.org/r/1438262131-12725-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
b72e7464 |
|
04-Jun-2015 |
Borislav Petkov <bp@suse.de> |
x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers This header containing all MSRs and respective bit definitions got exported to userspace in conjunction with the big UAPI shuffle. But, it doesn't belong in the UAPI headers because userspace can do its own MSR defines and exporting them from the kernel blocks us from doing cleanups/renames in that header. Which is ridiculous - it is not kernel's job to export such a header and keep MSRs list and their names stable. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1433436928-31903-19-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
ba904635 |
|
29-Nov-2012 |
Will Auld <will.auld.intel@gmail.com> |
KVM: x86: Emulate IA32_TSC_ADJUST MSR CPUID.7.0.EBX[1]=1 indicates IA32_TSC_ADJUST MSR 0x3b is supported Basic design is to emulate the MSR by allowing reads and writes to a guest vcpu specific location to store the value of the emulated MSR while adding the value to the vmcs tsc_offset. In this way the IA32_TSC_ADJUST value will be included in all reads to the TSC MSR whether through rdmsr or rdtsc. This is of course as long as the "use TSC counter offsetting" VM-execution control is enabled as well as the IA32_TSC_ADJUST control. However, because hardware will only return the TSC + IA32_TSC_ADJUST + vmsc tsc_offset for a guest process when it does and rdtsc (with the correct settings) the value of our virtualized IA32_TSC_ADJUST must be stored in one of these three locations. The argument against storing it in the actual MSR is performance. This is likely to be seldom used while the save/restore is required on every transition. IA32_TSC_ADJUST was created as a way to solve some issues with writing TSC itself so that is not an option either. The remaining option, defined above as our solution has the problem of returning incorrect vmcs tsc_offset values (unless we intercept and fix, not done here) as mentioned above. However, more problematic is that storing the data in vmcs tsc_offset will have a different semantic effect on the system than does using the actual MSR. This is illustrated in the following example: The hypervisor set the IA32_TSC_ADJUST, then the guest sets it and a guest process performs a rdtsc. In this case the guest process will get TSC + IA32_TSC_ADJUST_hyperviser + vmsc tsc_offset including IA32_TSC_ADJUST_guest. While the total system semantics changed the semantics as seen by the guest do not and hence this will not cause a problem. Signed-off-by: Will Auld <will.auld@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
3fc808aa |
|
31-Oct-2012 |
Len Brown <len.brown@intel.com> |
x86 power: define RAPL MSRs The Run Time Average Power Limiting interface is currently model specific, present on Sandy Bridge and Ivy Bridge processors. These #defines correspond to documentation in the latest "Intel® 64 and IA-32 Architectures Software Developer Manual", plus some typos in that document corrected. Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
|
#
9c63a650 |
|
30-Oct-2012 |
Len Brown <len.brown@intel.com> |
tools/power/x86/turbostat: share kernel MSR #defines Now that turbostat is built in the kernel tree, it can share MSR #defines with the kernel. Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
|
#
279f1461 |
|
22-Oct-2012 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86: apic: Use tsc deadline for oneshot when available If the TSC deadline mode is supported, LAPIC timer one-shot mode can be implemented using IA32_TSC_DEADLINE MSR. An interrupt will be generated when the TSC value equals or exceeds the value in the IA32_TSC_DEADLINE MSR. This enables us to skip the APIC calibration during boot. Also, in xapic mode, this enables us to skip the uncached apic access to re-arm the APIC timer. As this timer ticks at the high frequency TSC rate, we use the TSC_DIVISOR (32) to work with the 32-bit restrictions in the clockevent API's to avoid 64-bit divides etc (frequency is u32 and "unsigned long" in the set_next_event(), max_delta limits the next event to 32-bit for 32-bit kernel). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: venki@google.com Cc: len.brown@intel.com Link: http://lkml.kernel.org/r/1350941878.6017.31.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
e717bf4e |
|
26-Sep-2012 |
Vince Weaver <vincent.weaver@maine.edu> |
perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU The following patch adds perf_event support for the Xeon-Phi PMU, as documented in the "Intel Xeon Phi Coprocessor (codename: Knights Corner) Performance Monitoring Units" manual. Even though it is a co-processor, a Phi runs a full Linux environment and can support performance counters. This is just barebones support, it does not add support for interesting new features such as the SPFLT intruction that allows starting/stopping events without entering the kernel. The PMU internally is just like that of an original Pentium, but a "P6-like" MSR interface is provided. The interface is different enough from a real P6 that it's not easy (or practical) to re-use the code in perf_event_p6.c Acked-by: Lawrence F Meadows <lawrence.f.meadows@intel.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1209261405320.8398@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
f594065f |
|
04-Sep-2012 |
Matthew Garrett <mjg@redhat.com> |
ACPI: Add fixups for AMD P-state figures Some AMD systems may round the frequencies in ACPI tables to 100MHz boundaries. We can obtain the real frequencies from MSRs, so add a quirk to fix these frequencies up on AMD systems. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
#
3dc9a633 |
|
04-Sep-2012 |
Matthew Garrett <mjg@redhat.com> |
acpi-cpufreq: Add support for modern AMD CPUs The programming model for P-states on modern AMD CPUs is very similar to that of Intel and VIA. It makes sense to consolidate this support into one driver rather than duplicating functionality between two of them. This patch adds support for AMDs with hardware P-state control to acpi-cpufreq. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
#
b7074f1f |
|
15-Dec-2011 |
Robert Richter <robert.richter@amd.com> |
perf/x86: Implement IBS interrupt handler This patch implements code to handle ibs interrupts. If ibs data is available a raw perf_event data sample is created and sent back to the userland. This patch only implements the storage of ibs data in the raw sample, but this could be extended in a later patch by generating generic event data such as the rip from the ibs sampling data. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323968199-9326-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
225ce539 |
|
09-Feb-2012 |
Stephane Eranian <eranian@google.com> |
perf/x86: Add Intel LBR MSR definitions This patch adds the LBR definitions for NHM/WSM/SNB and Core. It also adds the definitions for the architected LBR MSR: LBR_SELECT, LBRT_TOS. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
b90dfb04 |
|
22-Sep-2011 |
Liu, Jinsong <jinsong.liu@intel.com> |
x86: TSC deadline definitions This pre-defination is preparing for KVM tsc deadline timer emulation, but theirself are not kvm specific. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
4bb82178 |
|
14-Jul-2011 |
H. Peter Anvin <hpa@linux.intel.com> |
x86, msr: Fix typo in ENERGY_PERF_BIAS_POWERSAVE Fix a trivial typo in the name of the constant ENERGY_PERF_BIAS_POWERSAVE. This didn't cause trouble because this constant is not currently used for anything. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/tip-abe48b108247e9b90b4c6739662a2e5c765ed114@git.kernel.org
|
#
abe48b10 |
|
13-Jul-2011 |
Len Brown <len.brown@intel.com> |
x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS Since 2.6.36 (23016bf0d25), Linux prints the existence of "epb" in /proc/cpuinfo, Since 2.6.38 (d5532ee7b40), the x86_energy_perf_policy(8) utility has been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS. However, the typical BIOS fails to initialize the MSR, presumably because this is handled by high-volume shrink-wrap operating systems... Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8). As a result, WSM-EP, SNB, and later hardware from Intel will run in its default hardware power-on state (performance), which assumes that users care for performance at all costs and not for energy efficiency. While that is fine for performance benchmarks, the hardware's intended default operating point is "normal" mode... Initialize the MSR to the "normal" by default during kernel boot. x86_energy_perf_policy(8) is available to change the default after boot, should the user have a different preference. Signed-off-by: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980 Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>
|
#
b87a51ae |
|
25-May-2011 |
Nadav Har'El <nyh@il.ibm.com> |
KVM: nVMX: Implement reading and writing of VMX MSRs When the guest can use VMX instructions (when the "nested" module option is on), it should also be able to read and write VMX MSRs, e.g., to query about VMX capabilities. This patch adds this support. Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
fbc0db76 |
|
25-Mar-2011 |
Joerg Roedel <joerg.roedel@amd.com> |
KVM: SVM: Implement infrastructure for TSC_RATE_MSR This patch enhances the kvm_amd module with functions to support the TSC_RATE_MSR which can be used to set a given tsc frequency for the guest vcpu. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
5bbc097d |
|
15-Apr-2011 |
Joerg Roedel <joerg.roedel@amd.com> |
x86, amd: Disable GartTlbWlkErr when BIOS forgets it This patch disables GartTlbWlk errors on AMD Fam10h CPUs if the BIOS forgets to do is (or is just too old). Letting these errors enabled can cause a sync-flood on the CPU causing a reboot. The AMD BKDG recommends disabling GART TLB Wlk Error completely. This patch is the fix for https://bugzilla.kernel.org/show_bug.cgi?id=33012 on my machine. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.org Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Cc: <stable@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
91c9c3ed |
|
20-Jan-2011 |
john cooper <john.cooper@redhat.com> |
KVM: x86: handle guest access to BBL_CR_CTL3 MSR A correction to Intel cpu model CPUID data (patch queued) caused winxp to BSOD when booted with a Penryn model. This was traced to the CPUID "model" field correction from 6 -> 23 (as is proper for a Penryn class of cpu). Only in this case does the problem surface. The cause for this failure is winxp accessing the BBL_CR_CTL3 MSR which is unsupported by current kvm, appears to be a legacy MSR not fully characterized yet existing in current silicon, and is apparently carried forward in MSR space to accommodate vintage code as here. It is not yet conclusive whether this MSR implements any of its legacy functionality or is just an ornamental dud for compatibility. While I found no silicon version specific documentation link to this MSR, a general description exists in Intel's developer's reference which agrees with the functional behavior of other bootloader/kernel code I've examined accessing BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19 called out as "reserved" in the above document. So to minimally accommodate this MSR, kvm msr get will provide the equivalent mock data and kvm msr write will simply toss the guest passed data without interpretation. While this treatment of BBL_CR_CTL3 addresses the immediate problem, the approach may be modified pending clarification from Intel. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
a7e3ed1e |
|
02-Mar-2011 |
Andi Kleen <ak@linux.intel.com> |
perf: Add support for supplementary event registers Change logs against Andi's original version: - Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra) - Fixed a major event scheduling issue. There cannot be a ref++ on an event that has already done ref++ once and without calling put_constraint() in between. (Stephane Eranian) - Use thread_cpumask for percore allocation. (Lin Ming) - Use MSR names in the extra reg lists. (Lin Ming) - Remove redundant "c = NULL" in intel_percore_constraints - Fix comment of perf_event_attr::config1 Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event that can be used to monitor any offcore accesses from a core. This is a very useful event for various tunings, and it's also needed to implement the generic LLC-* events correctly. Unfortunately this event requires programming a mask in a separate register. And worse this separate register is per core, not per CPU thread. This patch: - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters. The extra parameters are passed by user space in the perf_event_attr::config1 field. - Adds support to the Intel perf_event core to schedule per core resources. This adds fairly generic infrastructure that can be also used for other per core resources. The basic code has is patterned after the similar AMD northbridge constraints code. Thanks to Stephane Eranian who pointed out some problems in the original version and suggested improvements. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
bfb53ccf |
|
15-Feb-2011 |
Len Brown <len.brown@intel.com> |
intel_idle: disable Atom/Lincroft HW C-state auto-demotion Just as we had to disable auto-demotion for NHM/WSM, we need to do the same for Atom (Lincroft version). In particular, auto-demotion will prevent Lincroft from entering the S0i3 idle power saving state. https://bugzilla.kernel.org/show_bug.cgi?id=25252 Signed-off-by: Len Brown <len.brown@intel.com>
|
#
14796fca |
|
18-Jan-2011 |
Len Brown <len.brown@intel.com> |
intel_idle: disable NHM/WSM HW C-state auto-demotion Hardware C-state auto-demotion is a mechanism where the HW overrides the OS C-state request, instead demoting to a shallower state, which is less expensive, but saves less power. Modern Linux should generally get exactly the states it requests. In particular, when a CPU is taken off-line, it must not be demoted, else it can prevent the entire package from reaching deep C-states. https://bugzilla.kernel.org/show_bug.cgi?id=25252 Signed-off-by: Len Brown <len.brown@intel.com>
|
#
9e76a97e |
|
03-Jan-2011 |
R, Durgadoss <durgadoss.r@intel.com> |
x86, hwmon: Add core threshold notification to therm_throt.c This patch adds code to therm_throt.c to notify core thermal threshold events. These thresholds are supported by the IA32_THERM_INTERRUPT register. The status/log for the same is monitored using the IA32_THERM_STATUS register. The necessary #defines are in msr-index.h. A call back is added to mce.h, to further notify the thermal stack, about the threshold events. Signed-off-by: Durgadoss R <durgadoss.r@intel.com> LKML-Reference: <D6D887BA8C9DFF48B5233887EF04654105C1251710@bgsmsx502.gar.corp.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
da169f5d |
|
24-Sep-2010 |
Robert Richter <robert.richter@amd.com> |
oprofile, x86: Add support for 6 counters (AMD family 15h) This patch adds support for up to 6 hardware counters for AMD family 15h cpus. There is a new MSR range for hardware counters beginning at MSRC001_0200 Performance Event Select (PERF_CTL0). Signed-off-by: Robert Richter <robert.richter@amd.com>
|
#
37db6c8f |
|
16-Nov-2010 |
Jan Beulich <JBeulich@novell.com> |
x86-64: Fix and clean up AMD Fam10 MMCONF enabling Candidate memory ranges were not calculated properly (start addresses got needlessly rounded down, and end addresses didn't get rounded up at all), address comparison for secondary CPUs was done on only part of the address, and disabled status wasn't tracked properly. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <4CE24DF40200007800022737@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
b9a52c4b |
|
08-Sep-2010 |
Jes Sorensen <Jes.Sorensen@redhat.com> |
x86: Define MSR_EBC_FREQUENCY_ID Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
25da6950 |
|
21-Sep-2010 |
Robert Richter <robert.richter@amd.com> |
oprofile, x86: Add support for IBS branch target address reporting This patch adds support for IBS branch target address reporting. A new MSR (MSRC001_103B IBS Branch Target Address) has been added that provides the logical address in canonical form for the branch target. The size of the IBS sample that is transferred to the userland has been increased. For backward compatibility, the userland daemon must explicit enable the feature by writing to the oprofilefs file ibs_op/branch_target After enabling branch target address reporting, the userland daemon must handle the extended size of the IBS sample. Signed-off-by: Robert Richter <robert.richter@amd.com>
|
#
eec4b140 |
|
05-May-2010 |
Joerg Roedel <joerg.roedel@amd.com> |
KVM: SVM: Allow EFER.LMSLE to be set with nested svm This patch enables setting of efer bit 13 which is allowed in all SVM capable processors. This is necessary for the SLES11 version of Xen 4.0 to boot with nested svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
9792db61 |
|
29-Jul-2010 |
Fenghua Yu <fenghua.yu@intel.com> |
x86, cpu: Package Level Thermal Control, Power Limit Notification definitions Add package level thermal and power limit feature support. The two MSRs and features are new starting with Intel's Sandy Bridge processor. Please check Intel 64 and IA-32 Architectures SDMV Vol 3A 14.5.6 Power Limit Notification and 14.6 Package Level Thermal Management. This patch also fixes a bug which defines reverse THERM_INT_LOW_ENABLE bit and THERM_INT_HIGH_ENABLE bit. [ hpa: fixed up against current tip:x86/cpu ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> LKML-Reference: <1280448826-12004-2-git-send-email-fenghua.yu@intel.com> Reviewed-by: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
8c06585d |
|
17-Jul-2010 |
Brian Gerst <brgerst@gmail.com> |
x86: Remove redundant K6 MSRs MSR_K6_EFER is unused, and MSR_K6_STAR is redundant with MSR_STAR. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1279371808-24804-1-git-send-email-brgerst@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
23016bf0 |
|
03-Jun-2010 |
Venkatesh Pallipadi <venki@google.com> |
x86: Look for IA32_ENERGY_PERF_BIAS support The new IA32_ENERGY_PERF_BIAS MSR allows system software to give hardware a hint whether OS policy favors more power saving, or more performance. This allows the OS to have some influence on internal hardware power/performance tradeoffs where the OS has previously had no influence. The support for this feature is indicated by CPUID.06H.ECX.bit3, as documented in the Intel Architectures Software Developer's Manual. This patch discovers support of this feature and displays it as "epb" in /proc/cpuinfo. Signed-off-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <alpine.LFD.2.00.1006032310160.6669@localhost.localdomain> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
1f9a0bd4 |
|
08-Jun-2010 |
Huang Ying <ying.huang@intel.com> |
x86, mce: Rename MSR_IA32_MCx_CTL2 value Rename CMCI_EN to MCI_CTL2_CMCI_EN and CMCI_THRESHOLD_MASK to MCI_CTL2_CMCI_THRESHOLD_MASK to make naming consistent. Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1275977348.3444.659.camel@yhuang-dev.sh.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
67ec6607 |
|
17-May-2010 |
Joerg Roedel <joerg.roedel@amd.com> |
KVM: SVM: Implement workaround for Erratum 383 This patch implements a workaround for AMD erratum 383 into KVM. Without this erratum fix it is possible for a guest to kill the host machine. This patch implements the suggested workaround for hypervisors which will be published by the next revision guide update. [jan: fix overflow warning on i386] [xiao: fix unused variable warning] Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
a321cedb |
|
24-May-2010 |
Carsten Emde <C.Emde@osadl.org> |
drivers/hwmon/coretemp.c: get TjMax value from MSR The MSR IA32_TEMPERATURE_TARGET contains the TjMax value in the newer Intel processors. Signed-off-by: Huaxu Wan <huaxu.wan@linux.intel.com> Signed-off-by: Carsten Emde <C.Emde@osadl.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Yong Wang <yong.y.wang@linux.intel.com> Cc: Rudolf Marek <r.marek@assembler.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
cafd6659 |
|
28-Apr-2010 |
Shane Wang <shane.wang@intel.com> |
KVM: VMX: enable VMXON check with SMX enabled (Intel TXT) Per document, for feature control MSR: Bit 1 enables VMXON in SMX operation. If the bit is clear, execution of VMXON in SMX operation causes a general-protection exception. Bit 2 enables VMXON outside SMX operation. If the bit is clear, execution of VMXON outside SMX operation causes a general-protection exception. This patch is to enable this kind of check with SMX for VMXON in KVM. Signed-off-by: Shane Wang <shane.wang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7c5ecaf7 |
|
25-Mar-2010 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf, x86: Clean up debugctlmsr bit definitions Move all debugctlmsr thingies into msr-index.h Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100325135413.861425293@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
035a02c1 |
|
18-Mar-2010 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86, amd: Restrict usage of c1e_idle() Currently c1e_idle returns true for all CPUs greater than or equal to family 0xf model 0x40. This covers too many CPUs. Meanwhile a respective erratum for the underlying problem was filed (#400). This patch adds the logic to check whether erratum #400 applies to a given CPU. Especially for CPUs where SMI/HW triggered C1e is not supported, c1e_idle() doesn't need to be used. We can check this by looking at the respective OSVW bit for erratum #400. Cc: <stable@kernel.org> # .32.x .33.x Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100319110922.GA19614@alberich.amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
cb7d6b50 |
|
18-Mar-2010 |
Lin Ming <ming.m.lin@intel.com> |
perf, x86: Add cache events for the Pentium-4 PMU Move the HT bit setting code from p4_pmu_event_map to p4_hw_config. So the cache events can get HT bit set correctly. Tested on my P4 desktop, below 6 cache events work: L1-dcache-load-misses LLC-load-misses dTLB-load-misses dTLB-store-misses iTLB-loads iTLB-load-misses Signed-off-by: Lin Ming <ming.m.lin@intel.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1268908392.13901.128.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
9d260ebc |
|
16-Dec-2009 |
Andreas Herrmann <herrmann.der.user@googlemail.com> |
x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space Use NodeId MSR to get NodeId and number of nodes per processor. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20091216144355.GB28798@alberich.amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
5df97400 |
|
15-Dec-2009 |
Sheng Yang <sheng@linux.intel.com> |
x86: Add IA32_TSC_AUX MSR and use it Clean up write_tsc() and write_tscp_aux() by replacing hardcoded values. No change in functionality. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> LKML-Reference: <1260942485-19156-4-git-send-email-sheng@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
0367b433 |
|
15-Jun-2009 |
Alexander Graf <agraf@suse.de> |
x86: Add definition for IGNNE MSR Hyper-V accesses MSR_IGNNE while running under KVM. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
f3a0867b |
|
28-Jul-2009 |
Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> |
x86, mce: fix reporting of Thermal Monitoring mechanism enabled Early Pentium M models use different method for enabling TM2 (per paragraph 13.5.2.3 of the "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide, Part 1"). Tested on the affected Pentium M variant (model == 13). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
a2d32bcb |
|
08-Jul-2009 |
Andi Kleen <andi@firstfloor.org> |
x86: mce: macros to compute banks MSRs Instead of open coded calculations for bank MSRs hide the indexing of higher banks MCE register MSRs in new macros. No semantic changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
44973998 |
|
01-Jul-2009 |
Jaswinder Singh Rajput <jaswinder@kernel.org> |
x86: Remove double declaration of MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1 MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1 is already declared in msr-index.h. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> LKML-Reference: <1246450778.6940.8.camel@hpdv5.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
ba2d0f2b |
|
07-Apr-2009 |
Thomas Gleixner <tglx@linutronix.de> |
x86, mce: Cleanup symbols in intel thermal codes Decode magic constants and turn them into symbols. [ Cleanup to use symbols already exists - HS ] [ Impact: cleanup ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
bf8b9a63 |
|
08-May-2009 |
Jaswinder Singh Rajput <jaswinder@kernel.org> |
x86: msr-index.h remove duplicate MSR C001_0015 declaration MSRC001_0015 Hardware Configuration Register (HWCR) is already defined as MSR_K7_HWCR. And HWCR is available for >= K7. So MSR_K8_HWCR is not required and no-one is using it. [ Impact: cleanup, no object code change ] Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
|
#
d2062693 |
|
02-Feb-2009 |
Alexander Graf <agraf@suse.de> |
x86: Add EFER descriptions for FFXSR AMD k10 includes support for the FFXSR feature, which leaves out XMM registers on FXSAVE/FXSAVE when the EFER_FFXSR bit is set in EFER. The CPUID feature bit exists already, but the EFER bit is missing currently, so this patch adds it to the list of known EFER bits. Signed-off-by: Alexander Graf <agraf@suse.de> CC: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
9962d032 |
|
25-Nov-2008 |
Alexander Graf <agraf@suse.de> |
KVM: SVM: Move EFER and MSR constants to generic x86 code MSR_EFER_SVME_MASK, MSR_VM_CR and MSR_VM_HSAVE_PA are set in KVM specific headers. Linux does have nice header files to collect EFER bits and MSR IDs, so IMHO we should put them there. While at it, I also changed the naming scheme to match that of the other defines. (introduced in v6) Acked-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
03195c6b |
|
12-Feb-2009 |
Andi Kleen <andi@firstfloor.org> |
x86, mce, cmci: define MSR names and fields for new CMCI registers Impact: New register definitions only CMCI means support for raising an interrupt on a corrected machine check event instead of having to poll for it. It's a new feature in Intel Nehalem CPUs available on some machine check banks. For details see the IA32 SDM Vol3a 14.5 Define the registers for it as a preparation for further patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
bdf21a49 |
|
21-Jan-2009 |
H. Peter Anvin <hpa@linux.intel.com> |
x86: add MSR_IA32_MISC_ENABLE bits to <asm/msr-index.h> Impact: None (new bit definitions currently unused) Add bit definitions for the MSR_IA32_MISC_ENABLE MSRs to <asm/msr-index.h>. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
29d0887f |
|
16-Dec-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: microcode_amd: replace inline asm by common rdmsr/wrmsr functions Impact: cleanup Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
1965aae3 |
|
22-Oct-2008 |
H. Peter Anvin <hpa@zytor.com> |
x86: Fix ASM_X86__ header guards Change header guards named "ASM_X86__*" to "_ASM_X86_*" since: a. the double underscore is ugly and pointless. b. no leading underscore violates namespace constraints. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
#
bb898558 |
|
17-Aug-2008 |
Al Viro <viro@zeniv.linux.org.uk> |
x86, um: ... and asm-x86 move Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|