commit 08a82824
Author: Sean Christopherson <seanjc@google.com>
Date:   08-Mar-2024

KVM: selftests: Verify post-RESET value of PERF_GLOBAL_CTRL in PMCs test

Add a guest assert in the PMU counters test to verify that KVM stuffs
the vCPU's post-RESET value to globally enable all general purpose
counters. Per Intel's SDM, IA32_PERF_GLOBAL_CTRL:

  Sets bits n-1:0 and clears the upper bits.

and

  Where "n" is the number of general-purpose counters available in the
  processor.

For the edge case where there are zero GP counters, follow the spirit
of the architecture, not the SDM's literal wording, which doesn't
account for this possibility and would require the CPU to set _all_
bits in PERF_GLOBAL_CTRL.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
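
A minimal guest-side sketch of the post-RESET assert described above,
assuming ring-0 guest context; GUEST_ASSERT_EQ stands in for the
selftests' guest assertion macro, and nr_gp_counters is assumed to have
been read from CPUID.0AH:EAX[15:8]:

    #include <stdint.h>

    #define MSR_CORE_PERF_GLOBAL_CTRL 0x38f

    static inline uint64_t rdmsr(uint32_t msr)
    {
        uint32_t lo, hi;

        __asm__ __volatile__("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
        return lo | ((uint64_t)hi << 32);
    }

    static void check_post_reset_global_ctrl(unsigned int nr_gp_counters)
    {
        /* Bits n-1:0 set for n GP counters; all-zeros when there are none. */
        uint64_t expected = nr_gp_counters ? (1ull << nr_gp_counters) - 1 : 0;

        GUEST_ASSERT_EQ(rdmsr(MSR_CORE_PERF_GLOBAL_CTRL), expected);
    }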

commit 4a447b13
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
Date:   17-Feb-2024

KVM: selftests: Test top-down slots event in x86's pmu_counters_test

Although fixed counter 3 and its exclusive pseudo-slots event are not
supported by KVM yet, the architectural slots event is supported by KVM
and can be programmed on any GP counter. Thus add validation for this
architectural slots event.

The top-down slots event "counts the total number of available slots
for an unhalted logical processor, and increments by machine-width of
the narrowest pipeline as employed by the Top-down Microarchitecture
Analysis method."

A slot is an abstract concept that indicates how many uops (decoded
from instructions) can be processed simultaneously (per cycle) in
hardware. In the Top-down Microarchitecture Analysis (TMA) method, the
processor is divided into two parts, the front-end and the back-end.
Assume a processor with a classic 5-stage pipeline: fetch, decode,
execute, memory access, and register writeback. The first two stages
(fetch/decode) belong to the front-end and the latter three stages to
the back-end.

In modern Intel processors, a complicated instruction is decoded into
several uops (micro-operations), which can then be processed
simultaneously to improve performance. Thus, if a processor can decode
and dispatch 4 uops in the front-end and execute 4 uops in the back-end
simultaneously (per cycle), the machine-width of the processor is 4 and
the processor has 4 top-down slots per cycle. If a slot is spare and
can be used to process a new upcoming uop, the slot is available; if a
uop occupies a slot for several cycles and can't be retired (e.g.
blocked by a memory access), the slot is stalled and unavailable.

Considering the tested instruction sequence can't be macro-fused on x86
platforms, the measured slots count should not be less than
NUM_INSNS_RETIRED. Thus assert the slots count against
NUM_INSNS_RETIRED.

pmu_counters_test passes with this patch on Intel Sapphire Rapids.

For more information about the TMA method, please refer to the link
below:

https://www.intel.com/content/www/us/en/docs/vtune-profiler/cookbook/2023-0/top-down-microarchitecture-analysis-method.html

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240218043003.2424683-1-dapeng1.mi@linux.intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
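
A guest-side sketch of programming the architectural slots event on GP
counter 0 and asserting the result; the 0xa4/0x01 event/umask encoding
is the SDM's architectural top-down slots event, rdmsr() is the helper
from the earlier sketch, and measured_code()/NUM_INSNS_RETIRED stand in
for the test's measured section. It also relies on the post-RESET
PERF_GLOBAL_CTRL value enabling all GP counters (see the first entry):

    static inline void wrmsr(uint32_t msr, uint64_t val)
    {
        __asm__ __volatile__("wrmsr" :: "c"(msr), "a"((uint32_t)val),
                             "d"((uint32_t)(val >> 32)));
    }

    #define MSR_P6_EVNTSEL0   0x186
    #define MSR_P6_PERFCTR0   0xc1
    #define EVENTSEL_SLOTS    0x01a4    /* event 0xa4, umask 0x01 */
    #define EVENTSEL_USR      (1u << 16)
    #define EVENTSEL_OS       (1u << 17)
    #define EVENTSEL_EN       (1u << 22)

    static void check_slots_event(void)
    {
        wrmsr(MSR_P6_PERFCTR0, 0);
        wrmsr(MSR_P6_EVNTSEL0,
              EVENTSEL_SLOTS | EVENTSEL_USR | EVENTSEL_OS | EVENTSEL_EN);

        measured_code();    /* NUM_INSNS_RETIRED insns, none macro-fusible */

        wrmsr(MSR_P6_EVNTSEL0, 0);

        /* Every retired instruction consumes at least one slot. */
        GUEST_ASSERT(rdmsr(MSR_P6_PERFCTR0) >= NUM_INSNS_RETIRED);
    }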

commit a8a37f55
Author: Sean Christopherson <seanjc@google.com>
Date:   09-Jan-2024

KVM: selftests: Extend PMU counters test to validate RDPMC after WRMSR

Extend the read/write PMU counters subtest to verify that RDPMC also
reads back the written value. Opportunistically verify that attempting
to use the "fast" mode of RDPMC fails, as the "fast" flag is only
supported by non-architectural PMUs, which KVM doesn't virtualize.

Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-30-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
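
A sketch of the read-back and "fast"-mode checks, assuming guest
context and a selftests-style rdpmc_safe() helper that returns the
exception vector instead of faulting (an assumption about the
framework); the "fast" flag is bit 31 of ECX:

    #define RDPMC_FAST_FLAG (1u << 31)
    #define GP_VECTOR       13

    static inline uint64_t rdpmc(uint32_t idx)
    {
        uint32_t lo, hi;

        __asm__ __volatile__("rdpmc" : "=a"(lo), "=d"(hi) : "c"(idx));
        return lo | ((uint64_t)hi << 32);
    }

    static void check_rdpmc_readback(uint32_t pmc_idx, uint64_t written)
    {
        uint64_t val;

        /* RDPMC must read back the value written via WRMSR. */
        GUEST_ASSERT_EQ(rdpmc(pmc_idx), written);

        /* The "fast" flag is non-architectural; expect #GP. */
        GUEST_ASSERT_EQ(rdpmc_safe(pmc_idx | RDPMC_FAST_FLAG, &val), GP_VECTOR);
    }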

commit cd34fd8c
Author: Sean Christopherson <seanjc@google.com>
Date:   09-Jan-2024

KVM: selftests: Test PMC virtualization with forced emulation

Extend the PMU counters test to use forced emulation to verify that KVM
emulates counter events for instructions retired and branches retired.
Force emulation for only a subset of the measured code to test that KVM
does the right thing when mixing perf events with emulated events.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-27-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
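
A sketch of mixing natively executed and force-emulated instructions in
a measured section; KVM_FEP is the selftests' forced emulation prefix,
which only works when the kvm module is loaded with forced emulation
enabled (e.g. kvm.force_emulation_prefix=1):

    /* KVM's forced emulation prefix: ud2 followed by the bytes "kvm". */
    #define KVM_FEP "ud2; .byte 'k', 'v', 'm';"

    static void measured_code_with_fep(void)
    {
        __asm__ __volatile__(
            "nop\n\t"               /* executed natively by hardware */
            KVM_FEP "nop\n\t"       /* intercepted and emulated by KVM */
        );
        /* Both NOPs must show up in the instructions-retired count. */
    }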

commit c85e9867
Author: Sean Christopherson <seanjc@google.com>
Date:   09-Jan-2024

KVM: selftests: Add a helper to query if the PMU module param is enabled

Add a helper to probe KVM's "enable_pmu" param; open coding strings in
multiple places is just asking for false negatives and/or runtime
errors due to typos.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-23-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
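
A minimal userspace sketch of such a probe, assuming the param is
exposed at the usual sysfs path for the kvm module (the helper name is
illustrative, not the selftests' actual API):

    #include <stdbool.h>
    #include <stdio.h>

    /* Returns true if /sys/module/kvm/parameters/enable_pmu reads 'Y'/'1'. */
    static bool kvm_pmu_param_enabled(void)
    {
        FILE *f = fopen("/sys/module/kvm/parameters/enable_pmu", "r");
        int c;

        if (!f)
            return false;
        c = fgetc(f);
        fclose(f);
        return c == 'Y' || c == '1';
    }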

commit b55e7adf
Author: Sean Christopherson <seanjc@google.com>
Date:   09-Jan-2024

KVM: selftests: Expand PMU counters test to verify LLC events

Expand the PMU counters test to verify that LLC references and misses
have non-zero counts when the code being executed while the LLC
event(s) is active is evicted via CLFLUSH{,OPT}.

Note, CLFLUSH{,OPT} requires a fence of some kind to ensure the cache
lines are flushed before execution continues. Use MFENCE for simplicity
(performance is not a concern).

Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-22-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
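
A sketch of the eviction step, assuming 64-byte cache lines; the real
test may prefer CLFLUSHOPT when it's enumerated, but plain CLFLUSH
suffices to show the flush-then-fence pattern the changelog describes:

    #include <stddef.h>

    /* Evict a range of code from all cache levels, one line at a time. */
    static void evict_code(const void *start, size_t size)
    {
        size_t i;

        for (i = 0; i < size; i += 64)
            __asm__ __volatile__("clflush (%0)"
                                 :: "r"((const char *)start + i) : "memory");

        /* CLFLUSH{,OPT} is weakly ordered; fence before re-executing. */
        __asm__ __volatile__("mfence" ::: "memory");
    }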

commit 787071fd
Author: Jinrong Liang <cloudliang@tencent.com>
Date:   09-Jan-2024

KVM: selftests: Add functional test for Intel's fixed PMU counters

Extend the fixed counters test to verify that supported counters can
actually be enabled in the control MSRs, that unsupported counters
cannot, and that enabled counters actually count.

Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
[sean: fold into the rd/wr access test, massage changelog]
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
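
A sketch of what "enabling" a fixed counter entails, reusing the
rdmsr()/wrmsr() helpers from the earlier sketches; per the SDM, each
fixed counter i has a 4-bit control field in IA32_FIXED_CTR_CTRL and a
global enable bit at position 32 + i in IA32_PERF_GLOBAL_CTRL:

    #define MSR_CORE_PERF_FIXED_CTR_CTRL  0x38d
    #define MSR_CORE_PERF_GLOBAL_CTRL     0x38f

    #define FIXED_CTR_ENABLE_USER_OS      0x3   /* count in CPL 0 and CPL > 0 */

    static void enable_fixed_counter(unsigned int i)
    {
        /* Program the counter's 4-bit field in FIXED_CTR_CTRL... */
        wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL,
              rdmsr(MSR_CORE_PERF_FIXED_CTR_CTRL) |
              ((uint64_t)FIXED_CTR_ENABLE_USER_OS << (i * 4)));

        /* ...and set its global enable bit, at position 32 + i. */
        wrmsr(MSR_CORE_PERF_GLOBAL_CTRL,
              rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) | (1ull << (32 + i)));
    }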

commit c7d7c76e
Author: Jinrong Liang <cloudliang@tencent.com>
Date:   09-Jan-2024

KVM: selftests: Test consistency of CPUID with num of fixed counters

Extend the PMU counters test to verify KVM emulation of fixed counters
in addition to general purpose counters. Fixed counters add an extra
wrinkle in the form of an extra supported bitmask. Thus quoth the SDM:

  fixed-function performance counter 'i' is supported if ECX[i] ||
  (EDX[4:0] > i)

Test that KVM handles a counter being available through either method.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
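
The SDM rule above, as a self-contained predicate over the raw
CPUID.0AH:ECX and CPUID.0AH:EDX values (the function name is
illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Fixed counter 'i' is supported if its bit is set in the CPUID.0AH:ECX
     * bitmask, OR if the counter count in CPUID.0AH:EDX[4:0] exceeds i.
     */
    static bool fixed_counter_supported(uint32_t ecx, uint32_t edx,
                                        unsigned int i)
    {
        return (ecx & (1u << i)) || ((edx & 0x1f) > i);
    }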

commit 7137cf75
Author: Jinrong Liang <cloudliang@tencent.com>
Date:   09-Jan-2024

KVM: selftests: Test consistency of CPUID with num of gp counters

Add a test to verify that KVM correctly emulates MSR-based accesses to
general purpose counters based on guest CPUID, e.g. that accesses to
non-existent counters #GP and accesses to existent counters succeed.

Note, for compatibility reasons, KVM does not emulate #GP when
MSR_P6_PERFCTR[0|1] is not present (writes should be dropped).

Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
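
A guest-side sketch of the access check, assuming a selftests-style
wrmsr_safe() helper that returns the exception vector; the legacy-MSR
carve-out mirrors the compatibility note above:

    #include <stdbool.h>
    #include <stdint.h>

    #define GP_VECTOR 13

    static void check_gp_counter_access(uint32_t msr, bool exists,
                                        bool is_p6_legacy)
    {
        uint8_t vector = wrmsr_safe(msr, 0xffff);

        if (exists || is_p6_legacy)
            GUEST_ASSERT(!vector);  /* MSR_P6_PERFCTR[0|1]: write dropped */
        else
            GUEST_ASSERT(vector == GP_VECTOR);
    }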

commit 3e26b825
Author: Jinrong Liang <cloudliang@tencent.com>
Date:   09-Jan-2024

KVM: selftests: Test Intel PMU architectural events on fixed counters

Extend the PMU counters test to validate architectural events using
fixed counters. The core logic is largely the same, the biggest
difference being that if a fixed counter exists, its associated event
is available (the SDM doesn't explicitly state this to be true, but
it's KVM's ABI and letting software program a fixed counter that
doesn't actually count would be quite bizarre).

Note, fixed counters rely on PERF_GLOBAL_CTRL.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
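
A sketch of measuring an architectural event on a fixed counter,
reusing enable_fixed_counter(), the MSR helpers, and measured_code()
from the sketches above; fixed counter 0 architecturally counts
instructions retired, and per the note above nothing counts until its
PERF_GLOBAL_CTRL enable bit is set:

    #define MSR_CORE_PERF_FIXED_CTR0 0x309

    static uint64_t count_insns_on_fixed_ctr0(void)
    {
        wrmsr(MSR_CORE_PERF_FIXED_CTR0, 0);
        enable_fixed_counter(0);    /* FIXED_CTR_CTRL + GLOBAL_CTRL bit 32 */

        measured_code();            /* the test's handcoded section */

        /* Globally disable before reading to get a stable count. */
        wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
        return rdmsr(MSR_CORE_PERF_FIXED_CTR0);
    }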

commit 4f1bd6b1
Author: Jinrong Liang <cloudliang@tencent.com>
Date:   09-Jan-2024

KVM: selftests: Test Intel PMU architectural events on gp counters

Add test cases to verify that Intel's Architectural PMU events work as
expected when they are available according to guest CPUID. Iterate over
a range of sane PMU versions, with and without full-width writes
enabled, and over interesting combinations of lengths/masks for the bit
vector that enumerates unavailable events.

Test up to vPMU version 5, i.e. the current architectural max. KVM only
officially supports up to version 2, but the behavior of the counters
is backwards compatible, i.e. KVM shouldn't do something completely
different for a higher, architecturally-defined vPMU version. Verify
KVM behavior against the effective vPMU version, e.g. advertising vPMU
5 when KVM only supports vPMU 2 shouldn't magically unlock vPMU 5
features.

According to the Intel SDM, the number of architectural events is
reported through CPUID.0AH:EAX[31:24], and architectural event x is
supported if EBX[x]=0 && EAX[31:24]>x.

Handcode the entirety of the measured section so that the test can
precisely assert on the number of instructions and branches retired.

Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
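
The CPUID availability rule above as a self-contained predicate (the
function name is illustrative); note that CPUID.0AH:EBX is a vector of
*unavailable* events, so a clear bit plus a large enough event count
means the event is supported:

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Architectural event 'x' is available if the enumerated event count,
     * CPUID.0AH:EAX[31:24], exceeds x AND the event's "unavailable" bit in
     * CPUID.0AH:EBX is clear.
     */
    static bool arch_event_available(uint32_t eax, uint32_t ebx, unsigned int x)
    {
        uint8_t nr_arch_events = (eax >> 24) & 0xff;

        return nr_arch_events > x && !(ebx & (1u << x));
    }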