355094 |
25-Nov-2019 |
kib |
MFC r354828: Add x86 msr tweak KPI. |
350353 |
26-Jul-2019 |
kib |
MFC r348544: hwpmc_intel: List all Silvermont ids.
PR: 238310 |
345326 |
20-Mar-2019 |
kib |
MFC r345078: hwpmc/core: Adapt to upcoming Skylake TSX errata. |
345197 |
15-Mar-2019 |
kib |
MFC r345074: Remove useless version check. |
343350 |
23-Jan-2019 |
markj |
MFC r343265: hwpmc: Plug memory disclosures from PMC_OP_{GETPMCINFO,GETCPUINFO}.
admbugs: 765 |
339771 |
26-Oct-2018 |
mmacy |
fix up more issues introduced by failing to have run TB before r339767 |
339769 |
26-Oct-2018 |
mmacy |
fix i386 breakage caused by r339767 |
339767 |
26-Oct-2018 |
mmacy |
hwpmc: Enable hwpmc support for AMD Family 17H devices
Adds new counters and events for family 17H devices. Adds libpmc support for family 17H devices.
Direct commit to 11 as this is supported by way of JSON counter descriptions on 12 & HEAD.
Submitted by: Girish Nandibasappa Differential Revision: https://reviews.freebsd.org/D17464 |
331722 |
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re) |
331319 |
21-Mar-2018 |
kib |
MFC r328087 (by fabient): Fix pmcstat exit from kernel introduced by r325275.
PR: 223689 |
330897 |
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg |
326009 |
20-Nov-2017 |
kib |
MFC r325759: Do not leak PMC_PO_OWNS_LOGFILE on error. |
325890 |
16-Nov-2017 |
kib |
MFC r325758: Style bug. |
325756 |
13-Nov-2017 |
kib |
MFC r325671: Check that the pmc index is less than the number of hardware PMCs, instead of asserting the condition. |
325551 |
08-Nov-2017 |
kib |
MFC r325277: Do not run pmclog_configure_log() without pmc_sx protection. |
325550 |
08-Nov-2017 |
kib |
MFC r325276: Be protective and check the po_file validity before dropping the ref. |
325549 |
08-Nov-2017 |
kib |
MFC r325275: In hwpmc, do not double-close the logging file. |
325548 |
08-Nov-2017 |
kib |
MFC r325274: There is no use for dropping Giant in the pmc syscall. |
325547 |
08-Nov-2017 |
kib |
MFC r325273: Minor style tweaks. |
325546 |
08-Nov-2017 |
kib |
MFC r325271: Use designated initializers for pmc sysent and module data. |
323799 |
20-Sep-2017 |
kib |
MFC r323230: Skylake server core PMC support for hwpmc(4). |
322532 |
15-Aug-2017 |
kib |
MFC r322256: Fix logic error in the assert, causing the condition to be always true.
PR: 217741 |
311960 |
12-Jan-2017 |
gnn |
MFC 311224
Fix PMC architecture check to handle later IPAs including Skylake.
Tested with tools/test/hwpmc/pmctest.py
Obtained from: Oliver Pinter |
310064 |
14-Dec-2016 |
avg |
MFC r308480: pmc_process_csw_out: ignore deleted counters |
308758 |
17-Nov-2016 |
avg |
MFC r308101: hwpmc: fix a race between amd_stop_pmc and amd_intr |
305675 |
09-Sep-2016 |
jhb |
MFC 303720: Apply the fix from r232612 to fixed function counters. |
302408 |
08-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
300902 |
28-May-2016 |
andrew |
Don't panic in hwpmc when stopping sampling.
When hwpmc stops sampling it will set the pm_state to something other than PMC_STATE_RUNNING. This means the following sequence can happen:
CPU 0: Enter the interrupt handler
CPU 0: Set the thread TDP_CALLCHAIN pflag
CPU 1: Stop sampling
CPU 0: Call pmc_process_samples, sampling is stopped so clears ps_nsamples
CPU 0: Finishes interrupt processing with the TDP_CALLCHAIN flag set
CPU 0: Call pmc_capture_user_callchain to capture the user call chain
CPU 0: Find all the pmc samples are free so no call chains need to be captured
CPU 0: KASSERT because of this
This fixes the issue by checking if any of the samples have been stopped and including this in the KASSERT.
PR: 204273 Reviewed by: bz, gnn Obtained from: ABT Systems Ltd Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6581
|
299746 |
14-May-2016 |
jhb |
Add an EARLY_AP_STARTUP option to start APs earlier during boot.
Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot.
This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed.
This last feature fixes a problem with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs.
However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu.
Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case.
As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code.
These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).
PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix
|
299353 |
10-May-2016 |
trasz |
Remove misc NULL checks after M_WAITOK allocations.
MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
298955 |
03-May-2016 |
pfg |
sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
|
298931 |
02-May-2016 |
pfg |
etc: minor spelling fixes.
Mostly comments but also some user-visible strings.
MFC after: 2 weeks
|
298431 |
21-Apr-2016 |
pfg |
sys: use our nitems() macro when param.h is available.
This should cover all the remaining cases in the kernel.
Discussed in: freebsd-current
|
298411 |
21-Apr-2016 |
pfg |
Remove slightly used const values that can be replaced with nitems().
Suggested by: jhb
|
298365 |
20-Apr-2016 |
pfg |
Remove unused e500_event_codes_size.
Found by: jhb
|
297793 |
10-Apr-2016 |
pfg |
Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
|
297730 |
09-Apr-2016 |
jhibbits |
Fix a masking bug for e500 PMC.
No idea how this slipped through my regression testing. pe_code is the event to count, pe_cpu is the CPU family mask.
|
295560 |
12-Feb-2016 |
kib |
If full-width writes to the performance monitoring counters are supported, use the full-width alias MSRs for writes. This fixes the "[pmc,X] negative increment" assertion on the context switch when the clipped counter value is sign-extended.
Add definitions for the MSR IA32_PERF_CAPABILITIES needed to detect the feature.
PR: 207068 Submitted by: joss.upton@yahoo.com MFC after: 2 weeks
|
295558 |
12-Feb-2016 |
kib |
Remove tautological cast.
PR: 207068 Submitted by: joss.upton@yahoo.com MFC after: 2 weeks
|
295435 |
09-Feb-2016 |
kib |
Rename P_KTHREAD struct proc p_flag to P_KPROC.
I left as is an apparent bug in ntoskrnl_var.h:AT_PASSIVE_LEVEL() definition.
Suggested by: jhb Sponsored by: The FreeBSD Foundation
|
295352 |
06-Feb-2016 |
kib |
Do not call vn_fullpath(9) (through the pmc_getfilename() wrapper) when its result is immediately ignored, i.e. for kernel processes forked from the user process. Do not test for non-null before freeing string.
Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
295041 |
29-Jan-2016 |
br |
Welcome the RISC-V 64-bit kernel.
This is the final step required to compile and run the RISC-V kernel and userland from HEAD.
RISC-V is a completely open ISA that is freely available to academia and industry.
Thanks to all the people involved! Special thanks to Andrew Turner, David Chisnall, Ed Maste, Konstantin Belousov, John Baldwin and Arun Thomas for their help. Thanks to Robert Watson for organizing this project.
This project is sponsored by UK Higher Education Innovation Fund (HEIF5) and DARPA CTSRD project at the University of Cambridge Computer Laboratory.
FreeBSD/RISC-V project home: https://wiki.freebsd.org/riscv
Reviewed by: andrew, emaste, kib Relnotes: Yes Sponsored by: DARPA, AFRL Sponsored by: HEIF5 Differential Revision: https://reviews.freebsd.org/D4982
|
294197 |
17-Jan-2016 |
jhibbits |
e5500 HWPMC is identical to e500mc, so add support check for it.
|
292070 |
11-Dec-2015 |
rrs |
More fixes in the various intel processors, fixing missing IAP_F_FM's as well as incorrect umask specifications for some of the new Broadwell/Skylake PMC's. Also silvermont had a *lot* of missing IAP_F_FM.
Sponsored by: Netflix Inc.
|
292033 |
09-Dec-2015 |
rrs |
Fix the tunable in logging so that if it's pre-11 we have the proper line so the tunable is present.
Sponsored by: Netflix Inc.
|
291494 |
30-Nov-2015 |
rrs |
Add support for Intel Skylake and Intel Broadwell PMC's. The Broadwell PMC's have been tested on the Broadwell-Xeon with a hacked up version of pmcstudy -T. I still need to circle back and add in to pmcstudy all the new tests from the Broadwell Vtune guide (for the hacked up version I just made it so I could run the -T option). The Skylake CPU is not yet available (even though Intel is advertising it .. imagine that). The Skylake PMC's will need to be tested once we can get a sample skylake CPU :-)
Sponsored by: Netflix Inc.
|
290930 |
16-Nov-2015 |
jtl |
Improve accuracy of PMC sampling frequency
The code tracks a counter which is the number of events until the next sample. On context switch in, it loads the saved counter. On context switch out, it tries to calculate a new saved counter.
Problems:
1. The saved counter was shared by all threads in a process, so all threads would initially be loaded with the same saved counter. That could result in sampling more often than once every X number of events.
2. The calculation to determine a new saved counter was backwards. It added when it should have subtracted, and subtracted when it should have added. Assume a single-threaded process with a reload count of 1000 events. Assuming the counter on context switch in was 100 and the counter on context switch out was 50 (meaning the thread has "consumed" 50 more events), the code would calculate a new saved counter of 150 (instead of the proper 50).
Fix:
1. As soon as the saved counter is used to initialize a monitor for a thread on context switch in, set the saved counter to the reload count. That way, subsequent threads to use the saved counter will get the full reload count, assuring we sample at least once every X number of events (across all threads).
2. Change the calculation of the saved counter. Due to the change to the saved counter in #1, we simply need to add (modulo the reload count) the remaining counter time we retrieve from the CPU when a thread is context switched out.
Differential Revision: https://reviews.freebsd.org/D4122 Approved by: gnn (mentor) MFC after: 1 month Sponsored by: Juniper Networks
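The two fixes described in this commit can be sketched in a few lines of C. This is only an illustrative model under stated assumptions, not the committed code: `struct sampling_pmc`, `csw_in()`, and `csw_out()` are hypothetical names, and the wrap is done with a single subtraction, which assumes the remaining count never exceeds the reload count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process state for one sampling PMC. */
struct sampling_pmc {
	uint64_t reload_count;	/* events between samples (X) */
	uint64_t saved;		/* saved counter shared by the threads */
};

/*
 * Context switch in: hand the saved counter to the incoming thread, then
 * reset it to the full reload count so the next thread does not start
 * from the same partial value (fix 1).
 */
uint64_t
csw_in(struct sampling_pmc *pm)
{
	uint64_t start = pm->saved;

	pm->saved = pm->reload_count;
	return (start);
}

/*
 * Context switch out: add the events still remaining on the hardware
 * counter, wrapping by the reload count (fix 2).
 */
void
csw_out(struct sampling_pmc *pm, uint64_t remaining)
{
	/* Assumption of this sketch: one subtraction is enough to wrap. */
	assert(remaining <= pm->reload_count);
	pm->saved += remaining;
	if (pm->saved > pm->reload_count)
		pm->saved -= pm->reload_count;
}
```

With a reload count of 1000 and a saved counter of 100, `csw_in()` returns 100 and leaves 1000 behind; a later `csw_out()` with 50 events remaining stores 50, which matches the "proper 50" in the example above.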
|
290813 |
14-Nov-2015 |
jtl |
Optimizations to the way hwpmc gathers user callchains
Changes to the code to gather user stacks:
* Delay setting pmc_cpumask until we actually have the stack.
* When recording user stack traces, only walk the portion of the ring that should have samples for us.
Sponsored by: Juniper Networks Approved by: gnn (mentor) MFC after: 1 month
|
290811 |
14-Nov-2015 |
jtl |
Fix hwpmc "stalled" behavior
Currently, there is a single pm_stalled flag that tracks whether a performance monitor was "stalled" due to insufficient ring buffer space for samples. However, because the same performance monitor can run on multiple processes or threads at the same time, a single pm_stalled flag that impacts them all seems insufficient.
In particular, you can hit corner cases where the code fails to stop performance monitors during a context switch out, because it thinks the performance monitor is already stopped. However, in reality, it may be that only the monitor running on a different CPU was stalled.
This patch attempts to fix that behavior by tracking on a per-CPU basis whether a PM desires to run and whether it is "stalled". This lets the code make better decisions about when to stop PMs and when to try to restart them. Ideally, we should avoid the case where the code fails to stop a PM during a context switch out.
Sponsored by: Juniper Networks Reviewed by: jhb Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D4124
|
289320 |
14-Oct-2015 |
bz |
Now that we can detect the Cortex-A8 properly, fix the event list according to the Cortex-A8 TRM r3p2 section 3.2.49. The A8 list differs from the "ARM-v7 common" list, given the A8 was an earlier model.
There is still more work to be done for other Cortex-Ax versions as andrew points out, but I am just trying to fix A8 for now for teaching.
MFC after: 2 weeks Sponsored by: DARPA/AFRL Obtained from: Cambridge/L41 Reviewed by: andrew Differential Revision: https://reviews.freebsd.org/D3876
|
287115 |
24-Aug-2015 |
bz |
When forking a child process with PMC_F_DESCENDANTS set in pmc_attach() in the parent, we will inherit the pmcids but cannot execute any operations on them in the child. The reason for this is that pmc_find_pmc() only tries to find the current process on the owners hash list, but given the child does not own the attachment, we cannot find it. Thus, in case the initial lookup fails, try to find the pmc_process state affiliated with the child process, look up the pmc from there using the row index, and get the owner process from that pmc. Then continue as normal and look up the pmc context of the owner (process).
This allows us to call, e.g., pmc_start() in the child process before we start the work there, but to collect the accumulated results later in the parent.
Sponsored by: DARPA,AFRL Obtained from: L41 Tested by: rwatson, L41 MFC after: 4 weeks Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D2052
|
284218 |
10-Jun-2015 |
br |
o Rework ARMv7 events list using aliases - same way as we have for arm64.
o Extend it with Cortex A9-specific events.
|
283924 |
02-Jun-2015 |
vangyzen |
Provide vnode in memory map info for files on tmpfs
When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior.
This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).
Submitted by: Eric Badger <eric@badgerio.us> (initial revision) Obtained from: Dell Inc. PR: 198431 MFC after: 2 weeks Reviewed by: jhb Approved by: kib (mentor)
|
283123 |
19-May-2015 |
jhb |
Fix two bugs that could result in PMC sampling effectively stopping. In both cases, the effect of the bug was that a very small positive number was written to the counter. This means that a large number of events needed to occur before the next sampling interrupt would trigger. Even with very frequently occurring events like clock cycles, wrapping all the way around could take a long time. Both bugs occurred when updating the saved reload count for an outgoing thread on a context switch.
First, the counter-independent code compares the current reload count against the count set when the thread switched in and generates a delta to apply to the saved count. If this delta causes the reload counter to go negative, it would add a full reload interval to wrap it around to a positive value. The fix is to also add the full reload interval if the resulting counter is zero.
Second, occasionally the raw counter value read during a context switch has actually wrapped, but an interrupt has not yet triggered. In this case the existing logic would return a very large reload count (e.g. 2^48 - 2 if the counter had overflowed by a count of 2). This was seen both for fixed-function and programmable counters on an E5-2643. Work around this case by returning a reload count of zero.
PR: 198149 Differential Revision: https://reviews.freebsd.org/D2557 Reviewed by: emaste MFC after: 1 week Sponsored by: Norse Corp, Inc.
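The second bug can be illustrated with a small C model. This is a sketch under stated assumptions rather than the driver's code: the 48-bit width is taken from the example in the message, the function names are hypothetical, and it assumes a sampling counter is programmed with (2^width - reload) and counts upward, so its top bit stays set until it overflows.

```c
#include <assert.h>
#include <stdint.h>

#define COUNTER_WIDTH	48	/* assumed counter width in bits */

/* Value written to the counter so that it overflows after 'reload' events. */
uint64_t
reload_count_to_counter_value(uint64_t reload)
{
	/* Sketch assumption: reload fits in the lower half of the range. */
	assert(reload > 0 && reload < (1ULL << (COUNTER_WIDTH - 1)));
	return ((1ULL << COUNTER_WIDTH) - reload);
}

/* Events left until overflow, or zero if the counter has already wrapped. */
uint64_t
counter_value_to_reload_count(uint64_t v)
{
	v &= (1ULL << COUNTER_WIDTH) - 1;
	/* Top bit clear: the counter wrapped but no interrupt has fired yet. */
	if ((v & (1ULL << (COUNTER_WIDTH - 1))) == 0)
		return (0);
	return ((1ULL << COUNTER_WIDTH) - v);
}
```

A raw value of 2 (a counter that overflowed by two events) yields 0 here, where the naive subtraction would yield 2^48 - 2.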
|
283121 |
19-May-2015 |
jhb |
Use the proper mask when reloading sampling PMCs for Core CPUs.
Differential Revision: https://reviews.freebsd.org/D2492 Reviewed by: emaste MFC after: 1 month
|
283120 |
19-May-2015 |
jhb |
Use fixed enum values for PMC_CLASSES().
This removes one of the frequent causes of ABI breakage when new CPU types are added to hwpmc(4).
Differential Revision: https://reviews.freebsd.org/D2586 Reviewed by: davide, emaste, gnn (earlier version) MFC after: 2 weeks
|
283112 |
19-May-2015 |
br |
Add Performance Monitoring Counters support for AArch64. Family-common and CPU-specific counters implemented.
Supported CPUs: ARM Cortex A53/57/72.
Reviewed by: andrew, bz, emaste, gnn, jhb Sponsored by: ARM Limited Differential Revision: https://reviews.freebsd.org/D2555
|
282676 |
09-May-2015 |
bz |
Convert remaining hwpmc(4) debug printfs over to KTR to unbreak the build for at least powerpc kernels. Missed in r282658.
MFC after: 10 days
|
282658 |
08-May-2015 |
jhb |
Convert hwpmc(4) debug printfs over to KTR.
Differential Revision: https://reviews.freebsd.org/D2487 Reviewed by: davide, emaste MFC after: 2 weeks Sponsored by: Norse Corp, Inc.
|
282641 |
08-May-2015 |
jhb |
Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of the broader DEBUG option.
Reviewed by: emaste MFC after: 2 weeks Sponsored by: Norse Corp, Inc.
|
281713 |
18-Apr-2015 |
jhibbits |
Implement hwpmc(4) for Freescale e500 core.
This supports e500v1, e500v2, and e500mc. Tested only on e500v2, but the performance counters are identical across all, with e500mc having some additional events.
Relnotes: Yes
|
281102 |
05-Apr-2015 |
rpaulo |
hwpmc: add initial Intel Broadwell support.
The full list of aliases and events will follow in a subsequent commit.
MFC after: 1 month
|
281101 |
05-Apr-2015 |
rpaulo |
Remove whitespace.
|
281098 |
05-Apr-2015 |
adrian |
Add support for the MIPS74K SoC family performance counters events.
These are similar to the mips24k performance counters - some are available on perfcnt0/3, some are available on perfcnt1/4. However, the events aren't all the same.
* Add the events, named the same as from Linux oprofile.
* Verify they're the same as "MIPS32(R) 74K(TM) Processor Core Family Software User's Manual"; Document Number: MD00519; Revision 01.05.
* Rename INSTRUCTIONS to something else, so it doesn't clash with the alias INSTRUCTIONS. I'll try to tidy this up later; there are a few other aliases to add and shuffle around.
Tested:
* QCA9558 SoC (AP135 board) - MIPS74Kc core (no FPU.)
* make universe; where it didn't fail for other reasons.
TODO:
* It'd be nice to support the four performance counters in at least this hardware, rather than just two.
Reviewed by: bsdimp ("looks good; don't break world".)
|
280790 |
28-Mar-2015 |
bz |
Remove all the handcrafted assembly in hwpmc_armv7.c and use the common (autogenerated) versions. Removes extra vertical space, and makes it easier to grep for usage throughout the tree. Conditionally compile only for arm6 [1] (yes sounds odd but is right).
Submitted by: andrew [1] Reviewed by: gnn, andrew (ian earlier version I think) Differential Revision: https://reviews.freebsd.org/D2159 Obtained from: Cambridge/L41 Sponsored by: DARPA, AFRL
|
280737 |
27-Mar-2015 |
bz |
Rather than defining our own magic checks here use INKERNEL() for the PMC_IN_KERNEL() macro definition.
Add missing macros to extract the return address (LR) from the trapframe.
Discussed with: andrew Obtained from: Cambridge/L41 Sponsored by: DARPA, AFRL MFC after: 2 weeks
|
279939 |
12-Mar-2015 |
rstone |
hwpmc: Fix event number to match enum name
Differential revision: https://reviews.freebsd.org/D1592 Reviewed by: Joseph Kong MFC after: 1 month
|
279894 |
11-Mar-2015 |
rrs |
You need to have the capabilities and not skip it if you are not on head.. otherwise the file pointer will be NULL and when you try to do something with it you will crash. Make the #else be the old capabilities, and then remove the erroneous ifdefs for 11.
MFC after: 1 week (with the other MFC I was going to do until the panic)
|
279836 |
10-Mar-2015 |
rstone |
Add missing counter definitions
Differential Revision: https://reviews.freebsd.org/D1591 MFC after: 1 month Sponsored by: Sandvine Inc
|
279835 |
10-Mar-2015 |
rstone |
Fix Ivy Bridge+ MEM_UOPS_RETIRED counters
The MEM_UOPS_RETIRED counters actually work the same way as the Sandy Bridge counters, but the counters were documented in a different way and that seemed to cause the Ivy Bridge counters to be implemented incorrectly. Use the same counter definitions as Sandy Bridge. While I'm here, rename the counters to match what's documented in the datasheet.
Differential Revision: https://reviews.freebsd.org/D1590 MFC after: 1 month Sponsored by: Sandvine Inc.
|
279834 |
10-Mar-2015 |
rstone |
Support architectural events on Haswell/Ivy Bridge
Differential Revision: https://reviews.freebsd.org/D1589 MFC after: 1 month Sponsored by: Sandvine Inc
|
279832 |
10-Mar-2015 |
rstone |
Fix Sandy Bridge+ hwpmc branch counters
On Sandy Bridge and later, to count branch-related events you have to or together a mask indicating the type of branch instruction to count (e.g. direct jump, branch, etc) and bits indicating whether to count taken and not-taken branches. The current counter definitions were defining these bits individually, so the counters never worked and always just counted 0.
Fix the counter definitions to instead contain the proper combination of masks. Also update the man pages to reflect the new counters.
Differential Revision: https://reviews.freebsd.org/D1587 MFC after: 1 month Sponsored by: Sandvine Inc.
|
279831 |
10-Mar-2015 |
rstone |
Fix pmc unit restrictions to match documentation
A couple of pmc counters did not work because they were being restricted to the wrong PMC unit. I've verified that these counters now work and match the documented restrictions.
Differential Revision: https://reviews.freebsd.org/D1586 MFC after: 1 month Sponsored by: Sandvine Inc
|
279830 |
10-Mar-2015 |
rstone |
Fix various bugs in Haswell counter definitions
1) The "WALK_COMPLETED_2M_4M" event incorrectly referenced 4K pages.
2) The umask for RING0 and RING123 events was reversed.
Differential Revision: https://reviews.freebsd.org/D1585 MFC after: 1 month Sponsored by: Sandvine Inc
|
278577 |
11-Feb-2015 |
andrew |
The cpu_id macro was renamed in r278529, catch up with this new name.
|
277835 |
28-Jan-2015 |
br |
Add ARMv7 performance monitoring counters.
Differential Revision: https://reviews.freebsd.org/D1687 Reviewed by: rpaulo Sponsored by: DARPA, AFRL
|
277524 |
22-Jan-2015 |
rstone |
style(9) cleanup
|
277177 |
14-Jan-2015 |
rrs |
Update the hwpmc driver to have the new type HASWELL_XEON. Also go back through HASWELL, IVY_BRIDGE, IVY_BRIDGE_XEON and SANDY_BRIDGE to straighten out all the missing PMCs. We also add a new pmc tool pmcstudy, this allows one to run the various formulas from the documents "Using Intel Vtune Amplifier XE on XXX Generation platforms" for IB/SB and Haswell. The tool also allows one to postulate your own formulas with any of the various PMC's. At some point I will enhance this to work with Brendan Gregg's flame-graphs so we can flamegraph various PMC interactions. Note the manual page also needs some work (lots of work) but gnn has committed to help me with that ;-)
Reviewed by: gnn MFC after: 1 month Sponsored by: Netflix Inc.
|
275190 |
27-Nov-2014 |
jhibbits |
Fix hwpmc sampling for ppc970 (G5-class) processors.
With this, hwpmc sampling now works on these processors.
MFC after: 3 weeks Relnotes: yes
|
275171 |
27-Nov-2014 |
jhibbits |
Fix hwpmc sampling for MPC74xxx (G4) processors.
With this, hwpmc sampling now works correctly on these processors.
MFC after: 3 weeks Relnotes: yes
|
274766 |
20-Nov-2014 |
emaste |
Clamp too-large hwpmc callchaindepth to the maximum
If the depth requested by the user is too large, it's better to provide the maximum than the smaller default.
Sponsored by: The FreeBSD Foundation
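The clamping behavior described above amounts to the following. This is a minimal sketch, not the committed change: the two constants and `pick_callchain_depth()` are hypothetical, and falling back to the default for non-positive requests is an assumption of the sketch, not something the message states.

```c
#include <assert.h>

#define CALLCHAIN_DEPTH_DEFAULT	8	/* hypothetical default depth */
#define CALLCHAIN_DEPTH_MAX	32	/* hypothetical hard limit */

/* Choose the callchain depth to use for a user-requested value. */
int
pick_callchain_depth(int requested)
{
	assert(CALLCHAIN_DEPTH_DEFAULT <= CALLCHAIN_DEPTH_MAX);
	if (requested <= 0)
		return (CALLCHAIN_DEPTH_DEFAULT);	/* sketch assumption */
	if (requested > CALLCHAIN_DEPTH_MAX)
		return (CALLCHAIN_DEPTH_MAX);	/* clamp, not the default */
	return (requested);
}
```

A user who asks for more frames than the limit allows gets the largest depth available instead of silently dropping back to the smaller default.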
|
273953 |
01-Nov-2014 |
mjg |
Fix up module unload for syscall_module_handler consumers.
After r273707 it was registering syscalls as static.
This fixes hwpmc module unload.
Reported by: markj
|
273236 |
17-Oct-2014 |
markj |
Use pmc_destroy_pmc_descriptor() to actually free the pmc, which is consistent with pmc_destroy_owner_descriptor(). Also be sure to destroy PMCs if a process exits or execs without explicitly releasing them.
Reviewed by: bz, gnn MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D958
|
272713 |
07-Oct-2014 |
bz |
Since introducing the extra mapping in r250103 for architectural performance events we have actually counted 'Branch Instruction Retired' when people asked for 'Unhalted core cycles' using the 'unhalted-core-cycles' event mask mnemonic.
Reviewed by: jimharris Discussed with: gnn, rwatson MFC after: 3 days Sponsored by: DARPA/AFRL
|
271602 |
14-Sep-2014 |
jhibbits |
Fix PowerPC backtraces. Since kernel and user have completely separate address spaces, rather than a split address, we actually can't check for being within the kernel's address range. Instead, do what other backtraces do, and use trapexit()/asttrapexit() as the stack sentinel.
MFC after: 3 weeks
|
268351 |
07-Jul-2014 |
marcel |
Remove ia64.
This includes:
o All directories named *ia64*
o All files named *ia64*
o All ia64-specific code guarded by __ia64__
o All ia64-specific makefile logic
o Mention of ia64 in comments and documentation
This excludes:
o Everything under contrib/
o Everything under crypto/
o sys/xen/interface
o sys/sys/elf_common.h
Discussed at: BSDcan
|
268207 |
03-Jul-2014 |
jhibbits |
Fix a bug in hwpmc(4) callchain retrieval, for both user and kernel.
The array index for the callchain is getting double-incremented -- both in the loop and the storing. It should only be incremented in one location.
Also, constrain the stack pointer range check.
MFC after: 2 weeks
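The double-increment bug is easy to reproduce in a small model. The sketch below is illustrative only: `struct frame` and `capture_callchain()` are hypothetical stand-ins for the real stack walk, and the buggy form appears only in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack frame: back pointer plus saved return address. */
struct frame {
	struct frame *next;
	uintptr_t ret;
};

/* Store up to maxsamples return addresses in cc[]; return the count. */
int
capture_callchain(const struct frame *fp, uintptr_t *cc, int maxsamples)
{
	int n = 0;

	assert(cc != NULL && maxsamples >= 0);
	/*
	 * Buggy form: "for (...; n++) cc[n++] = fp->ret;" advances n twice
	 * per frame and leaves every other slot unwritten.
	 */
	for (; fp != NULL && n < maxsamples; fp = fp->next)
		cc[n++] = fp->ret;	/* the only increment of n */
	return (n);
}
```

With the index advanced in exactly one place, the captured entries are contiguous and the returned count equals the number of frames stored.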
|
267992 |
28-Jun-2014 |
hselasky |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
267985 |
27-Jun-2014 |
gjb |
Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output, such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
truss: can not get etype: Cannot allocate memory
|
267961 |
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() is not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
267062 |
04-Jun-2014 |
kib |
For Xeon 7500 and 48XX (Nehalem EX and Westmere EX) variants of the Core i7 and Westmere processors, the uncore PMC subsystem is completely different from the uncore PMC on smaller versions of CPUs. Disable existing uncore hwpmc code for EX, otherwise non-existing MSRs are accessed.
The core PMCs seem to be identical for non-EX and EX, according to the SDM.
Reviewed by: davide, fabient Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
266983 |
02-Jun-2014 |
gnn |
Add missing Ivy Bridge and Haswell events.
Submitted by: Anton Rang <rang@mac.com> MFC: 2 weeks
|
266195 |
15-May-2014 |
markj |
Remove some prototypes for undefined functions.
MFC after: 3 days
|
264635 |
18-Apr-2014 |
jhibbits |
Enable and disable the PMC unit at load/unload time, respectively.
MFC after: 3 weeks
|
263446 |
20-Mar-2014 |
hiren |
Update hwpmc to support core events for Atom Silvermont microarchitecture. (Model 0x4D as per Intel document 330061-001 01/2014)
Tested by: Olivier Cochard-Labbe <olivier@cochatrd.me> MFC after: 4 weeks
|
263233 |
16-Mar-2014 |
rwatson |
Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h.
MFC after: 3 weeks
|
263112 |
13-Mar-2014 |
eadler |
Fix pointer type in call to malloc
Submitted by: Meyer, Conrad conrad.meyer@isilon.com
|
263111 |
13-Mar-2014 |
eadler |
Fix pointer type in call to malloc
Submitted by: Meyer, Conrad conrad.meyer@isilon.com
|
263080 |
12-Mar-2014 |
kib |
Use correct types for sizeof() in the calculations for the malloc(9) sizes [1]. While there, remove unneeded checks for failed allocations with M_WAITOK flag.
Submitted by: Conrad Meyer <cemeyer@uw.edu> [1] MFC after: 1 week
|
262547 |
27-Feb-2014 |
jhibbits |
Fix callchain capture for hwpmc(4). While here, some style(9) fixes, too.
MFC after: 2 weeks
|
261342 |
01-Feb-2014 |
jhibbits |
Add hwpmc(4) support for the PowerPC 970 class processors, direct events. This also fixes asserts on removal of the module for the mpc74xx.
The PowerPC 970 processors have two different types of events: direct events and indirect events. Thus far only direct events are supported. I included some documentation in the driver on how indirect events work, but support is for the future.
MFC after: 1 month
|
261173 |
25-Jan-2014 |
jhibbits |
MPC74xx should not fall through to the error case.
MFC after: 1 week
|
261087 |
23-Jan-2014 |
jhb |
Move <machine/apicvar.h> to <x86/apicvar.h>.
|
259665 |
20-Dec-2013 |
gnn |
Add another Haswell model (0x45) to the set of supported chips. Model 0x45 appears, for example, in late 2013 MacBook Pro models and is properly emulated by VMware.
|
259647 |
20-Dec-2013 |
attilio |
o Remove assertions on ipa_version, as version detection using cpuid can sometimes be quirky (this is the case for VMware without vPMC support); fail to probe hwpmc instead.
o Apply the fix for the Xeon family of processors as established by document 315338-020 (bug AJ85).
Sponsored by: EMC / Isilon storage division Reviewed by: fabient
|
259395 |
14-Dec-2013 |
jhibbits |
Add userland PMC backtracing, and use the PMC trapframe macros for kernel backtraces.
MFC after: 1 week
|
258780 |
30-Nov-2013 |
eadler |
Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this shifts into the sign bit. Instead use (1U << 31) which gets the expected result.
This fix is not ideal as it assumes a 32 bit int, but does fix the issue for most cases.
A similar change was made in OpenBSD.
Discussed with: -arch, rdivacky Reviewed by: cperciva
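A minimal sketch of the well-defined form, assuming a hypothetical helper named bit32() that is not part of the commit. Using UINT32_C(1) rather than 1U is a variation on the commit's fix that also avoids the assumption of a 32-bit int mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return a 32-bit mask with only bit n set.
 * (1 << 31) shifts a signed int into its sign bit, which is undefined.
 * UINT32_C(1) is unsigned and at least 32 bits wide, so the shift is
 * well defined for every n in 0..31.
 */
static inline uint32_t
bit32(unsigned int n)
{
	assert(n < 32);
	return ((uint32_t)(UINT32_C(1) << n));
}
```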
|
255746 |
20-Sep-2013 |
davide |
Remove local change leftover, this should never have been part of r255745.
Pointy-hat to: davide Approved by: re (implicit)
|
255745 |
20-Sep-2013 |
davide |
Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With current lock classes KPI it was really difficult because there was no way to pass an rmtracker object to the lock/unlock routines. In order to accomplish the task, modify the aforementioned functions so that they can return (or pass as argument) a uintptr_t, which is in the rm case used to hold a pointer to struct rm_priotracker for current thread. As an added bonus, this fixes rm_sleep() in the rm shared case, which right now can communicate priotracker structure between lc_unlock()/lc_lock().
Suggested by: jhb Reviewed by: jhb Approved by: re (delphij)
|
255228 |
05-Sep-2013 |
jhibbits |
Fix the build.
|
255219 |
05-Sep-2013 |
pjd |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
	uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements.
The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine few rights, but the rights have to belong to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
	__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belong to the same array element this way is correct, but is not advised. It should only be used for alias definitions.
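The sentinel-terminated macro pattern and the one-hot index check described above can be sketched in standalone C. Everything named toy_* or TOY_* is hypothetical. The bit positions (index selector in bits 57 to 61) are an assumption derived from the description of the "next five bits", and returning -1 stands in for the assertion the commit mentions.

```c
#include <stdarg.h>

#define	TOY_NELEM		2
/* One selector bit (57 + idx) names the array element; low bits are rights. */
#define	TOY_RIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define	TOY_READ		TOY_RIGHT(0, 0x0000000000000001ULL)
#define	TOY_WRITE		TOY_RIGHT(0, 0x0000000000000002ULL)
#define	TOY_PDKILL		TOY_RIGHT(1, 0x0000000000000800ULL)

struct toy_rights {
	unsigned long long tr_rights[TOY_NELEM];
};

/* Return the array index encoded in a right, or -1 unless exactly one selector bit is set. */
static int
toy_right_index(unsigned long long right)
{
	unsigned long long sel = (right >> 57) & 0x1fULL;
	int i;

	if (sel == 0 || (sel & (sel - 1)) != 0)
		return (-1);
	for (i = 0; (sel & 1) == 0; i++)
		sel >>= 1;
	return (i);
}

/* Walk the argument list until the 0ULL terminator appended by the macro. */
static int
toy_rights_vset(struct toy_rights *tr, ...)
{
	va_list ap;
	unsigned long long right;
	int error = 0, i;

	va_start(ap, tr);
	while ((right = va_arg(ap, unsigned long long)) != 0) {
		i = toy_right_index(right);
		if (i < 0 || i >= TOY_NELEM) {
			error = -1;	/* rights from different elements were OR-ed */
			break;
		}
		tr->tr_rights[i] |= right & ((1ULL << 57) - 1);
	}
	va_end(ap);
	return (error);
}

/* Callers never write the terminator themselves. */
#define	toy_rights_set(tr, ...)	toy_rights_vset((tr), __VA_ARGS__, 0ULL)
```

With this sketch, toy_rights_set(&tr, TOY_READ, TOY_PDKILL) updates both array elements, while toy_rights_set(&tr, TOY_READ | TOY_PDKILL) is rejected because two selector bits are set in one argument.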
This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
|
255199 |
04-Sep-2013 |
jhibbits |
Fix hwpmc(4) for 32-bit PowerPC.
|
255164 |
03-Sep-2013 |
jhibbits |
Refactor PowerPC hwpmc(4) driver into generic and specific. More refactoring will likely be done as more drivers are added, since AIM-compatible processors have similar PMC configuration logic.
|
255133 |
01-Sep-2013 |
davide |
Complete r250105. Do not zero fields if M_ZERO flag is specified to malloc(9).
Reported by: pluknet, glebius
|
255022 |
29-Aug-2013 |
adrian |
Remove the duplicate LLC_MISS event and put it in the right order.
|
254855 |
25-Aug-2013 |
adrian |
Update the mis-predicted branch PMC names (for sandy bridge) to not clash.
The SDM (June 2013) tables on these are rather confusing. Yes, they assign the same name (BR_MISP_RETIRED.ALL_BRANCHES) to two codes (C5H/00H and C5H/04H.) The latter however is the PEBS version.
So, to make it easier to see the difference - and yes, we can use both without having to actually enable the PEBS specific bits! - just rename the PEBS one to _PS so there's no clashing.
Tested:
* Sandy bridge
|
254850 |
25-Aug-2013 |
adrian |
Fix a >80 character long line, introduced in my previous commit.
Noticed by: hiren
|
254824 |
25-Aug-2013 |
adrian |
Update the MEM_UOP_RETIRED PMC operation for sandy bridge and sandy bridge Xeon.
Summary: These are PEBS events but they're also available as normal counter/sample events. The source table (Table 19-2) lists the base versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must be qualified with other values. This particular commit fleshes out those umask values.
Source:
* Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.
Tested:
* Sandy Bridge (non-Xeon)
|
254813 |
24-Aug-2013 |
markj |
Rename the kld_unload event handler to kld_unload_try, and add a new kld_unload event handler which gets invoked after a linker file has been successfully unloaded. The kld_unload and kld_load event handlers are now invoked with the shared linker lock held, while kld_unload_try is invoked with the lock exclusively held.
Convert hwpmc(4) to use these event handlers instead of having kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are loaded or unloaded. This has no functional effect, but simplifies the linker code somewhat.
Reviewed by: jhb
|
254616 |
21-Aug-2013 |
adrian |
Change the name of this particular event to reflect the name used in Linux and Intel examples.
Sourced:
* https://github.com/andikleen/pmu-tools/blob/master/snb-client.csv
* http://software.intel.com/en-us/comment/1747932#comment-1747932
Note:
* It's not currently in the Intel SDM; I need to chase down what's going on.
Tested:
* Sandy Bridge
|
254571 |
20-Aug-2013 |
bz |
Correct a typo in the event mask mnemonic.
Reviewed by: gnn MFC after: 3 days
|
254476 |
18-Aug-2013 |
adrian |
Add in missing events for Sandy Bridge Xeon.
* Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy bridge Xeon. Right now it only is enabled for Sandy Bridge.
* D2/0F is actually a combination rather than a separate counter, so just flip that on for the CPU types that support it.
There's an errata for using this on SB Xeon hardware - I've documented it in kern/181346.
Tested:
* Sandy Bridge
* Sandy Bridge Xeon
Sponsored by: Netflix, Inc.
|
251423 |
05-Jun-2013 |
alc |
Relax the vm object locking. Use a read lock.
Sponsored by: EMC / Isilon Storage Division
|
250182 |
02-May-2013 |
davide |
Suppress a GCC warning. This warning is actually bogus and newer GCC versions than the one in base (dim@ mentioned he tried on 4.7.3 and 4.8.1) do not whine about it, so, at some point this workaround will be reverted.
Reported by: ache Discussed with: dim
|
250105 |
30-Apr-2013 |
davide |
malloc(9) cannot return NULL if M_WAITOK flag is specified.
|
250103 |
30-Apr-2013 |
davide |
The Intel PMC architectural events have encodings which are identical to those of some non-architectural core events. This is not a problem in the general case as long as there's a 1:1 mapping between the two, but there are a few exceptions. For example, 3CH_01H on Nehalem/Westmere represents both unhalted-reference-cycles and CPU_CLK_UNHALTED.REF_P. CPU_CLK_UNHALTED.REF_P on the aforementioned architectures does not measure reference (i.e. bus) but TSC, so there's the need to disambiguate. In order to avoid the namespace collision, rename all the architectural events in a way that they cannot be ambiguous and refactor the architectural events handling function to reflect this change. While here, per Jim Harris's request, rename iap_architectural_event_is_unsupported() to iap_event_is_architectural().
Discussed with: jimharris Reviewed by: jimharris, gnn
|
250101 |
30-Apr-2013 |
davide |
Complete r250097: Do not change the initialization order in pmc_intel_initialize().
|
250097 |
30-Apr-2013 |
davide |
When the hwpmc(4) module is unloaded it reports a double leakage. This happens at least if FreeBSD is run under VirtualBox. In order to avoid the leakage, properly deallocate structures in case the CPU claims that hw performance monitoring counters are not supported.
Reported by: hiren
|
250096 |
30-Apr-2013 |
davide |
Fix up Westmere hwpmc(4) support: add missing CPU flag so that instruction-retired, llc-misses and llc-reference events can now be allocated.
Reviewed by: jimharris, gnn
|
249460 |
14-Apr-2013 |
hiren |
Improve/correct a comment. We now support a lot more cpu types.
PR: kern/177496 Approved by: sbruno (mentor)
|
249428 |
12-Apr-2013 |
rstone |
Cosmetic change: make a comment reference Sandy Bridge *Xeon*
Reviewed by: sbruno MFC after: 1 week
|
249069 |
03-Apr-2013 |
sbruno |
Trailing whitespace cleanup along with 80 column enforcement.
Submitted by: hiren.panchasara@gmail.com Reviewed by: sbruno@freebsd.org Obtained from: Yahoo! Inc. MFC after: 2 weeks
|
248842 |
28-Mar-2013 |
sbruno |
Update hwpmc to support Haswell class processors. 0x3C: /* Per Intel document 325462-045US 01/2013. */
Add manpage to document all the goodness that is available in this processor model.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com> Reviewed by: jimharris, sbruno Obtained from: Yahoo! Inc. MFC after: 2 weeks
|
248084 |
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical, but a few notes are reported:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
  - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
247836 |
05-Mar-2013 |
fabient |
Add a generic way to call per event allocate / release function.
Reviewed by: mav MFC after: 1 month
|
247329 |
26-Feb-2013 |
mav |
Add support for good old 8192Hz profiling clock to software PMC.
Reviewed by: fabient
|
247318 |
26-Feb-2013 |
mav |
Change the way how software PMC updates counters. This at least fixes -n option of pmcstat.
Reviewed by: fabient
|
246166 |
31-Jan-2013 |
sbruno |
Update hwpmc to support the Xeon class of Ivybridge processors. case 0x3E: /* Per Intel document 325462-045US 01/2013. */
Add manpage to document all the goodness that is available in this processor model.
No support for uncore events at this time.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com> Reviewed by: davide, jimharris, sbruno Obtained from: Yahoo! Inc. MFC after: 2 weeks
|
245339 |
12-Jan-2013 |
sbruno |
Quiesce a couple of clang warnings
Submitted by: hiren panchasara <hiren.panchasara@gmail.com> Obtained from: Yahoo! Inc
|
242361 |
30-Oct-2012 |
attilio |
Fixup r240246: hwpmc needs to retain the pinning until ASTs are executed. This means past the point where userret() is generally executed.
Skip the td_pinned check if a callchain tracing is currently happening and add a more robust check to pmc_capture_user_callchain() in order to catch td_pinned leak past ast() in hwpmc case.
Reported and tested by: fabient MFC after: 1 week X-MFC: r240246
|
241974 |
24-Oct-2012 |
sbruno |
Cleanup and rename some variables in libpmc and hwpmc.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com> Reviewed by: jimharris@ sbruno@ Obtained from: Yahoo! Inc. MFC after: 2 weeks
|
241896 |
22-Oct-2012 |
kib |
Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes.
Conducted and reviewed by: attilio Tested by: pho
|
241738 |
19-Oct-2012 |
sbruno |
Update hwpmc to support the Xeon class of Sandybridge processors. (Model 0x2D /* Per Intel document 253669-044US 08/2012. */)
Add manpage to document all the goodness that is available in this processor model.
No support for uncore events at this time.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com> Reviewed by: jimharris@ fabient@ Obtained from: Yahoo! Inc. MFC after: 2 weeks
|
240650 |
18-Sep-2012 |
avg |
hwpmc amd_pcpu_fini: fix a bug in code locked under DEBUG
MFC after: 16 days
|
240475 |
13-Sep-2012 |
attilio |
Remove all the checks on curthread != NULL with the exception of some MD trap checks (eg. printtrap()).
Generally this check is not needed anymore, as there is no legitimate case where curthread == NULL after the pcpu 0 area has been properly initialized.
Reviewed by: bde, jhb MFC after: 1 week
|
240203 |
07-Sep-2012 |
fabient |
Complete and merge the list between Sandy/Ivy bridge of events that can run on specific PMC.
MFC after: 1 month
|
240164 |
06-Sep-2012 |
fabient |
Add Intel Ivy Bridge support to hwpmc(9). Update offcore RSP token for Sandy Bridge. Note: No uncore support.
Works on Family 6 Model 3a.
MFC after: 1 month Tested by: bapt, grehan
|
237196 |
17-Jun-2012 |
davide |
Disable hwpmc(4) support for Intel Xeon Sandy Bridge (Model 0x2D). Due to some differences in MSRs between Xeon Sandy Bridge and Core Sandy Bridge (Model 0x2A), wrmsr() may generate a #GP fault exception and thus a panic of the machine.
Approved by: gnn (mentor) MFC after: 3 days
|
236997 |
13-Jun-2012 |
fabient |
Add ARM callchain support for hwpmc.
Sponsored by: NETASQ MFC after: 3 days
|
235831 |
23-May-2012 |
fabient |
Soft PMC support for ARM. Callgraph is not captured, only current location.
Sample system wide profiling: "pmcstat -Sclock.hard -T"
|
235229 |
10-May-2012 |
fabient |
Remove out-of-date KASSERT that fires with soft PMC.
MFC after: 1 week
|
234930 |
02-May-2012 |
gnn |
Fix so that ,usr and ,os work correctly with fixed function (IAF) counters.
MFC after: 1 week
|
234598 |
23-Apr-2012 |
fabient |
Fix class malloc init for mips and powerpc that was not converted by r233628.
Found by: monthadar, adrian MFC after: 1 week
|
233628 |
28-Mar-2012 |
fabient |
Add software PMC support.
New kernel events can be added at various locations for sampling or counting. This will, for example, allow easy system profiling on any processor with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example looking at the lock acquire failure, page fault while sampling on instructions.
Sponsored by: NETASQ MFC after: 1 month
|
233569 |
27-Mar-2012 |
gonzo |
Fix crash on VirtualBox (and probably on some real hardware):
- Do not cover the error returned by pmc_core_initialize with the result of pmc_uncore_initialize; fail right away.
- Give the user something to report instead of failing silently.
Reported by: Alexandr Kovalenko <never@nevermind.kiev.ua>
|
233544 |
27-Mar-2012 |
fabient |
Fix random deadlock on pmcstat exit:
- Exit the thread when soft shutdown is requested
- Wakeup owner thread.
Reproduced/tested by looping pmcstat measurement: pmcstat -S instructions -O/tmp/test ls
MFC after: 1 week
|
233334 |
23-Mar-2012 |
gonzo |
Add Octeon PMC hardware backend
|
233333 |
23-Mar-2012 |
gonzo |
Add list of Octeon's PMC counters obtained from cvmx-core.h
|
233319 |
22-Mar-2012 |
gonzo |
Rework MIPS PMC code:
- Replace MIPS24K-specific code with more generic framework that will make adding new CPU support easier
- Add MIPS24K support for new framework
- Limit backtrace depth to 1 for stability reasons and add option HWPMC_MIPS_BACKTRACE to override this limitation
|
232992 |
14-Mar-2012 |
gonzo |
- Remove unnecessary type casts
- Make the kernel backtrace routine more robust by refusing to backtrace further when it encounters a function that possibly modifies the SP value
|
232869 |
12-Mar-2012 |
adrian |
This header file no longer exists when doing cross builds, so remove it.
mips24k hwpmc now compiles again.
|
232846 |
12-Mar-2012 |
gonzo |
Implement pmc_save_user_callchain and pmc_save_kernel_callchain for MIPS
|
232612 |
06-Mar-2012 |
gnn |
Properly mask off bits that are not supported in the IAP counters. This fixes a bug where users would see massively large counts, near to 2**64 -1, due to the bits not being cleared.
MFC after: 3 weeks
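The effect of the masking can be illustrated with a small sketch. The helper name counter_value() and the 48-bit width used in the example are assumptions for illustration only; the commit does not state the counter width or the exact mask it applies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: keep only the low `width` bits of a raw counter
 * read. Bits above the implemented width are not part of the count; if
 * they are left set, the reported value lands close to 2**64 - 1.
 */
static uint64_t
counter_value(uint64_t raw, unsigned int width)
{
	assert(width >= 1 && width <= 64);
	if (width == 64)
		return (raw);	/* avoid an undefined 64-bit shift */
	return (raw & ((UINT64_C(1) << width) - 1));
}
```

For instance, with an assumed 48-bit counter, a raw read of 0xFFFF000000000123 reduces to 0x123 once the unsupported upper bits are cleared.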
|
232366 |
01-Mar-2012 |
davide |
- Add support for the Intel Sandy Bridge microarchitecture (both core and uncore counting events)
- New manpages with event lists.
- Add MSRs for the Intel Sandy Bridge microarchitecture
Reviewed by: attilio, brueffer, fabient Approved by: gnn (mentor) MFC after: 3 weeks
|
230636 |
28-Jan-2012 |
emaste |
pmc_*_initialize may return NULL if the CPU is not supported, so check that md is not null before dereferencing it.
PR: kern/156540
|
230063 |
13-Jan-2012 |
gnn |
Clean up a switch statement for uncore events on Westmere processors.
Submitted by: Davide Italiano Reviewed by: gnn MFC after: 1 week
|
229470 |
04-Jan-2012 |
fabient |
Update PMC events from October 2011 Intel documentation.
Submitted by: Davide Italiano <davide.italiano@gmail.com> MFC after: 3 days
|
229469 |
04-Jan-2012 |
fabient |
Add missing MSR programming for some events.
Submitted by: Davide Italiano <davide.italiano@gmail.com> MFC after: 3 days
|
229076 |
31-Dec-2011 |
dim |
In sys/dev/hwpmc/hwpmc_amd.c, fix a clang warning about invalid enum conversions.
Reviewed by: jkoshy MFC after: 1 week
|
228874 |
25-Dec-2011 |
bz |
Quiet the tinderbox for the holidays. Remove the assert[1].
Suggested by: jhibbits [1] MFC after: 3 days
|
228869 |
24-Dec-2011 |
jhibbits |
Implement hwpmc counting PMC support for PowerPC G4+ (MPC745x/MPC744x). Sampling is in progress.
Approved by: nwhitehorn (mentor) MFC after: 9.0-RELEASE
|
228787 |
21-Dec-2011 |
eadler |
- Remove extra space
Submitted by: Davide Italiano <davide.italiano@gmail.com> Approved by: brucec
|
228438 |
12-Dec-2011 |
fabient |
There's a small set of events on Nehalem, that are not supported in processors with CPUID signature 06_1AH, 06_1EH, and 06_1FH.
Refuse to allocate them on unsupported model.
Submitted by: Davide Italiano <davide.italiano@gmail.com> MFC after: 1 month
|
228198 |
02-Dec-2011 |
fabient |
Update Westmere uncore event exception list.
Submitted by: Davide Italiano <davide italiano at gmail com> MFC after: 1 week
|
227395 |
09-Nov-2011 |
adrian |
Flip on processing interrupt profile events for mips24k.
This is a bit hackish and should be made more generic (ie, support more than two hard-coded performance counter+config register pairs) so it can be used for mips74k and other chips.
All this does is process the initial interrupt event. It doesn't (yet) handle callgraph events, so even if you route the exception/interrupt to this routine and flip the bit on, it will hang and crash pmc unless you disable callgraph support when you enable a sample based PMC.
|
226514 |
18-Oct-2011 |
fabient |
Add a flush of the current PMC log buffer before displaying the next top.
As the underlying block is 4KB, if the PMC throughput is low the measurement will be reported on the next tick. pmcstat(8) uses the modified flush API to reclaim the current buffer before displaying the next top.
MFC after: 1 month
|
226091 |
07-Oct-2011 |
adrian |
Begin implementing correct MIPS24K sampling mode behaviour.
* Add the interrupt bit in the configuration register
* Correctly set the counter register for the sampling overflow interrupt. The interrupt is asserted when bit 31 is set. So set the overflow value at 0x80000000 and subtract the programmed value as appropriate.
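The overflow arithmetic described here can be sketched as follows. The helper name sampling_reload() is hypothetical and this is not the driver's code; it only shows the subtraction from 0x80000000.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the value to load into a 32-bit counter
 * so that bit 31 becomes set after sample_count further events.
 */
static uint32_t
sampling_reload(uint32_t sample_count)
{
	assert(sample_count >= 1 && sample_count <= UINT32_C(0x80000000));
	return (UINT32_C(0x80000000) - sample_count);
}
```

With a sampling period of 1000 events the counter starts at 0x7FFFFC18, and the 1000th increment brings it to 0x80000000, which sets bit 31.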
|
225617 |
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
224778 |
11-Aug-2011 |
rwatson |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
222813 |
07-Jun-2011 |
attilio |
Retire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are capped to 32. MAXCPU can now be arbitrarily bumped to whatever value. Anyway, as long as several structures in the kernel are statically allocated and sized as MAXCPU, it is suggested to keep it as low as possible for the time being.
Technical notes on this commit itself:
- More functions to handle cpuset_t objects are introduced. The most notable are cpusetobj_ffs() (which calculates a ffs(3) for a cpuset_t object), cpusetobj_strprint() (which prepares a string representing a cpuset_t object) and cpusetobj_strscan() (which creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are targeted to be removed soon. With the move from cpumask_t to cpuset_t they are now inefficient and not really useful. Anyway, for the time being, please note that access to pcpu data is protected by sched_pin() in order to avoid migrating the CPU while reading more than one (possible) word.
- Please note that the size of cpuset_t objects may differ between kernel and userland. While this is not directly related to the patch itself, it is good to understand that concept and possibly use the patch as a reference on how to deal with cpuset_t objects in userland, when accessing kernel-land members.
- KTR_CPUMASK is changed and is now represented through a string, to be set as in the example reported in NOTES.
Please additionally note that MAXCPU is not bumped in this patch, but private testing has been done up to MAXCPU=128 on a real 8x8x2(htt) machine (amd64).
Please note that the FreeBSD version is not yet bumped because of the upcoming pcpu changes. However, note that this patch is not targeted for MFC.
People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested several revisions of the patches and really helped in improving stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted review on some specific part of the patch.
- marius, art, nwhitehorn and andreast reviewed MD specific part of the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific implementations of the patch.
- Other people have made contributions on other patches that have been already committed and have been listed separately.
Companies that should be mentioned for having participated at several degrees:
- Yahoo! for having offered the machines used for testing on big count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance, which has been instrumental.
- Sandvine for having offered offices and infrastructure during development.
(I really hope I didn't forget anyone, if it happened I apologize in advance).
|
222002 |
16-May-2011 |
attilio |
Merge r221279,221280 from largeSMP project: pmc_mask doesn't need to use memory barriers.
Reviewed by: fabient Tested by: several MFC after: 1 week
|
213409 |
04-Oct-2010 |
gnn |
Fix two aliases that had the same name but were pointing to different events. These are now disambiguated.
MFC after: 1 week
|
212224 |
05-Sep-2010 |
fabient |
Fix invalid class removal when IAF is not the last class. Keep IAF class with 0 PMC and change the alias in libpmc to IAP.
MFC after: 1 week
|
210621 |
29-Jul-2010 |
gnn |
Make sure that we clear the correct bits when we turn off a PMC. It was possible that we could have turned a bit on but never cleared it.
Extend the calls to rdmsr() to all necessary functions, not just those which previously caused a panic.
Pointed out by: jhb@ MFC after: 1 week
|
210012 |
13-Jul-2010 |
gnn |
Fix a panic brought about by writing an MSR without a proper mask. All of the necessary wrmsr calls are now preceded by a rdmsr and we leave the reserved bits alone. Document the bits in the relevant registers for future reference.
Tested by: mdf MFC after: 1 week
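The read-modify-write pattern described in this entry can be sketched as a pure function. The name msr_merge() and its parameters are hypothetical; the actual masks and registers are those documented in the driver, not shown here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: merge new field values into the current contents
 * of a register while leaving every bit outside `writable` (for example
 * reserved bits) exactly as it was read.
 */
static uint64_t
msr_merge(uint64_t current, uint64_t writable, uint64_t wanted)
{
	return ((current & ~writable) | (wanted & writable));
}
```

In kernel code the value passed as `current` would come from rdmsr() and the result would be handed to wrmsr(), so reserved bits are written back unchanged rather than overwritten.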
|
208861 |
05-Jun-2010 |
fabient |
Convert pm_runcount to int to correctly check for negative value. Remove unnecessary check for error.
Found with: Coverity Prevent(tm) MFC after: 1 month
|
207484 |
01-May-2010 |
rstone |
When configuring a system-wide counting PMC, hwpmc was incorrectly logging process mappings for that PMC. Nothing ever reads pmc logs out of a counting PMC, so the log buffers were leaked when the PMC was deconfigured. The process mappings are only useful for sampling PMCs anyway, so only log the mappings if the PMC is a sampling PMC.
This bug would cause allocating sample-mode PMCs to fail with ENOMEM after allocating several counting-mode PMCs.
Approved by: jkoshy (mentor) MFC after: 2 weeks
|
206684 |
15-Apr-2010 |
fabient |
- Fix a typo: OFFCORE_REQUESTS.ANY.RFO is B0H10H and not 80H10H.
- Enable missing PARTIAL_ADDRESS_ALIAS for Core i7.
MFC after: 3 days
|
206089 |
02-Apr-2010 |
fabient |
- Support for uncore counting events: one fixed PMC with the uncore domain clock, 8 programmable PMC.
- Westmere based CPU (Xeon 5600, Corei7 980X) support.
- New man pages with events list for core and uncore.
- Updated Corei7 events with Intel 253669-033US December 2009 doc. Some events were removed from the documentation; they have been kept in the code but documented in the man page as obsolete.
- Offcore response events can be setup with rsp token.
Sponsored by: NETASQ
|
205998 |
31-Mar-2010 |
fabient |
If there are multiple PMCs for the same interrupt, ignore the new post. This will indirectly fix a bug where the thread will be pinned forever if the assert is not compiled.
MFC after: 3 days
|
205694 |
26-Mar-2010 |
fabient |
Handling SIGPIPE will cause deadlock/crash. Return an error immediately in case of hard shutdown.
MFC after: 3 days
|
204878 |
08-Mar-2010 |
fabient |
Change the way shutdown is handled for log file.
pmc_flush_logfile is now non-blocking and just asks the kernel to shut down the file. From that point, no more data is accepted by the log thread and when the last buffer is flushed the file is closed.
This will remove a deadlock between pmcstat asking for flush while it cannot flush the pipe itself.
MFC after: 3 days
|
204635 |
03-Mar-2010 |
gnn |
Add support for hwpmc(4) on the MIPS 24K, 32 bit, embedded processor.
Add macros for properly accessing coprocessor 0 registers that support performance counters.
Reviewed by: jkoshy rpaulo fabien imp MFC after: 1 month
|
201151 |
29-Dec-2009 |
jkoshy |
Use VFS_{LOCK,UNLOCK}_GIANT() around the call to vrele().
Reviewed by: kib
|
201023 |
26-Dec-2009 |
jkoshy |
* Support the L1D_CACHE_LD event on Core2 processors.
* Correct a group of typos: for Core2 programmable events, check user supplied umask values against the correct event descriptor field.
Submitted by: Ryan Stone <rysto32 at gmail dot com>
|
201021 |
26-Dec-2009 |
jkoshy |
Log process mappings for existing processes at PMC start time.
Submitted by: Marc Unangst <mju at panasas dot com> [original patch]
Tested by: fabient
|
200928 |
23-Dec-2009 |
rpaulo |
Intel XScale hwpmc(4) support.
This brings hwpmc(4) support for 2nd and 3rd generation XScale cores. Right now it's enabled by default to make sure we test this a bit. When the time comes it can be disabled by default. Tested on Gateworks boards.
A man page is coming.
Obtained from: //depot/user/rpaulo/xscalepmc/...
|
200669 |
18-Dec-2009 |
jkoshy |
Recognize Intel CPUs with Family 0x6, Models 0x1E and 0x1F.
Submitted by: Marc Unangst <mju at panasas dot com>
|
200060 |
03-Dec-2009 |
jkoshy |
Use a better check for a valid kernel stack address when capturing kernel call chains.
Submitted by: Marc Unangst <mju at panasas.com>
Tested by: fabient
|
200001 |
01-Dec-2009 |
emaste |
Fix parenthesis typo -- copy full frame pointer for userland callchain, not just one byte.
Submitted by: Ryan Stone rysto32 at gmail dot com
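A misplaced parenthesis of this kind is easy to overlook. The following is a minimal sketch of one plausible form of such a bug, not the actual hwpmc(4) code: fake_copyin() is a hypothetical userland stand-in for copyin(9), and both function names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for copyin(9); returns 0 on success. */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	memcpy(kaddr, uaddr, len);
	return (0);
}

/* Buggy form: the comparison sits inside the call, so len evaluates to 1. */
static uintptr_t
read_fp_buggy(const uintptr_t *user_fp)
{
	uintptr_t fp = 0;

	if (fake_copyin(user_fp, &fp, sizeof(fp) != 0))
		return (0);
	return (fp);
}

/* Fixed form: copy the full frame pointer, then check the return value. */
static uintptr_t
read_fp_fixed(const uintptr_t *user_fp)
{
	uintptr_t fp = 0;

	if (fake_copyin(user_fp, &fp, sizeof(fp)) != 0)
		return (0);
	return (fp);
}
```

In the buggy form only one byte of the saved frame pointer reaches the destination, so the next step of the callchain walk would start from a wrong address.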
|
199972 |
30-Nov-2009 |
emaste |
Use switch out (SWO) instead of switch in (SWI) debug log mask in csw_out.
|
199763 |
24-Nov-2009 |
fabient |
- Fix a LOR between the process lock and the pmc thread mutex.
- Fix a system deadlock on process exit when the sample buffer is full (pmclog_loop blocked in fo_write) and pmcstat exits.
Reviewed by: jkoshy
MFC after: 3 weeks
|
198432 |
24-Oct-2009 |
jkoshy |
Only claim that the PMC_CLASS_IAF PMCs are supported by a CPU if there are PMCs on the CPU that belong to the class.
Review and testing by: fabient
|
198343 |
21-Oct-2009 |
fabient |
Handle the case where there is only one PMC in the system.
Approved by: jkoshy (mentor)
MFC after: 3 days
|
198204 |
18-Oct-2009 |
rpaulo |
Fix KASSERT string to include the real module name.
|
197412 |
22-Sep-2009 |
rpaulo |
Reserve events for XScale.
Reviewed by: jkoshy, gnn
MFC after: 1 week
|
196739 |
01-Sep-2009 |
gnn |
Add counters for the i7 architecture which were accidentally left out of the original commit of i7 support. These are all the counters on pages A-32 and A-33 of the _Intel(R) 64 and IA32 Architectures Software Developer's Manual Vol 3B_, June 2009. Almost all of these counters relate to operations on the L2 cache.
Reviewed by: jkoshy
MFC after: 1 month
|
196224 |
14-Aug-2009 |
jhb |
Adjust the handling of the local APIC PMC interrupt vector:
- Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc() routines in the local APIC code that the hwpmc(4) driver can use to manage the local APIC PMC interrupt vector.
- Do not enable the local APIC PMC interrupt vector by default when HWPMC_HOOKS is enabled. Instead, the hwpmc(4) driver explicitly enables the interrupt when it is successfully initialized and disables the interrupt when it is unloaded. This avoids enabling the interrupt on unsupported CPUs, which may result in spurious NMIs.
Reported by: rnoland
Reviewed by: jkoshy
Approved by: re (kib)
MFC after: 2 weeks
|
195005 |
25-Jun-2009 |
attilio |
Fix a LOR between pmc_sx and proctree/allproc when creating a new thread for the pmclog.
Reported by: Ryan Stone <rstone at sandvine dot com>
Tested by: Ryan Stone <rstone at sandvine dot com>
Sponsored by: Sandvine Incorporated
|
187761 |
27-Jan-2009 |
jeff |
- Add support for nehalem/corei7 cpus. This supports all of the core counters defined in the reference manual. It does not support the 'uncore' events.
Reviewed by: jkoshy
Sponsored by: Nokia
|
186177 |
16-Dec-2008 |
jkoshy |
Bug fixes:
- Initialize variables before use.
- Remove a KASSERT() that could falsely trigger if there are other sources of NMIs in the system.
Efficiency tweak:
- When checking PMCs that overflowed, ignore PMCs that were not configured for sampling.
|
186127 |
15-Dec-2008 |
jkoshy |
- Disambiguate a few panic messages.
- Style fixes: wrap long lines, parenthesize return values.
|
186037 |
13-Dec-2008 |
jkoshy |
- Bug fix: prevent a thread from migrating between CPUs between the time it is marked for user space callchain capture in the NMI handler and the time the callchain capture callback runs.
- Improve code and control flow clarity by invoking hwpmc(4)'s user space callchain capture callback directly from low-level code.
Reviewed by: jhb (kern/subr_trap.c)
Testing (various patch revisions): gnn, Fabien Thomas <fabien dot thomas at netasq dot com>, Artem Belevich <artemb at gmail dot com>
|
185585 |
03-Dec-2008 |
jkoshy |
Fixes for Core2 Extreme support.
Submitted by: "Artem Belevich" <artemb at gmail dot com>
|
185582 |
03-Dec-2008 |
jkoshy |
Add aliases that map architectural event names to fixed function counters.
|
185555 |
02-Dec-2008 |
jkoshy |
- Efficiency tweak: when checking for PMC overflows, only go to hardware for PMCs that have been configured for sampling.
- Bug fix: acknowledge PMC hardware overflows irrespective of the (software) PMC's state.
|
185465 |
30-Nov-2008 |
jkoshy |
Improve a comment.
|
185363 |
27-Nov-2008 |
jkoshy |
- Add support for PMCs in Intel CPUs of Family 6, model 0xE (Core Solo and Core Duo), model 0xF (Core2), model 0x17 (Core2Extreme) and model 0x1C (Atom).
In these CPUs, the actual numbers, kinds and widths of PMCs present need to be queried at run time. Support for specific "architectural" events also needs to be queried at run time.
Model 0xE CPUs support programmable PMCs, subsequent CPUs additionally support "fixed-function" counters.
- Use event names that are close to vendor documentation, taking into account that:
  - events with identical semantics on two or more CPUs in this family can have differing names in vendor documentation,
  - identical vendor event names may map to differing events across CPUs,
  - each type of CPU supports a different subset of measurable events.
Fixed-function and programmable counters both use the same vendor names for events. The use of a class name prefix ("iaf-" or "iap-" respectively) permits these to be distinguished.
- In libpmc, refactor pmc_name_of_event() into a public interface and an internal helper function, for use by log handling code.
- Minor code tweaks: staticize a global, freshen a few comments.
Tested by: gnn
|
185341 |
26-Nov-2008 |
jkim |
Introduce cpu_vendor_id and replace a lot of strcmp(cpu_vendor, "...").
Reviewed by: jhb, peter (early amd64 version)
|
185168 |
22-Nov-2008 |
jkoshy |
Unbreak LINT.
|
184997 |
16-Nov-2008 |
jkoshy |
Print PMC widths in the initialization announcement.
|
184994 |
15-Nov-2008 |
jkoshy |
Correct an oversight: call the MD finalize hook at module unload time.
|
184993 |
15-Nov-2008 |
jkoshy |
Fix assertions.
Reported by: keramida
|
184992 |
15-Nov-2008 |
jkoshy |
Correct an indexing error (a change missed out in #184802).
|
184802 |
09-Nov-2008 |
jkoshy |
- Separate PMC class dependent code from other kinds of machine dependencies. A 'struct pmc_classdep' structure describes operations on PMCs; 'struct pmc_mdep' contains one or more 'struct pmc_classdep' structures depending on the CPU in question.
Inside PMC class dependent code, row indices are relative to the PMCs supported by the PMC class; MI code in "hwpmc_mod.c" translates global row indices before invoking class dependent operations.
- Augment the OP_GETCPUINFO request with the number of PMCs present in a PMC class.
- Move code common to Intel CPUs to file "hwpmc_intel.c".
- Move TSC handling to file "hwpmc_tsc.c".
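The row index translation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the driver's code: struct classdep, its fields and row_to_class() are hypothetical names, and the real 'struct pmc_classdep' also carries per-class operations that are omitted here.

```c
#include <stddef.h>

/* Hypothetical, simplified descriptor for one class of PMCs. */
struct classdep {
	int	cd_first_row;	/* global row index of the class's first PMC */
	int	cd_num;		/* number of PMCs in this class */
};

/*
 * Translate a global row index into a class and a class-relative index.
 * Returns the matching class, or NULL if the row index is out of range.
 */
static const struct classdep *
row_to_class(const struct classdep *classes, size_t nclasses, int global_row,
    int *class_row)
{
	size_t i;

	for (i = 0; i < nclasses; i++) {
		const struct classdep *cd = &classes[i];

		if (global_row >= cd->cd_first_row &&
		    global_row < cd->cd_first_row + cd->cd_num) {
			*class_row = global_row - cd->cd_first_row;
			return (cd);
		}
	}
	return (NULL);
}
```

With one class holding a single PMC at row 0 and a second class holding four PMCs starting at row 1, global row 3 maps to index 2 within the second class.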
|
184801 |
09-Nov-2008 |
jkoshy |
Style tweak.
|
184652 |
04-Nov-2008 |
jhb |
Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed.
In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2).
For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock.
MFC after: 1 month
|
184214 |
23-Oct-2008 |
des |
Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
|
184205 |
23-Oct-2008 |
des |
Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after: 3 months
|
183725 |
09-Oct-2008 |
jkoshy |
- Sparsely number enumerations 'pmc_cputype' and 'pmc_event' in order to reduce ABI disruptions when new cpu types and new PMC events are added in the future.
- Support alternate spellings for PMC events. Derive the canonical spelling of an event name from its enumeration name in 'enum pmc_event'.
- Provide a way for users to disambiguate between identically named events supported by multiple classes of PMCs in a CPU.
- Change libpmc's machine-dependent event specifier parsing code to better support CPUs containing two or more classes of PMC resources.
|
183717 |
09-Oct-2008 |
jkoshy |
Rework pmc-dependent flag handling.
|
183641 |
06-Oct-2008 |
jkoshy |
Correct a typo.
|
183588 |
04-Oct-2008 |
jkoshy |
Fix a typo.
|
183535 |
02-Oct-2008 |
jkoshy |
Correct misspellings.
|
183266 |
22-Sep-2008 |
jkoshy |
Support sparsely numbered CPUs.
Requested by: obrien, alfred (long ago)
|
183033 |
15-Sep-2008 |
jkoshy |
Correct a callchain capture bug on the i386.
On the i386 architecture, the processor only saves the current value of `%esp' on the stack if a privilege switch is necessary when entering the interrupt handler. Thus, `frame->tf_esp' is only valid for an entry from user mode. For interrupts taken in kernel mode, we need to determine the top-of-stack for the interrupted kernel procedure by adding the appropriate offset to the current frame pointer.
Reported by: kris, Fabien Thomas
Tested by: Fabien Thomas <fabien.thomas at netasq dot com>
|
180794 |
25-Jul-2008 |
jeff |
- Provide kernelname as the name for process with P_KTHREAD set as otherwise their textvp is NULL.
Reviewed by: jkoshy
Sponsored by: Nokia
|
177344 |
18-Mar-2008 |
adrian |
Sign-extend the 48-bit AMD PMC counter before applying a 64-bit 2's complement to it.
The 2's complement transform is done so a "count down" sampling interval can be converted into a "count up" PMC value. A 2's complemented 'count down' value is written to the PMC counter; then the read-back counter is reverted via another 2's complement.
PR: kern/121660
Reviewed by: jkoshy
Approved by: jkoshy
MFC after: 1 week
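The arithmetic described in this entry can be sketched in portable C as below. This is an illustration under stated assumptions rather than the driver's code: the macro and function names are hypothetical, and only the 48-bit width is taken from the log message.

```c
#include <stdint.h>

#define	PMC_WIDTH	48	/* counter width in bits, per the log entry */
#define	PMC_MASK	((UINT64_C(1) << PMC_WIDTH) - 1)
#define	PMC_SIGN	(UINT64_C(1) << (PMC_WIDTH - 1))

/* Sign-extend a 48-bit counter value to a signed 64-bit value. */
static int64_t
sign_extend48(uint64_t raw)
{
	raw &= PMC_MASK;
	return ((int64_t)((raw ^ PMC_SIGN) - PMC_SIGN));
}

/* "Count down" interval -> value written to the "count up" counter. */
static uint64_t
reload_count_to_pmc(uint64_t count)
{
	return ((UINT64_C(0) - count) & PMC_MASK);
}

/* Value read back from the counter -> remaining "count down" interval. */
static uint64_t
pmc_to_reload_count(uint64_t raw)
{
	return ((uint64_t)(-sign_extend48(raw)));
}
```

For an interval of 1000 the value written is 2^48 - 1000. Negating the 48-bit read-back as a plain 64-bit quantity, without the sign extension, would leave the upper 16 bits wrong, which is the kind of error the log message describes.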
|
177343 |
18-Mar-2008 |
adrian |
Fix the debugging output - the '0x' was duplicated from the %p option.
|
177161 |
14-Mar-2008 |
jkoshy |
Correct a typo.
|
175294 |
13-Jan-2008 |
attilio |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with a 'thread' argument that is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary.
The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed separately.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
175202 |
10-Jan-2008 |
attilio |
vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependence, helping the next lockmgr() cleanup. The KPI, obviously, changes.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, it is worth saying that the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
174410 |
07-Dec-2007 |
jkoshy |
Add stub functions to unbreak LINT.
|
174395 |
07-Dec-2007 |
jkoshy |
Kernel and hwpmc(4) support for callchain capture.
Sponsored by: FreeBSD Foundation and Google Inc.
|
174071 |
29-Nov-2007 |
jkoshy |
Revert revision 1.4.
Intel CPUs with family 0x6, model 0xE and later (i.e., Intel Core(TM)) have a PMC architecture that differs somewhat from previous CPUs in family 0x6. Even though the basic programming model is similar, the documented set of legal values that may be loaded into their PMC MSRs differs from that of the previous PMCs in family 0x6 and reusing bit values valid for the older PMCs could result in undefined behaviour in the general case.
|
172836 |
20-Oct-2007 |
julian |
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
|
170307 |
05-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
- Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
168856 |
19-Apr-2007 |
jkoshy |
Fix witness(4) warnings about mutex use.
Group mutexes used in hwpmc(4) into 3 "types" in the sense of witness(4):
- leaf spin mutexes---only one of these should be held at a time, so these mutexes are specified as belonging to a single witness type "pmc-leaf".
- `struct pmc_owner' descriptors are protected by a spin mutex of witness type "pmc-owner-proc". Since we call wakeup_one() while holding these mutexes, the witness type of these mutexes needs to dominate that of "sleepq chain" mutexes.
- logger threads use a sleep mutex, of type "pmc-sleep".
Submitted by: wkoszek (earlier patch)
|
167086 |
27-Feb-2007 |
jhb |
Use pause() rather than tsleep() on stack variables and function pointers.
|
164033 |
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
162383 |
17-Sep-2006 |
rwatson |
Declare security and security.bsd sysctl hierarchies in sysctl.h along with other commonly used sysctl name spaces, rather than declaring them all over the place.
MFC after: 1 month
Sponsored by: nCircle Network Security, Inc.
|
158458 |
11-May-2006 |
jhb |
First pass at removing Alpha kernel support.
|
157815 |
17-Apr-2006 |
jhb |
Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of its current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway.
The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.
|
157651 |
11-Apr-2006 |
jkoshy |
Fix a cut-n-paste bug that crept in.
Reported by: "Pawel Worach" pawel.worach at gmail.com
|
157454 |
04-Apr-2006 |
ps |
Add support for Intel CPU models 5 & 6.
Approved by: jkoshy
|
157210 |
28-Mar-2006 |
jkoshy |
Forcibly turn off all PMCs at module unload time.
MFC after: 1 week
|
157144 |
26-Mar-2006 |
jkoshy |
MFP4: Support for profiling dynamically loaded objects.
Kernel changes:
Inform hwpmc of executable objects brought into the system by kldload() and mmap(), and of their removal by kldunload() and munmap(). A helper function linker_hwpmc_list_objects() has been added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve the list of currently loaded kernel modules.
The unused `MAPPINGCHANGE' event has been deprecated in favour of separate `MAP_IN' and `MAP_OUT' events; this change reduces space wastage in the log.
Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to handle the map change callbacks.
Change the default per-cpu sample buffer size to hold 32 samples (up from 16).
Increment __FreeBSD_version.
libpmc(3) changes:
Update libpmc(3) to deal with the new events in the log file; bring the pmclog(3) manual page in sync with the code.
pmcstat(8) changes:
Introduce new options to pmcstat(8): "-r" (root fs path), "-M" (mapfile name), "-q"/"-v" (verbosity control). Option "-k" now takes a kernel directory as its argument but will also work with the older invocation syntax.
Rework string handling in pmcstat(8) to use an opaque type for interned strings. Clean up ELF parsing code and add support for tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).
Report statistics at the end of a log conversion run depending on the requested verbosity level.
Reviewed by: jhb, dds (kernel parts of an earlier patch)
Tested by: gallatin (earlier patch)
|
156834 |
18-Mar-2006 |
jkoshy |
When deconfiguring a log, only stop PMCs that are in the RUNNING state.
|
156778 |
16-Mar-2006 |
jkoshy |
When compiled with -DDEBUG, only print the old value of a PMC in a debugging message if the flag PMC_F_OLDVALUE was specified in the PMC_OP_RW request being acted upon. This should fix Coverity bug CID 671.
Found by: Coverity Prevent
MFC after: 3 weeks
|
156466 |
09-Mar-2006 |
jkoshy |
When a process is de-configuring a log file, also stop all of its PMCs that require a log file to operate. This change should fix PR 90269.
PR: kern/90269
MFC after: 1 week
|
154483 |
17-Jan-2006 |
jkoshy |
Fix a memory leak.
Found by: Coverity
|
153735 |
26-Dec-2005 |
jkoshy |
- Plug a memory leak: free up per-cpu sample buffers at module unload time.
- Correct a few style nits.
|
153728 |
26-Dec-2005 |
jkoshy |
Wrap comment lines to be under 80 characters wide.
MFC after: 3 days
|
153110 |
05-Dec-2005 |
ru |
Fix -Wundef warnings found when compiling i386 LINT, GENERIC and custom kernels.
|
152584 |
18-Nov-2005 |
ps |
Add support for a new/unreleased Pentium-M.
Reviewed by: jkoshy
|
151205 |
10-Oct-2005 |
jkoshy |
Fix initialization on multi-core HTT CPUs.
Reported by: ps
Tested by: ps
|
150050 |
12-Sep-2005 |
jkoshy |
Process one NMI interrupt per handler invocation as the processor 'buffers' pending NMIs from multiple interrupting PMCs and delivers them serially.
Reported by: Olivier Crameri <olivier.crameri@epfl.ch>
MFC after: 3 days
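The "one NMI per invocation" policy can be illustrated with a small userland model. This is only a sketch under assumptions: struct counter and handle_one_nmi() are hypothetical names, and real handlers read overflow state from hardware rather than from a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software model of one PMC. */
struct counter {
	bool		overflowed;	/* an overflow interrupt is pending */
	unsigned	samples;	/* samples recorded so far */
};

/*
 * Service at most one overflowed counter per call.  Returns true if this
 * invocation claimed an interrupt, false if nothing was pending.
 */
static bool
handle_one_nmi(struct counter *c, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (c[i].overflowed) {
			c[i].overflowed = false;
			c[i].samples++;
			return (true);	/* leave the rest for the next NMI */
		}
	}
	return (false);
}
```

If the handler instead serviced every overflowed counter in one pass, the NMIs that the processor had already queued for the other counters would arrive later with nothing left to claim.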
|
149527 |
27-Aug-2005 |
jkoshy |
Re-enable sampling on the AMD64.
|
149375 |
22-Aug-2005 |
jkoshy |
On x86 processors, turn off any 'INTERRUPT' capabilities on PMCs if the CPU does not have its local APIC enabled.
MFC after: 3 days
|
149374 |
22-Aug-2005 |
jkoshy |
Return EOPNOTSUPP instead of EINVAL if a PMC allocation request specifies a PMC capability (e.g., sampling) that is not supported by hardware. Return EINVAL early if the PMC class passed in is not recognized.
MFC after: 3 days
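The error code distinction in this entry can be sketched as follows. The capability bits, struct pmc_class_desc and check_alloc_request() are hypothetical names chosen for illustration; only the choice of EINVAL for an unrecognized class and EOPNOTSUPP for an unsupported capability comes from the log message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define	CAP_COUNT	0x1u
#define	CAP_INTERRUPT	0x2u	/* needed for sampling */

/* Hypothetical description of one PMC class. */
struct pmc_class_desc {
	int		pcd_class;	/* class identifier */
	uint32_t	pcd_caps;	/* capabilities the hardware offers */
};

static int
check_alloc_request(const struct pmc_class_desc *tbl, int n, int cls,
    uint32_t wanted)
{
	int i;

	for (i = 0; i < n; i++) {
		if (tbl[i].pcd_class != cls)
			continue;
		/* Known class: fail only if a requested capability is absent. */
		if ((wanted & ~tbl[i].pcd_caps) != 0)
			return (EOPNOTSUPP);
		return (0);
	}
	return (EINVAL);	/* the class itself was not recognized */
}
```

A caller can then tell a malformed request (EINVAL) apart from a well-formed request that this hardware cannot satisfy (EOPNOTSUPP).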
|
149373 |
22-Aug-2005 |
jkoshy |
Print PMC capabilities at module load time.
MFC after: 3 days
|
149360 |
22-Aug-2005 |
jkoshy |
Turn off sampling modes on the AMD64 till the time I can track down the reason for the double fault seen when sampling under load.
MFC after: 3 days
|
148562 |
30-Jul-2005 |
jkoshy |
Fail the module loading process if the currently executing kernel was not compiled with 'options HWPMC_HOOKS' or if the compiled-in version numbers of the kernel and module are out of sync.
Reported by: cracauer MFC after: 3 days
|
148088 |
17-Jul-2005 |
jkoshy |
Use LK_CANRECURSE since when a PMC-owning process performs an exec, the new text vnode is already locked by itself.
MFC after: 3 days
|
148067 |
15-Jul-2005 |
jhb |
Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct.
MFC after: 3 days
Tested on: i386, alpha, sparc64
Compiled on: ia64, powerpc, amd64
Kernel toolchain busted on: arm
|
147989 |
14-Jul-2005 |
jkoshy |
Fix breakage introduced in rev 1.7.
MFC after: 3 days
|
147867 |
09-Jul-2005 |
jkoshy |
sys/dev/hwpmc/hwpmc_{amd,piv,ppro}.c:
- Update driver interrupt statistics correctly.
sys/sys/pmc.h, sys/dev/hwpmc/hwpmc_mod.c:
- Fix a bug affecting debug printfs.
- Move the 'stalled' flag from being in a bit in the 'pm_flags' field of a 'struct pmc' to a field of its own in the same structure. This flag is updated from the NMI handler and keeping it separate makes it easier to avoid races with other parts of the code.
sys/dev/hwpmc/hwpmc_logging.c:
- Do arithmetic with 'uintptr_t' types rather than casting to and from 'char *'.
Approved by: re (scottl)
|
147759 |
03-Jul-2005 |
jkoshy |
- Update the CPU version check to recognize P4/EMT64 CPUs. [1]
- Allow libpmc(3) to support P4/EMT64 PMCs on the amd64 architecture and AMD K8 PMCs on the i386. [2]
Submitted by: ps [1]
Pointy hat: myself [2]
Approved by: re (scottl)
|
147708 |
30-Jun-2005 |
jkoshy |
MFP4:
- pmcstat(8) gprof output mode fixes:
lib/libpmc/pmclog.{c,h}, sys/sys/pmclog.h:
+ Add an 'is_usermode' field to the PMCLOG_PCSAMPLE event.
+ Add an 'entryaddr' field to the PMCLOG_PROCEXEC event, so that pmcstat(8) can determine where the runtime loader /libexec/ld-elf.so.1 is getting loaded.
sys/kern/kern_exec.c:
+ Use a local struct to pass the entry address of the image being exec()'ed and the process credential changed flag to the exec handling hook inside hwpmc(4).
usr.sbin/pmcstat/*:
+ Support "-k kernelpath", "-D sampledir".
+ Implement the ELF bits of 'gmon.out' profile generation in a new file "pmcstat_log.c". Move all log related functions to this file.
+ Move local definitions and prototypes to "pmcstat.h".
- Other bug fixes:
+ lib/libpmc/pmclog.c: correctly handle EOF in pmclog_read().
+ sys/dev/hwpmc_mod.c: unconditionally log a PROCEXIT event to all attached PMCs when a process exits.
+ sys/sys/pmc.h: correct a function prototype.
+ Improve usage checks in pmcstat(8).
Approved by: re (blanket hwpmc)
|
147510 |
21-Jun-2005 |
jkoshy |
Fix a -Wuninitialized warning reported by rwatson.
Approved by: re (blanket hwpmc)
|
147191 |
09-Jun-2005 |
jkoshy |
MFP4:
- Implement sampling modes and logging support in hwpmc(4).
- Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code.
- New pmcstat(8) options: -E (exit time counts), -W (counts every context switch), -R (print log file).
- pmc(3) API changes that improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events.
- bug fixes & documentation.
|
146799 |
30-May-2005 |
jkoshy |
Kernel hooks to support PMC sampling modes.
Reviewed by: alc
|
145774 |
01-May-2005 |
jkoshy |
Add convenience APIs pmc_width() and pmc_capabilities() to -lpmc. Have pmcstat(8) and pmccontrol(8) use these APIs.
Return PMC class-related constants (PMC widths and capabilities) with the OP GETCPUINFO call leaving OP PMCINFO to return only the dynamic information associated with a PMC (i.e., whether enabled, owner pid, reload count etc.).
Allow pmc_read() (i.e., OPS PMCRW) on active self-attached PMCs to get up-to-date values from hardware since we can guarantee that the hardware is running the correct PMC at the time of the call.
Bug fixes:
- (x86 class processors) Fix a bug that prevented an RDPMC instruction from being recognized as permitted till after the attached process had context switched out and back in again after a pmc_start() call.
Tighten the rules for using RDPMC class instructions: a GETMSR OP is now allowed only after an OP ATTACH has been done by the PMC's owner to itself. OP GETMSR is not allowed for PMCs that track descendants, or for PMCs attached to processes other than their owner processes.
- (P4/HTT processors only) Fix a bug that caused the MI and MD layers to get out of sync. Add a new MD operation 'get_config()' as part of this fix.
- Allow multiple system-mode PMCs at the same row-index but on different CPUs to be allocated.
- Reject allocation of an administratively disabled PMC.
Misc. code cleanups and refactoring. Improve a few comments.
|
145615 |
28-Apr-2005 |
jkoshy |
Return the correct register number in the 'get_msr()' MD function.
Only allow a process to use the x86 RDPMC instruction if it has allocated and attached a PMC to itself.
Inform the MD layer of the "pseudo context switch out" that needs to be done when the last thread of a process is exiting.
|
145338 |
20-Apr-2005 |
marcel |
Include <sys/pmc.h> instead of <machine/pmc_mdep.h>. The MI header includes the MD header for us. Do not include <machine/specialreg.h> as it is not a header file that can be included from MI files. It is included from <machine/pmc_mdep.h> if so needed and possible.
Ok'd: jkoshy@
|
145313 |
20-Apr-2005 |
jkoshy |
Remove dead variable.
|
145303 |
19-Apr-2005 |
imp |
Remove unused variable that was horking up the LINT build
|
145301 |
19-Apr-2005 |
imp |
Minimal changes to get this to compile with -DDEBUG defined, as well as hack around a couple of used-before-set warnings for LINT happiness.
|
145256 |
19-Apr-2005 |
jkoshy |
Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities and documentation into -CURRENT.
Bump FreeBSD_version.
Reviewed by: alc, jhb (kernel changes)
|