345868 |
04-Apr-2019 |
markj |
MFC r345359, r345384: Don't attempt to measure TSC skew when running as a VM guest.
PR: 218452 |
328386 |
25-Jan-2018 |
pkelsey |
MFC r316648:
Corrected misspelled versions of rendezvous.
The MFC maintains smp_no_rendevous_barrier() as a symbol alias of smp_no_rendezvous_barrier().
__FreeBSD_version bumped to indicate presence of the new name smp_no_rendezvous_barrier().
Reviewed by: gnn, jhb (email), kib Differential Revision: https://reviews.freebsd.org/D10313 |
327492 |
02-Jan-2018 |
markj |
MFC r326935: Avoid CPU migration in dtrace_gethrtime() on x86. |
315011 |
10-Mar-2017 |
markj |
MFC r313841, r313850: Prevent CPU migration when checking the DTrace nofault flag on x86. |
302408 |
08-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
299746 |
14-May-2016 |
jhb |
Add an EARLY_AP_STARTUP option to start APs earlier during boot.
Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot.
This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed.
This last feature fixes a problem on with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs.
However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu.
Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case.
As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code.
These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).
PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix
|
298171 |
17-Apr-2016 |
markj |
Make the second argument of dtrace_invop() a trapframe pointer.
Currently this argument is a pointer into the stack which is used by FBT to fetch the first five probe arguments. On all non-x86 architectures it's simply the trapframe address, so this change has no functional impact. On amd64 it's a pointer into the trapframe such that stack[1 .. 5] gives the first five argument registers, which are deliberately grouped together in the amd64 trapframe definition.
A trapframe argument simplifies the invop handlers on !x86 and makes the x86 FBT invop handler easier to understand. Moreover, it allows for invop handlers that may want to modify the register set of the interrupted thread.
|
297770 |
10-Apr-2016 |
markj |
Initialize DTrace hrtimer frequency during SI_SUB_CPU on i386 and amd64.
This allows the hrtimer to be used earlier during boot. This is required for boot-time DTrace: anonymous enablings are created during SI_SUB_DTRACE_ANON, which runs before APs are started. In particular, the DTrace deadman timer requires that the hrtimer be functional.
MFC after: 2 weeks
|
296990 |
17-Mar-2016 |
markj |
Remove unused variables dtrace_in_probe and dtrace_in_probe_addr.
|
291057 |
19-Nov-2015 |
markj |
Fix a bug in the amd64 dtrace_getarg() implementation: when unwinding the stack, take into account the copy of rsi pushed between the breakpoint trapframe and the dtrace_invop frame. Prior to r287644, this was covered by the fact that sizeof(struct amd64_frame) was 24 rather than 16.
Reported by: smh
|
288361 |
29-Sep-2015 |
avg |
dtrace_getarg: remove stray return statement on amd64, powerpc
MFC after: 10 days
|
287644 |
11-Sep-2015 |
markj |
Remove the arg0 field from struct amd64_frame. Its existence was a bug, since on amd64 the first argument to a function is generally not on the stack.
Revert an old DTrace bug fix to some code that assumed that sizeof(struct amd64_frame) == 16.
Reviewed by: jhb, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3255
|
285643 |
16-Jul-2015 |
kib |
When checking for the valid value of the frame pointer, verify that it belongs to the kernel stack address range for the thread. Right now, code checks that new frame is not farther then KSTACK_PAGES pages from the current frame, which allows the address to point past the top of the stack.
Reviewed by: andrew, emaste, markj Differential revision: https://reviews.freebsd.org/D3108 Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
283509 |
25-May-2015 |
markj |
Remove unused references to calltrap.
MFC after: 3 days
|
282744 |
10-May-2015 |
markj |
Remove some commented-out upstream code for handling traps from usermode DTrace probes. This handling is already done in trap() on i386 and amd64.
|
281916 |
24-Apr-2015 |
markj |
Fix DTrace's panic() action.
It would previously call into some unfinished Solaris compatibility code and return without actually calling panic(9). The compatibility code is unneeded, however, so just remove it and have dtrace_panic() call vpanic(9) directly.
Differential Revision: https://reviews.freebsd.org/D2349 Reviewed by: avg MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division
|
280834 |
30-Mar-2015 |
markj |
Import a missing piece of commit b8fac8e162eda7e98d from illumos-gate.
This adds an upper bound, dtrace_ustackdepth_max, to the number of frames traversed when computing the userland stack depth. Some programs - notably firefox - are otherwise able to trigger an infinite loop in dtrace_getustack_common(), causing a panic.
MFC after: 1 week
|
277300 |
17-Jan-2015 |
smh |
Mechanically convert cddl sun #ifdef's to illumos
Since the upstream for cddl code is now illumos not sun, mechanically convert all sun #ifdef's to illumos #ifdef's which have been used in all newer code for some time.
Also do a manual pass to correct the use if #ifdef comments as per style(9) as well as few uses of #if defined(__FreeBSD__) vs #ifndef illumos.
MFC after: 1 month Sponsored by: Multiplay
|
276142 |
23-Dec-2014 |
markj |
Restore the trap type argument to the DTrace trap hook, removed in r268600. It's redundant at the moment since it can be obtained from the trapframe on the architectures where DTrace is supported, but this won't be the case with ARM.
|
268869 |
19-Jul-2014 |
markj |
Use a C wrapper for trap() instead of checking and calling the DTrace trap hook in assembly.
Suggested by: kib Reviewed by: kib (original version) X-MFC-With: r268600
|
268600 |
14-Jul-2014 |
markj |
Invoke the DTrace trap handler before calling trap() on amd64. This matches the upstream implementation and helps ensure that a trap induced by tracing fbt::trap:entry is handled without recursively generating another trap.
This makes it possible to run most (but not all) of the DTrace tests under common/safety/ without triggering a kernel panic.
Submitted by: Anton Rang <anton.rang@isilon.com> (original version) Phabric: D95
|
267759 |
23-Jun-2014 |
markj |
Fix a couple of bugs on amd64 when fetching probe arguments beyond the first five for probes entered through a UD fault (i.e. FBT probes).
Specifically, handle the fact that dtrace_invop_callsite must be 16 byte-aligned and thus may not immediately follow the call to dtrace_invop() in dtrace_invop_start(). Also fetch register arguments and the stack pointer through a struct trapframe instead of a struct reg.
PR: 191260 Submitted by: luke.tw@gmail.com MFC after: 3 weeks
|
262542 |
27-Feb-2014 |
markj |
Move some files that are identical on i386 and amd64 to an x86 subdirectory rather than keeping duplicate copies.
Discussed with: avg MFC after: 1 week
|
257417 |
31-Oct-2013 |
markj |
Remove references to an unused fasttrap probe hook, and remove the corresponding x86 trap type. Userland DTrace probes are currently handled by the other fasttrap hooks (dtrace_pid_probe_ptr and dtrace_return_probe_ptr).
Discussed with: rpaulo
|
256822 |
21-Oct-2013 |
markj |
When fetching function arguments out of a frame on amd64, explicitly select the register based on the argument index rather than relying on the fields in struct reg to be in the right order. This assumption is incorrect on FreeBSD and generally led to bogus argument values for the sixth argument of PID and USDT probes; the first five are passed directly to dtrace_probe() via the fasttrap trap handler and so were correctly handled.
MFC after: 2 weeks
|
253772 |
29-Jul-2013 |
avg |
dtrace disassembler: take the latest/last CDDL code from OpenSolaris
OpenSolaris version is: 13108:33bb8a0301ab 6762020 Disassembly support for Intel Advanced Vector Extensions (AVX)
This corresponds to Illumos-gate (github) version ab47273fedff893c8ae22ec39ffc666d4fa6fc8b
MFC after: 3 weeks
|
251238 |
02-Jun-2013 |
markj |
SDT probes can directly pass up to five arguments as arguments to dtrace_probe(). Arguments beyond these five must be obtained in an architecture-specific way; this can be done through the getargval provider method, and through dtrace_getarg() if getargval isn't overridden.
This change fixes two off-by-one bugs in the way these arguments are fetched in FreeBSD's DTrace implementation. First, the SDT provider must set the aframes parameter to 1 when creating a probe. The aframes parameter controls the number of frames that dtrace_getarg() will step over in order to find the frame containing the extra arguments. On FreeBSD, dtrace_getarg() is called in SDT probe context via
dtrace_probe()->dtrace_dif_emulate()->dtrace_dif_variable->dtrace_getarg()
so aframes must be 3 since the arguments are in dtrace_probe()'s frame; it was previously being called with a value of 2 instead. illumos uses a different aframes value for SDT probes, but this is because illumos SDT probes fire by triggering the #UD fault handler rather than calling dtrace_probe() directly.
The second bug has to do with the way arguments are grabbed out dtrace_probe()'s frame on amd64. The code currently jumps over the first stack argument and retrieves the rest of them using a pointer into the stack. This works on i386 because all of dtrace_probe()'s arguments will be on the stack and the first argument is the probe ID, which should be ignored. However, it is incorrect to ignore the first stack argument on amd64, so we correct the pointer used to access the arguments.
MFC after: 2 weeks
|
238552 |
17-Jul-2012 |
gnn |
Change UL to ULL since time is 32 bits.
Pointed out by: avg@ MFC after: 2 weeks
|
238537 |
16-Jul-2012 |
gnn |
Add support for walltimestamp in DTrace.
Submitted by: Fabian Keil MFC after: 2 weeks
|
238169 |
06-Jul-2012 |
avg |
r237748 continuation: fix nopw (0f 1f) behavior with respect to modifiers
To do: proper merge with Illumos vendor area.
Reported by: emaste Tested by: emaste Obtained from: Illumos commit 13442:4adbe6de60c8 MFC after: 5 days
|
238168 |
06-Jul-2012 |
avg |
r237748 continuation: segment-override prefixes are not invalid in long mode
Update DTrace disassembler accordingly. The code to treat the prefixes as null prefixes was already in place. Although in practice compilers seem to generate only cs-prefix for use in long NOPs, the same treatment is applied to all of cs, ds, es, ss for consistency.
Reported by: emaste Tested by: emaste Obtained from: Illumos commit 13442:4adbe6de60c8 (+ local changes) MFC after: 5 days
|
237748 |
29-Jun-2012 |
avg |
dtrace instruction decoder: add 0x0f 0x1f NOP opcode support
According to the AMD manual the whole range from 0x09 to 0x1f are NOPs. Intel manual mentions only 0x1f. Use only Intel one for now, it seems to be the one actually generated by compilers. Use gdb mnemonic for the operation: "nopw".
[1] AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions [2] Software Optimization Guide for AMD Family 10h Processors [3] Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z
Tested by: Fabian Keil <freebsd-listen@fabiankeil.de> (earlier version) MFC after: 3 days
|
236567 |
04-Jun-2012 |
gnn |
Integrate a fix for a very odd signal delivery problem found by Bryan Cantril and others in the Solaris/Illumos version of DTrace.
Obtained from: https://www.illumos.org/issues/789 MFC after: 2 weeks
|
236566 |
04-Jun-2012 |
zml |
Fix DTrace TSC skew calculation:
The skew calculation here is exactly backwards. We were able to repro it on a multi-package ESX server running a FreeBSD VM, where the TSCs can be pretty evil.
MFC after: 1 week
Submitted by: Jeff Ford <jeffrey.ford2@isilon.com> Reviewed by: avg, gnn
|
223758 |
04-Jul-2011 |
attilio |
With retirement of cpumask_t and usage of cpuset_t for representing a mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.
Remove them and replace their usage with custom pc_cpuid magic (as, atm, pc_cpumask can be easilly represented by (1 << pc_cpuid) and pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).
This change is not targeted for MFC because of struct pcpu members removal and dependency by cpumask_t retirement.
MD review by: marcel, marius, alc Tested by: pluknet MD testing by: marcel, marius, gonzo, andreast
|
222813 |
07-Jun-2011 |
attilio |
etire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are capped to 32. MAXCPU, then, can now arbitrarely bumped to whatever value. Anyway, as long as several structures in the kernel are statically allocated and sized as MAXCPU, it is suggested to keep it as low as possible for the time being.
Technical notes on this commit itself: - More functions to handle with cpuset_t objects are introduced. The most notable are cpusetobj_ffs() (which calculates a ffs(3) for a cpuset_t object), cpusetobj_strprint() (which prepares a string representing a cpuset_t object) and cpusetobj_strscan() (which creates a valid cpuset_t starting from a string representation). - pc_cpumask and pc_other_cpus are target to be removed soon. With the moving from cpumask_t to cpuset_t they are now inefficient and not really useful. Anyway, for the time being, please note that access to pcpu datas is protected by sched_pin() in order to avoid migrating the CPU while reading more than one (possible) word - Please note that size of cpuset_t objects may differ between kernel and userland. While this is not directly related to the patch itself, it is good to understand that concept and possibly use the patch as a reference on how to deal with cpuset_t objects in userland, when accessing kernland members. - KTR_CPUMASK is changed and now is represented through a string, to be set as the example reported in NOTES.
Please additively note that no MAXCPU is bumped in this patch, but private testing has been done until to MAXCPU=128 on a real 8x8x2(htt) machine (amd64).
Please note that the FreeBSD version is not yet bumped because of the upcoming pcpu changes. However, note that this patch is not targeted for MFC.
People to thank for the time spent on this patch: - sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested several revision of the patches and really helped in improving stability of this work. - marius fixed several bugs in the sparc64 implementation and reviewed patches related to ktr. - jeff and jhb discussed the basic approach followed. - kib and marcel made targeted review on some specific part of the patch. - marius, art, nwhitehorn and andreast reviewed MD specific part of the patch. - marius, andreast, gonzo, nwhitehorn and jceel tested MD specific implementations of the patch. - Other people have made contributions on other patches that have been already committed and have been listed separately.
Companies that should be mentioned for having participated at several degrees: - Yahoo! for having offered the machines used for testing on big count of CPUs. - The FreeBSD Foundation for having sponsored my devsummit attendance, which has been instrumental. - Sandvine for having offered offices and infrastructure during development.
(I really hope I didn't forget anyone, if it happened I apologize in advance).
|
221740 |
10-May-2011 |
avg |
dtrace: remove unused code
Which is also useless, IMO.
MFC after: 5 days
|
220433 |
07-Apr-2011 |
jkim |
Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. More worse, it can be easily changed by user with 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).
|
218909 |
21-Feb-2011 |
brucec |
Fix typos - remove duplicate "the".
PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
|
216251 |
07-Dec-2010 |
avg |
dtrace_xcall: no need for special handling of curcpu
smp_rendezvous_cpus alreadt does the right thing in a very similar fashion, so the code was kind of duplicating that.
MFC after: 3 weeks
|
216250 |
07-Dec-2010 |
avg |
dtrace_gethrtime_init: pin to master while examining other CPUs
Also use pc_cpumask to be future-friendly.
Reviewed by: jhb MFC after: 2 weeks
|
211608 |
22-Aug-2010 |
rpaulo |
Kernel DTrace support for: o uregs (sson@) o ustack (sson@) o /dev/dtrace/helper device (needed for USDT probes)
The work done by me was: Sponsored by: The FreeBSD Foundation
|
211607 |
22-Aug-2010 |
rpaulo |
Add a function compatibility function dtrace_instr_size_isa() that on FreeBSD does the same as dtrace_dis_isize().
Sponsored by: The FreeBSD Foundation
|
209059 |
11-Jun-2010 |
jhb |
Update several places that iterate over CPUs to use CPU_FOREACH().
|
195710 |
15-Jul-2009 |
avg |
dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds
Currently dtrace_gethrtime uses formula similar to the following for converting TSC ticks to nanoseconds: rdtsc() * 10^9 / tsc_freq The dividend overflows 64-bit type and wraps-around every 2^64/10^9 = 18446744073 ticks which is just a few seconds on modern machines.
Now we instead use precalculated scaling factor of 10^9*2^N/tsc_freq < 2^32 and perform TSC value multiplication separately for each 32-bit half. This allows to avoid overflow of the dividend described above. The idea is taken from OpenSolaris. This has an added feature of always scaling TSC with invariant value regardless of TSC frequency changes. Thus the timestamps will not be accurate if TSC actually changes, but they are always proportional to TSC ticks and thus monotonic. This should be much better than current formula which produces wildly different non-monotonic results on when tsc_freq changes.
Also drop write-only 'cp' variable from amd64 dtrace_gethrtime_init() to make it identical to the i386 twin.
PR: kern/127441 Tested by: Thomas Backman <serenity@exscape.org> Reviewed by: jhb Discussed with: current@, bde, gnn Silence from: jb Approved by: re (gnn) MFC after: 1 week
|
194850 |
24-Jun-2009 |
avg |
dtrace/amd64: fix virtual address checks
On amd64 KERNBASE/kernbase does not mean start of kernel memory. This should fix a KASSERT panic in dtrace_copycheck when copyin*() is used in D program. Also make checks for user memory a bit stricter.
Reported by: Thomas Backman <serenity@exscape.org> Submitted by: wxs (kaddr part) Tested by: Thomas Backman (prototype), wxs Reviewed by: alc (concept), jhb, current@ Aprroved by: jb (concept) MFC after: 2 weeks PR: kern/134408
|
179237 |
23-May-2008 |
jb |
Custom DTrace kernel module files plus FreeBSD-specific DTrace providers.
|