#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
|
#
292943 |
|
30-Dec-2015 |
marius |
- (Ab)use udivx for dividing the u_int pc_cpuid when implementing CPU_ISSET(), CPU_SET etc. in sparc64 asm. This approach has the benefit of not clobbering %y, allowing to revert r222827 and partially r222828.
- In r222828, CATR() already was changed to use the equivalent of PCPU_GET(cpuid) instead of the MD module ID for KTR_CPU, so belatedly also catch up with the C side of ktr(9). Originally, in r203838 CATR() was moved away from directly reading the module ID or equivalent as that became impractical with other CPU types than USI/II supported. With r222828 in place, per-CPU data generally is set up soon enough, though, that employing PCPU things in ktr(9) also for use during early stages works.
- Unfortunately, an exception to the latter is the ktr(9) use in pmap_bootstrap(), which actually is run so early that even checking for bootverbose being set via the loader doesn't work. Consequently, replace the ktr(9) use in pmap_bootstrap() with OF_printf(9) and put it under #ifdef DIAGNOSTIC instead.
MFC after: 3 days
|
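The division mentioned in r292943 is the usual split of a CPU ID into a word index (quotient) and a bit position (remainder) within a multi-word set. A minimal C sketch of that arithmetic follows, assuming 64-bit words and a capacity of 128 CPUs. The names demo_mask, demo_cpu_set() and demo_cpu_isset() are hypothetical and are not the kernel's CPU_SET()/CPU_ISSET() macros or the sparc64 asm.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAXCPU	128	/* assumed capacity of the set */
#define DEMO_WORDBITS	64	/* assumed width of one word */
#define DEMO_NWORDS	(DEMO_MAXCPU / DEMO_WORDBITS)

/* Hypothetical multi-word CPU set; not the kernel's cpuset_t. */
struct demo_mask {
	uint64_t bits[DEMO_NWORDS];
};

/* The quotient selects the word, the remainder selects the bit. */
void
demo_cpu_set(struct demo_mask *m, unsigned int cpuid)
{
	assert(cpuid < DEMO_MAXCPU);
	m->bits[cpuid / DEMO_WORDBITS] |= UINT64_C(1) << (cpuid % DEMO_WORDBITS);
}

/* Return 1 if the bit for cpuid is set, 0 otherwise. */
int
demo_cpu_isset(const struct demo_mask *m, unsigned int cpuid)
{
	assert(cpuid < DEMO_MAXCPU);
	return ((int)((m->bits[cpuid / DEMO_WORDBITS] >>
	    (cpuid % DEMO_WORDBITS)) & 1));
}
```

With these assumptions, CPU 70 lands in word 1 at bit 6.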
#
226054 |
|
06-Oct-2011 |
marius |
- Use atomic operations rather than sched_lock for safely assigning pm_active and pc_pmap for SMP. This is key to allowing adding support for SCHED_ULE. Thanks go to Peter Jeremy for additional testing.
- Add support for SCHED_ULE to cpu_switch().
Committed from: 201110DevSummit
|
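The idea in r226054 of updating an "active on these CPUs" mask with atomic operations instead of under a global lock can be sketched in portable C11. This is only an illustration under simplifying assumptions: the mask is a single 64-bit word rather than a cpuset_t, and demo_pmap, demo_pmap_activate() and demo_pmap_deactivate() are hypothetical names, not the kernel's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical pmap with a one-word active mask (assumes <= 64 CPUs). */
struct demo_pmap {
	_Atomic uint64_t pm_active;	/* one bit per CPU using this pmap */
};

/* Mark the pmap active on cpuid without taking any lock. */
void
demo_pmap_activate(struct demo_pmap *pm, unsigned int cpuid)
{
	assert(cpuid < 64);
	atomic_fetch_or(&pm->pm_active, UINT64_C(1) << cpuid);
}

/* Mark the pmap inactive on cpuid; other CPUs' bits are left untouched. */
void
demo_pmap_deactivate(struct demo_pmap *pm, unsigned int cpuid)
{
	assert(cpuid < 64);
	atomic_fetch_and(&pm->pm_active, ~(UINT64_C(1) << cpuid));
}
```

Because each CPU only ever changes its own bit with a read-modify-write, concurrent switches on different CPUs cannot lose each other's updates.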
#
222813 |
|
07-Jun-2011 |
attilio |
Retire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are capped to 32. MAXCPU, then, can now be arbitrarily bumped to whatever value. Anyway, as long as several structures in the kernel are statically allocated and sized as MAXCPU, it is suggested to keep it as low as possible for the time being.
Technical notes on this commit itself:
- More functions to handle cpuset_t objects are introduced. The most notable are cpusetobj_ffs() (which calculates a ffs(3) for a cpuset_t object), cpusetobj_strprint() (which prepares a string representing a cpuset_t object) and cpusetobj_strscan() (which creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are targeted for removal soon. With the move from cpumask_t to cpuset_t they are now inefficient and not really useful. Anyway, for the time being, please note that access to pcpu data is protected by sched_pin() in order to avoid migrating to another CPU while reading more than one (possible) word.
- Please note that the size of cpuset_t objects may differ between kernel and userland. While this is not directly related to the patch itself, it is good to understand that concept and possibly use the patch as a reference on how to deal with cpuset_t objects in userland, when accessing kernel members.
- KTR_CPUMASK is changed and is now represented through a string, to be set as in the example reported in NOTES.
Please additionally note that MAXCPU is not bumped in this patch, but private testing has been done up to MAXCPU=128 on a real 8x8x2(htt) machine (amd64).
Please note that the FreeBSD version is not yet bumped because of the upcoming pcpu changes. However, note that this patch is not targeted for MFC.
People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested several revisions of the patches and really helped in improving stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted reviews of some specific parts of the patch.
- marius, art, nwhitehorn and andreast reviewed MD specific parts of the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific implementations of the patch.
- Other people have made contributions on other patches that have been already committed and have been listed separately.
Companies that should be mentioned for having participated to various degrees:
- Yahoo! for having offered the machines used for testing on a large number of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance, which has been instrumental.
- Sandvine for having offered offices and infrastructure during development.
(I really hope I didn't forget anyone, if it happened I apologize in advance).
|
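The "ffs(3) for a cpuset_t object" mentioned in r222813 amounts to a find-first-set scan across several words. A minimal sketch follows, assuming the ffs(3) convention of a 1-based result with 0 meaning "no bit set". The type demo_cpuset and the function demo_cpuset_ffs() are hypothetical stand-ins, not the kernel's cpuset_t or cpusetobj_ffs().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NWORDS	4	/* assumed: 4 x 64 bits = 256 CPUs */

/* Hypothetical multi-word set; the kernel's cpuset_t may be laid out differently. */
struct demo_cpuset {
	uint64_t bits[DEMO_NWORDS];
};

/*
 * Return the 1-based index of the lowest set bit across all words,
 * or 0 if the set is empty (the ffs(3) convention).
 */
int
demo_cpuset_ffs(const struct demo_cpuset *set)
{
	assert(set != NULL);
	for (size_t w = 0; w < DEMO_NWORDS; w++) {
		uint64_t word = set->bits[w];
		int bit = 0;

		if (word == 0)
			continue;
		while ((word & 1) == 0) {
			word >>= 1;
			bit++;
		}
		return ((int)(w * 64) + bit + 1);
	}
	return (0);
}
```

A set whose only member is CPU 72 (word 1, bit 8) yields 73, since the result is 1-based.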
#
205258 |
|
17-Mar-2010 |
marius |
- Add TTE and context register bits for the additional page sizes supported by UltraSparc-IV and -IV+ as well as SPARC64 V, VI, VII and VIIIfx CPUs.
- Replace TLB_PCXR_PGSZ_MASK and TLB_SCXR_PGSZ_MASK with TLB_CXR_PGSZ_MASK which just is the complement of TLB_CXR_CTX_MASK instead of trying to assemble it from the page size bits which vary across CPUs.
- Add macros for the remainder of the SFSR bits, which are useful for at least debugging purposes.
|
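Defining the page size mask as the complement of the context mask, as r205258 describes, means a context number can be replaced without knowing which upper bits a given CPU model uses for page sizes. The sketch below assumes a 13-bit context field starting at bit 0 (consistent with the 8192-context limit mentioned in r91613). The DEMO_ macros and demo_cxr_set_ctx() are hypothetical and only mirror the idea.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CXR_CTX_BITS	13	/* assumed width: 2^13 = 8192 contexts */
#define DEMO_CXR_CTX_MASK	((UINT64_C(1) << DEMO_CXR_CTX_BITS) - 1)
#define DEMO_CXR_PGSZ_MASK	(~DEMO_CXR_CTX_MASK)	/* everything else */

/*
 * Install a new context number while preserving whatever the CPU keeps
 * in the remaining bits of the (hypothetical) context register value.
 */
uint64_t
demo_cxr_set_ctx(uint64_t cxr, uint64_t ctx)
{
	assert((ctx & ~DEMO_CXR_CTX_MASK) == 0);
	return ((cxr & DEMO_CXR_PGSZ_MASK) | ctx);
}
```

The two masks are disjoint and together cover all 64 bits, so no per-CPU list of page size bits is needed.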
#
203185 |
|
30-Jan-2010 |
marius |
Implement handling of the third argument of cpu_switch(). This unbreaks sparc64 after r202889.
PR: 143215
MFC after: 1 week
|
#
182878 |
|
08-Sep-2008 |
marius |
For cheetah-class CPUs ensure that the dt512_0 is set to hold 8k pages for all three contexts and configure the dt512_1 to hold 4MB pages for them (e.g. for direct mappings). This might allow for additional optimization by using the faulting page sizes provided by AA_DMMU_TAG_ACCESS_EXT for bypassing the page size walker for the dt512 in the superpage support code.
Submitted by: nwhitehorn (initial patch)
|
#
182877 |
|
08-Sep-2008 |
marius |
USIII and beyond CPUs have stricter requirements when it comes to synchronization needed after stores to internal ASIs in order to make side-effects visible. This mainly requires the MEMBAR #Sync after such stores to be replaced with a FLUSH. We use KERNBASE as the address to FLUSH as it is guaranteed to not trap. Actually, the USII synchronization rules also already require a FLUSH in pretty much all of the cases changed. We're also hitting an additional USIII synchronization rule which requires stores to AA_IMMU_SFSR to be immediately followed by a DONE, FLUSH or RETRY. Doing so triggers a RED state exception though so leave the MEMBAR #Sync. Linux apparently also has gotten away with doing the same for quite some time now, apart from the fact that it's not clear to me why we need to clear the valid bit from the SFSR in the first place.
Reviewed by: nwhitehorn
|
#
181701 |
|
13-Aug-2008 |
marius |
cosmetic changes and style fixes
|
#
166105 |
|
19-Jan-2007 |
marius |
Convert the remainder of the low hanging fruits regarding including headers in .S directly rather than getting to their macros through genassym.c/assym.s so there are fewer headers genassym.c has to be kept in sync with. While at it fix some style(9) bugs (indentation, prototype format, sort headers, etc) and remove trailing whitespace.
|
#
129749 |
|
26-May-2004 |
tmm |
Move the per-CPU vmspace pointer fixup that is required before a struct vmspace is freed from cpu_sched_exit() to pmap_release().
This has the advantage of being able to rely on MI code to decide when a free should occur, instead of having to inspect the reference count ourselves.
At the same time, turn the per-CPU vmspace pointer into a pmap pointer, so that pmap_release() can deal with pmaps exclusively.
Reviewed (and embarrassing bug spotted) by: jake
|
#
114188 |
|
28-Apr-2003 |
jake |
- Fix placement of cvs ids in previous commit to match .S files in libc.
- gcc uses 32 byte alignment for functions regardless of profiling, so follow suit.
|
#
114085 |
|
26-Apr-2003 |
obrien |
I was wrong, the ENTRY bits in asm.h did have a purpose -- for userland. Restore the bits and remove them from asmacros.h. *.S will now be asm.h consumers.
Approved by: jake
|
#
113453 |
|
13-Apr-2003 |
jake |
- Move the routine for flushing all user mappings from the tlb from pmap to the cpu dependent files. It will need to be done differently for USIII.
- Simplify the logic for detecting context rollovers. Instead of dealing with it when the next context switch would cause the context numbers to rollover, deal with it when they actually do rollover.
- Move some things around in cpu_switch so that we only do 1 membar #Sync when switching address space, instead of 2.
- Detect kernel threads by comparing the new vm space to vmspace0, instead of checking if the tlb context is 0.
- Removed some debug code.
|
#
113024 |
|
03-Apr-2003 |
jake |
Add support for saving and restoring kernel floating point state. The state will be saved if we context switch as a result of an interrupt which occurred while using the floating point registers in the kernel (which actually can't happen right now). This allows fp disabled traps in the kernel, which normally shouldn't happen, so make sure the trapping code is what we expect it is.
|
#
113021 |
|
03-Apr-2003 |
jake |
- Generally improve register usage in cpu_switch. Use the 'in' registers for temporaries relating to the state of the new process instead of the outs, so that functions can be called without fear of clobbering them.
- Use savefpctx instead of rolling our own.
|
#
113019 |
|
03-Apr-2003 |
jake |
Don't assume the fp state is at offset 0 in the pcb.
|
#
112993 |
|
02-Apr-2003 |
peter |
Commit a partial lazy thread switch mechanism for i386. It isn't as lazy as it could be and can do with some more cleanup. Currently it's under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process, i.e. we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be: the interrupt overhead is still high and too much still blocks on Giant. There are some debug sysctls, for stats and for an on/off switch.
The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap.
It's not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default.
This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more.
The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it.
One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually separate from the lazy switch stuff.
Glanced at by: jake, jhb
|
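The "select a new thread before calling cpu_switch()" change in r112993 lets the MI code skip the switch entirely when the chosen thread is the one already running. A minimal sketch of that check follows. All names (demo_thread, demo_cpu, demo_mi_switch()) are hypothetical, and the pointer assignment merely stands in for the real cpu_switch().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a thread and for per-CPU scheduler state. */
struct demo_thread {
	int id;
};

struct demo_cpu {
	struct demo_thread *curthread;
	unsigned int switches;		/* number of real switches performed */
};

/*
 * The caller has already chosen 'next'.  Return 1 if a switch was
 * performed, 0 if 'next' is already running and nothing had to be done.
 */
int
demo_mi_switch(struct demo_cpu *cpu, struct demo_thread *next)
{
	assert(cpu != NULL && next != NULL);
	if (next == cpu->curthread)
		return (0);		/* the "silly case": switch to self */
	cpu->curthread = next;		/* stands in for cpu_switch(old, new) */
	cpu->switches++;
	return (1);
}
```

Choosing the thread up front keeps the decision in C and leaves the low-level switch routine with nothing to decide.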
#
112924 |
|
01-Apr-2003 |
jake |
- Add a flags field to struct pcb. Use this to keep track of whether or not the pcb has floating point registers saved in it.
- Implement get_mcontext and set_mcontext.
|
#
112920 |
|
01-Apr-2003 |
jake |
- Rename pcb_fpstate to pcb_ufp (user floating point), and change it to a simple array of 64 ints.
- Use a critical section when saving floating point state in cpu_fork instead of sched_lock.
|
#
112917 |
|
01-Apr-2003 |
jake |
Rename pcb_fp to pcb_sp, so as to not be confused with floating point state.
|
#
105733 |
|
22-Oct-2002 |
jake |
- Expand struct trapframe to 256 bytes, make all fields fixed width and the same size. Add some fields that previously overlapped with something else or were missing.
- Make struct regs and struct mcontext (minus floating point) the same as struct trapframe so converting between them is easy (null).
- Add space for saving floating point state to struct mcontext. This requires that it be 64 byte aligned.
- Add assertions that none of these structures change size, as they are part of the ABI.
- Remove some dead code in sendsig().
- Save and restore %gsr in struct trapframe. Remember to restore %fsr.
- Add some comments to exception.S.
|
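The size assertions described in r105733 can be expressed in C11 as compile-time checks. The sketch below assumes a 256-byte frame made up entirely of 64-bit fields, as the commit message states. The struct layout and field grouping are hypothetical and do not reproduce the real struct trapframe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame: 32 fields of 8 bytes each, all the same width. */
struct demo_trapframe {
	uint64_t tf_global[8];
	uint64_t tf_out[8];
	uint64_t tf_other[16];	/* hypothetical grouping of remaining fields */
};

/* Fail the build, not the boot, if the ABI-visible size ever changes. */
_Static_assert(sizeof(struct demo_trapframe) == 256,
    "demo_trapframe is part of the ABI and must stay 256 bytes");

/* Helper so the size can also be inspected at run time. */
size_t
demo_trapframe_size(void)
{
	return (sizeof(struct demo_trapframe));
}
```

Since every field has the same fixed width, the compiler inserts no padding and the size is simply 32 * 8 bytes.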
#
102040 |
|
18-Aug-2002 |
jake |
Add pmap support for user mappings of multiple page sizes (super pages). This supports all hardware page sizes (8K, 64K, 512K, 4MB), but only 8k pages are actually used as of yet.
|
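The four hardware page sizes listed in r102040 (8K, 64K, 512K, 4MB) each differ from the previous one by a factor of 8, so a size code can be turned into a shift with one multiply and one add. The sketch below assumes size codes 0 through 3 in that order. The DEMO_ names and both functions are hypothetical, not the kernel's TTE macros.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT_8K	13	/* 2^13 = 8K, the base page size */
#define DEMO_SIZE_SPREAD	3	/* each size code is 8x (2^3) the previous */

/* Assumed size codes, smallest page first. */
enum demo_ts { DEMO_TS_8K, DEMO_TS_64K, DEMO_TS_512K, DEMO_TS_4M };

/* Page size in bytes for a size code. */
uint64_t
demo_page_size(unsigned int sz)
{
	assert(sz <= DEMO_TS_4M);
	return (UINT64_C(1) << (DEMO_PAGE_SHIFT_8K + DEMO_SIZE_SPREAD * sz));
}

/* Mask selecting the offset-within-page bits for a size code. */
uint64_t
demo_page_mask(unsigned int sz)
{
	return (demo_page_size(sz) - 1);
}
```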
#
99887 |
|
12-Jul-2002 |
jhb |
Set the thread state of the newly chosen to run thread to TDS_RUNNING in choosethread() in MI C code instead of doing it in assembly in all the various cpu_switch() functions. This fixes problems on ia64 and sparc64.
Reviewed by: julian, peter, benno
Tested on: i386, alpha, sparc64
|
#
99072 |
|
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. To come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. Expect slight instability in signals.
|
#
91781 |
|
07-Mar-2002 |
jake |
Implement kthread context stealing. This is a bit of a misnomer because the context is not actually stolen, as it would be for i386. Instead of deactivating a user vmspace immediately when switching out, and recycling its tlb context, wait until the next context switch to a different user vmspace. In this way we can switch from a user process to any number of kernel threads and back to the same user process again, without losing any of its mappings in the tlb that would not already be knocked by the automatic replacement algorithm. This is not expected to have a measurable performance improvement on the machines we currently run on, but it sounds cool and makes the sparc64 port SMPng buzz word compliant.
|
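The deferral described in r91781 can be sketched as a per-CPU pointer to the last user vmspace that ran: switching to a kernel thread leaves it alone, and only a switch to a different user vmspace replaces it. This is an illustration only. The names demo_vmspace, demo_vmcpu and demo_switch_vm() are hypothetical, and a NULL vmspace is assumed here to denote a kernel thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user address space and per-CPU bookkeeping. */
struct demo_vmspace {
	int id;
};

struct demo_vmcpu {
	const struct demo_vmspace *active;	/* last user vmspace activated */
	unsigned int activations;		/* how often a context was (re)loaded */
};

/*
 * Called on every context switch.  next_vm == NULL means the new thread
 * is a kernel thread (an assumption of this sketch).
 */
void
demo_switch_vm(struct demo_vmcpu *cpu, const struct demo_vmspace *next_vm)
{
	assert(cpu != NULL);
	if (next_vm == NULL)
		return;		/* kernel thread: keep the user context loaded */
	if (next_vm == cpu->active)
		return;		/* same user vmspace: its mappings are still valid */
	cpu->active = next_vm;	/* only now deactivate the old one, load the new */
	cpu->activations++;
}
```

A sequence user A, kernel thread, kernel thread, user A therefore costs a single activation.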
#
91613 |
|
04-Mar-2002 |
jake |
Allocate tlb contexts on the fly in cpu_switch, instead of statically 1 to 1 with pmaps. When the context numbers wrap around we flush all user mappings from the tlb. This makes use of the array indexed by cpuid to allow a pmap to have a different context number on a different cpu. If the context numbers are then divided evenly among cpus such that none are shared, we can avoid sending tlb shootdown ipis in an smp system for non-shared pmaps. This also removes a limit of 8192 processes (pmaps) that could be active at any given time due to running out of tlb contexts.
Inspired by: the brown book
Crucial bugfix from: tmm
|
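The on-the-fly allocation in r91613 can be pictured as a per-CPU counter that hands out context numbers from a private range and flushes all user mappings when it wraps. The sketch below is a simplified model, not the cpu_switch code: the names are hypothetical, the range bounds are parameters rather than the real limits, -1 is assumed to mean "no context", and the flush is represented by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NCPU	2	/* assumed number of CPUs in this sketch */

/* Hypothetical pmap: one context number per CPU, -1 meaning none. */
struct demo_ctxpmap {
	int pm_context[DEMO_NCPU];
};

/* Hypothetical per-CPU allocator state over the range [ctx_min, ctx_max). */
struct demo_ctxcpu {
	int id;
	int ctx_min;
	int ctx_max;
	int next_ctx;
	unsigned int flushes;	/* stands in for flushing user TLB entries */
};

/* Give 'pm' a fresh context on 'cpu', flushing when the numbers wrap. */
int
demo_ctx_alloc(struct demo_ctxcpu *cpu, struct demo_ctxpmap *pm)
{
	assert(cpu != NULL && pm != NULL);
	assert(cpu->id >= 0 && cpu->id < DEMO_NCPU);
	if (cpu->next_ctx == cpu->ctx_max) {
		cpu->flushes++;			/* old numbers are now reusable */
		cpu->next_ctx = cpu->ctx_min;
	}
	pm->pm_context[cpu->id] = cpu->next_ctx;
	return (cpu->next_ctx++);
}
```

Giving each CPU a disjoint [ctx_min, ctx_max) range is what allows a non-shared pmap to avoid cross-CPU shootdowns, since no other CPU can hold entries tagged with its number.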
#
91337 |
|
26-Feb-2002 |
jake |
Use pcpu.pc_cpumask instead of computing 1 << cpuid.
|
#
91336 |
|
26-Feb-2002 |
jake |
Add a macro for shift of an integer (1 << shift == sizeof). Move the pointer define to live alongside it. For kicks assert at compile time that they are correct. Use these instead of magic numbers.
|
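The shift macro from r91336 (1 << shift == sizeof) is what lets assembly scale an array index with a shift instead of a magic number. A C11 sketch of the same idea, including the compile-time check, follows. DEMO_INT_SHIFT and demo_int_offset() are hypothetical names, and the value 2 assumes a 4-byte int.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_INT_SHIFT	2	/* assumed: sizeof(int) == 4 */

/* Refuse to compile if the shift and the real size ever disagree. */
_Static_assert(((size_t)1 << DEMO_INT_SHIFT) == sizeof(int),
    "DEMO_INT_SHIFT does not match sizeof(int)");

/* Byte offset of element 'index' in an int array, computed by shifting. */
size_t
demo_int_offset(size_t index)
{
	assert(index <= ((size_t)-1 >> DEMO_INT_SHIFT));	/* no overflow */
	return (index << DEMO_INT_SHIFT);
}
```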
#
91288 |
|
26-Feb-2002 |
jake |
Convert pmap.pm_context to an array of contexts indexed by cpuid. This doesn't make sense for SMP right now, but it is a means to an end.
|
#
91257 |
|
25-Feb-2002 |
jake |
Remove code to lock the user tsb into the tlb. We can handle faults on it now, as we do for normal wired kernel memory.
|
#
89043 |
|
08-Jan-2002 |
jake |
Set the normal global pcb register when context switching.
|
#
88645 |
|
29-Dec-2001 |
jake |
Use fprs to track floating register usage. Clear it once we've saved the registers so we don't uselessly save them over and over again for each context switch until another floating point instruction is executed. Use a non-specific tlb slot for the tsb, which needs to have a locked entry. Remove overly verbose traces.
|
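The %fprs tracking in r88645 works like a dirty flag: save the floating point registers only if the flag says they were used, then clear it so later switches skip the save until the registers are touched again. The sketch below models that with a plain struct. The bit value follows the SPARC V9 FPRS layout as far as I know (FEF at bit 2), but treat it, the choice of FEF as the tested bit, and all demo_ names as assumptions rather than the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FPRS_FEF	0x4	/* assumed "FPU enabled" bit of %fprs */

/* Hypothetical model of the register plus a counter for saves performed. */
struct demo_fpstate {
	unsigned int fprs;
	unsigned int saves;
};

/* Called when switching away from a thread. */
void
demo_fp_switch_out(struct demo_fpstate *fp)
{
	assert(fp != NULL);
	if ((fp->fprs & DEMO_FPRS_FEF) == 0)
		return;		/* registers untouched since the last save */
	fp->saves++;		/* stands in for storing the FP registers */
	fp->fprs = 0;		/* clear so the next switch does not save again */
}
```

Two back-to-back switches with no floating point use in between therefore perform one save, not two.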
#
86523 |
|
18-Nov-2001 |
jake |
1. Convert the tstate saved in the pcb to a pstate and test for PSTATE_PEF to determine if a process is using floating point, in order to avoid sign extending a 13 bit immediate.
2. We don't need to context switch cwp anymore; it is better to just fiddle the saved tstate on return from traps. See exception.s 1.10 and 1.12.
3. Completely remove pcb_cwp.
4. Implement vmapbuf, vunmapbuf and vm_fault_quick. Completely remove TODOs from vm_machdep.c (yay!).
Submitted by: tmm (1, 3, 4)
Obtained from: existing archs (4)
|
#
85239 |
|
20-Oct-2001 |
jake |
Use KTR_PROC instead of KTR_CT1 in traces.
|
#
84184 |
|
30-Sep-2001 |
jake |
Fix some traces. td->p_comm doesn't exist.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2
Note: ALL MODULES MUST BE RECOMPILED
Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
82909 |
|
03-Sep-2001 |
jake |
Add ktr traces to copy{in,out} and cpu_switch.
Context switch the cwp value. The register usage in cpu_switch will be updated shortly to better reflect the fact that the current window may change.
|
#
82013 |
|
20-Aug-2001 |
jake |
Save and restore %fprs and %y, which are unused by kernel code, but may be used by 32bit userland code. Implement cpu_throw().
Submitted by: tmm
|
#
81337 |
|
09-Aug-2001 |
obrien |
The author isn't a [UC] Regents. Correct the copyright language.
|
#
81184 |
|
06-Aug-2001 |
jake |
Handle switching mmu contexts and mapping the new primary tsb. Rework some register usage and code placement. Comment.
|
#
81135 |
|
04-Aug-2001 |
tmm |
Add floating point context switching code for sparc64.
Reviewed by: jake
|
#
80709 |
|
31-Jul-2001 |
jake |
Flesh out the sparc64 port considerably. This contains:
- mostly complete kernel pmap support, and tested but currently turned off userland pmap support
- low level assembly language trap, context switching and support code
- fully implemented atomic.h and supporting cpufunc.h
- some support for kernel debugging with ddb
- various header tweaks and filling out of machine dependent structures
|