#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
|
#
292943 |
|
30-Dec-2015 |
marius |
- (Ab)use udivx for dividing the u_int pc_cpuid when implementing CPU_ISSET(), CPU_SET etc. in sparc64 asm. This approach has the benefit of not clobbering %y, which allows reverting r222827 and, partially, r222828.
- In r222828, CATR() was already changed to use the equivalent of PCPU_GET(cpuid) instead of the MD module ID for KTR_CPU, so belatedly also catch up with the C side of ktr(9). Originally, in r203838, CATR() was moved away from directly reading the module ID or equivalent, as that became impractical once CPU types other than USI/II were supported. With r222828 in place, though, per-CPU data generally is set up soon enough that employing PCPU things in ktr(9) also works during early stages.
- Unfortunately, an exception to the latter is the ktr(9) use in pmap_bootstrap(), which runs so early that even checking whether bootverbose was set via the loader doesn't work. Consequently, replace the ktr(9) use in pmap_bootstrap() with OF_printf(9) and put it under #ifdef DIAGNOSTIC instead.
MFC after: 3 days
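The first item depends on splitting a CPU ID into a word index and a bit position within a CPU set. The following is a minimal C sketch of that arithmetic, not the sparc64 asm from the commit; `cpu_mask_t`, `MASK_WORDS`, `MASK_WORD_BITS` and the `mask_*` helpers are hypothetical names, and the kernel's real cpuset(9) layout may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for a CPU set; not the kernel's cpuset(9) types. */
#define MASK_WORDS	4
#define MASK_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)

typedef struct {
	unsigned long bits[MASK_WORDS];
} cpu_mask_t;

/* Index of the word holding the bit for cpuid (the division step). */
static size_t
mask_word(unsigned int cpuid)
{
	assert(cpuid < MASK_WORDS * MASK_WORD_BITS);
	return (cpuid / MASK_WORD_BITS);
}

/* Bit within that word (the remainder step). */
static unsigned long
mask_bit(unsigned int cpuid)
{
	return (1UL << (cpuid % MASK_WORD_BITS));
}

static void
mask_set(cpu_mask_t *m, unsigned int cpuid)
{
	m->bits[mask_word(cpuid)] |= mask_bit(cpuid);
}

static int
mask_isset(const cpu_mask_t *m, unsigned int cpuid)
{
	return ((m->bits[mask_word(cpuid)] & mask_bit(cpuid)) != 0);
}
```

In asm the division has to be an explicit instruction, which is presumably where the choice of udivx over a variant that touches %y matters.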
|
#
285627 |
|
16-Jul-2015 |
zbb |
Fix KSTACK_PAGES issue when the default value was changed in KERNCONF
If KSTACK_PAGES was changed to anything else than the default, the value from param.h was taken instead in some places and the value from KERNCONF in some others. This resulted in an inconsistency which caused corruption in SMP environments.
Ensure that opt_kstack_pages.h is included in all places where KSTACK_PAGES is used.
The file opt_kstack_pages.h could not be included in param.h because doing so broke the toolchain compilation.
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3094
|
#
225899 |
|
01-Oct-2011 |
marius |
Also allocate space for the PIL counters. Given that no machine actually uses IV_MAX interrupt vectors this wasn't a problem in practice though.
|
#
224216 |
|
19-Jul-2011 |
attilio |
On 64-bit architectures size_t is 8 bytes, so it needs 8 bytes of storage. Fix the sintrcnt/sintrnames specification.
No MFC is planned for this patch.
Reported, reviewed and tested by: marcel
Approved by: re (kib)
|
#
224187 |
|
18-Jul-2011 |
attilio |
- Remove the eintrcnt/eintrnames usage and introduce the concept of sintrcnt/sintrnames, which are symbols containing the size of the 2 tables.
- For amd64/i386 remove the storage of intr* stuff from assembly files. This area can be widely improved by applying the same to other architectures and likely finding a unified approach among them and moving the whole code to be MI. More work in this area is expected to happen fairly soon.
No MFC is planned for this patch.
Tested by: pluknet
Reviewed by: jhb
Approved by: re (kib)
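As a rough illustration of the sintrcnt/sintrnames idea, the C sketch below stores the size of each table in a size_t symbol instead of marking the table end. `NVECTORS`, the 16-byte name width and `intr_count_entries()` are assumptions made for the example only.

```c
#include <assert.h>
#include <stddef.h>

#define NVECTORS	8	/* hypothetical table length for this sketch */

unsigned long intrcnt[NVECTORS];
char intrnames[NVECTORS * 16];	/* hypothetical 16 bytes per name */

/* Size symbols take the place of end-of-table markers. */
size_t sintrcnt = sizeof(intrcnt);
size_t sintrnames = sizeof(intrnames);

/* A consumer derives the entry count from the size symbol. */
static size_t
intr_count_entries(void)
{
	assert(sintrcnt % sizeof(intrcnt[0]) == 0);
	return (sintrcnt / sizeof(intrcnt[0]));
}
```

Because the symbols are of type size_t, an assembly file that defines them must reserve as many bytes as size_t occupies on that architecture.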
|
#
223721 |
|
02-Jul-2011 |
marius |
UltraSPARC-IV CPUs seem to be affected by a not publicly documented erratum causing them to trigger stray vector interrupts accompanied by a state in which they even fault on locked TLB entries. Just retrying the instruction in that case gets the CPU back on track though. OpenSolaris also just ignores a certain number of stray vector interrupts. While at it, implement the stray vector interrupt handling for SPARC64-VI which use these for indicating uncorrectable errors in interrupt packets.
|
#
223718 |
|
02-Jul-2011 |
marius |
Using .comm to declare intrnames and eintrnames causes binutils 2.17.50 to merge the two.
|
#
222840 |
|
07-Jun-2011 |
marius |
- For the case when tl1_align(_trap) is used to call rsf_fatal via RSF_FATAL we need to switch to alternate globals for KSTACK_CHECK just like tl1_data_excptn(_trap) does. This is more or less cosmetic because in case RSF_FATAL is called we're already heading south.
- Correct an END().
- Read the window state from the correct register for a CATR().
|
#
222828 |
|
07-Jun-2011 |
marius |
Adapt CATR() to r222813. This is somewhat tricky as we can't afford using more than three temporary registers in several places CATR() is used, so this code trades instructions in for registers. Actually, this still isn't sufficient and CATR() has the side-effect of clobbering %y. Luckily, with the current uses of CATR() this either doesn't matter or we are able to (save and) restore it. Now that there's only one use of AND() and TEST() left, inline these.
|
#
220939 |
|
22-Apr-2011 |
marius |
Correct spelling in comments.
Submitted by: brucec
|
#
217514 |
|
17-Jan-2011 |
marius |
In order to save instructions, the MMU trap handlers assumed that the kernel TSB is located within the 32-bit address space, which held true as long as we were using virtual addresses magic-mapped before the location of the kernel for addressing it. However, with r216803 in place we address it via its physical address instead when possible, which requires 64-bit addressing on machines like the Sun Fire V880 that have no physical memory in the 32-bit address space at all. When using physical addressing it still should be safe to assume that we can just ignore the lowest 10 bits of the address as a minor optimization, as we did before r216803.
|
#
216803 |
|
29-Dec-2010 |
marius |
On UltraSPARC-III+ and greater take advantage of ASI_ATOMIC_QUAD_LDD_PHYS, which takes a physical address instead of a virtual one, for loading TTEs of the kernel TSB so we no longer need to lock the kernel TSB into the dTLB, which only has a very limited number of lockable dTLB slots. The net result is that we now basically can handle a kernel TSB of any size and no longer need to limit the kernel address space based on the number of dTLB slots available for locked entries.
Consequently, other parts of the trap handlers now also only access the kernel TSB via its physical address in order to avoid nested traps, as does the PMAP bootstrap code as we haven't taken over the trap table at that point, yet. Apart from that the kernel TSB now is accessed via a direct mapping when we are otherwise taking advantage of ASI_ATOMIC_QUAD_LDD_PHYS so no further code changes are needed. Most of this is implemented by extending the patching of the TSB addresses and mask as well as the ASIs used to load it into the trap table so the runtime overhead of this change is rather low. Currently the use of ASI_ATOMIC_QUAD_LDD_PHYS is not yet enabled on SPARC64 CPUs due to lack of testing and due to the fact it might require minor adjustments there.
Theoretically it should be possible to use the same approach also for the user TSB, which already is not locked into the dTLB, avoiding nested traps. However, for reasons I don't understand yet OpenSolaris only does that with SPARC64 CPUs. On the other hand I think that also addressing the user TSB physically and thus avoiding nested traps would get us closer to sharing this code with sun4v, which only supports trap level 0 and 1, so eventually we could have a single kernel which runs on both sun4u and sun4v (as do Linux and OpenBSD).
Developed at and committed from: 27C3
|
#
210176 |
|
16-Jul-2010 |
mav |
Allocate the proper amount of memory for interrupt names on sparc64 and sun4v, same as done on other architectures. This removes garbage from `vmstat -ia` output.
Reviewed by: marius@
|
#
205409 |
|
21-Mar-2010 |
marius |
- The firmware of Sun Fire V1280 has a misfeature of setting %wstate to 7, which corresponds to WSTATE_KMIX in OpenSolaris, whenever calling into it. This totally screws us even when restoring %wstate afterwards, as spill/fill traps can happen while in OFW. The rather hackish OpenBSD approach of just setting the equivalent of WSTATE_KERNEL to 7 also is no option as we treat %wstate as a bit field. So in order to deal with this problem, actually implement spill/fill handlers for %wstate 7 which just act as the WSTATE_KERNEL ones except for theoretically also handling 32-bit; turn off interrupts completely so we don't even take IPIs while in OFW, which should ensure we only take spill/fill traps at most; and restore %wstate after calling into OFW once we have taken over the trap table. While at it, actually set WSTATE_{,PROM}_KMIX before calling into OFW just like OpenSolaris does, which should at least help testing this change on non-V1280.
- Remove comments referring to the %wstate usage in BSD/OS.
- Remove the no longer used RSF_ALIGN_RETRY macro.
- Correct some trap table addresses in comments.
- Ensure %wstate is set to WSTATE_KERNEL when taking over the trap table.
- Ensure PSTATE_AM is off when entering or exiting to OFW as well as that interrupts are also completely off when exiting to OFW as the firmware trap table shouldn't be used to handle our interrupts.
|
#
182877 |
|
08-Sep-2008 |
marius |
USIII and beyond CPUs have stricter requirements when it comes to synchronization needed after stores to internal ASIs in order to make side-effects visible. This mainly requires the MEMBAR #Sync after such stores to be replaced with a FLUSH. We use KERNBASE as the address to FLUSH as it is guaranteed to not trap. Actually, the USII synchronization rules also already require a FLUSH in pretty much all of the cases changed. We're also hitting an additional USIII synchronization rule which requires stores to AA_IMMU_SFSR to be immediately followed by a DONE, FLUSH or RETRY. Doing so triggers a RED state exception though so leave the MEMBAR #Sync. Linux apparently also has gotten away with doing the same for quite some time now, apart from the fact that it's not clear to me why we need to clear the valid bit from the SFSR in the first place.
Reviewed by: nwhitehorn
|
#
182774 |
|
04-Sep-2008 |
marius |
When determining whether we trapped while in the PROM don't only check for addresses below the PROM range but also those above.
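The fix amounts to a two-sided range test. Below is a minimal C sketch of such a check; `PROM_LO`, `PROM_HI` and `pc_in_prom()` are hypothetical names and values chosen for illustration, not the constants used by the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds; the real PROM range is platform-defined. */
#define PROM_LO	UINT64_C(0xf0000000)
#define PROM_HI	UINT64_C(0x100000000)	/* exclusive upper bound */

static_assert(PROM_LO < PROM_HI, "PROM range must not be empty");

/* True only if pc lies inside the PROM range: checks both ends. */
static bool
pc_in_prom(uint64_t pc)
{
	return (pc >= PROM_LO && pc < PROM_HI);
}
```

Testing only the lower bound would classify every address above the range as PROM code as well.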
|
#
182743 |
|
03-Sep-2008 |
marius |
Additionally clear the STICK bit in the SOFTINT register when receiving a PIL_TICK interrupt. This change was erroneously omitted in r182730.
|
#
182020 |
|
22-Aug-2008 |
marius |
cosmetic changes and style fixes
|
#
181701 |
|
13-Aug-2008 |
marius |
cosmetic changes and style fixes
|
#
172066 |
|
06-Sep-2007 |
marius |
o Revamp the sparc64 interrupt code in order to be able to interface with the INTR_FILTER-enabled MI code. Basically this consists of registering an interrupt controller (of which there can be multiple and optionally different ones either per host-to-foo bridge or shared amongst host-to-foo bridges in any one machine) along with an interrupt vector as specific argument for all the interrupt vectors used by a given host-to-foo bridge (roughly similar to registering interrupt sources on amd64 and i386), providing functions to enable, clear and disable the interrupts of the children beneath the bridge. This also includes:
  - No longer entering a critical section in tl0_intr() and tl1_intr() for executing interrupt handlers but rather let the handlers enter it themselves so in the case of intr_event_handle() we don't enter a nested critical section.
  - Adding infrastructure for binding delivery of interrupt vectors to specific CPUs which later on can be interfaced with the code from amd64/i386 for binding interrupts to specific CPUs.
  - Getting rid of the wrapper hack introduced along the lines of the API changes for INTR_FILTER which as a side-effect caused interrupts associated with ithread handlers only to get the elevated priority of those associated with filters ("fast handlers") (this removes the hack also in the non-INTR_FILTER case).
  - Disabling (by not clearing) an interrupt in the interrupt controller until all associated handlers have been executed, which is crucial for the typical locking strategy of NIC drivers in order to work correctly in case of shared interrupts. This was a more or less theoretical problem on sparc64 though, as shared interrupts are rather uncommon there except for the on-board SCCs and UARTs.
  Note that due to the behavior of at least some of the interrupt controllers used on sparc64 an enable+EOI instead of a disable+EOI approach (as implied by the INTR_FILTER MI code and implemented on other architectures) is used as the latter can cause lost interrupts or in the worst case interrupt starvation.
o Correct a typo in sbus_alloc_resource() which caused (pass-through) allocations to only work down to the grandchildren of the bus, which wasn't a real problem so far as we don't support any devices which are great-grandchildren or greater of a U2S bridge, yet.
o In fhc(4) use bus_{read,write}_4() instead of bus_space_{read,write}_4() in order to get rid of sc_bh and sc_bt in the fhc_softc. Also get rid of some other unneeded members in fhc_softc.
Reviewed by: marcel (earlier version)
Approved by: re (kensmith)
|
#
166105 |
|
19-Jan-2007 |
marius |
Convert the remainder of the low hanging fruits regarding including headers in .S directly rather than getting to their macros through genassym.c/assym.s so there are fewer headers genassym.c has to be kept in sync with. While at it fix some style(9) bugs (indentation, prototype format, sort headers, etc) and remove trailing whitespace.
|
#
157825 |
|
17-Apr-2006 |
marius |
- Since critical sections no longer raise the processor interrupt level to above what's used for fast interrupts, only interrupts with the level of the interrupt which led to calling intr_fast() (which is used with both fast and ithread interrupts) are blocked while in that function. Thus intr_fast() can be preempted by a fast interrupt (which are of a higher level than ithread interrupts) while servicing an ithread interrupt. This can lead to a stale pointer to the head of the active interrupt requests list when back in the ithread interrupt invocation of intr_fast(), in turn resulting in corruption of the interrupt request lists and consequently in a panic. Solve this by turning off interrupts in intr_fast() before reading the pointer to the head of the active list rather than after. [1]
- Add a KASSERT in intr_fast() which asserts that ir_func is non-zero before calling it. [1]
- Increment interrupt stats after calling the handlers rather than before. This reduces the delay until direct and fast handlers are serviced, in my testing by 30% on average for the direct tick interrupt handler, in turn resulting in less clock drift.
PR: 94778 [1]
Submitted by: Andrew Belashov [1]
MFC after: 2 weeks
|
#
155839 |
|
19-Feb-2006 |
marius |
- Don't bother traversing trap frames in stack_save(). This fixes panics when option DEBUG_LOCKS is used. Trap frames are determined by checking whether the caller was one of the tl0_*() or tl1_*() asm functions via a newly added pair of dummy symbols in exception.S which mark the begin and end of these functions. The tl_trap_* pair marks those in the special .trap section and the tl_text_* in the regular .text section. Because of their performance penalty db_search_symbol()/db_symbol_values() and linker_ddb_search_symbol()/linker_ddb_symbol_values() aren't used here for determining the caller, with db_search_symbol()/db_symbol_values() additionally not being reentrant.
- For consistency, change db_backtrace() to also use the new markers for determining the tl0_*() and tl1_*() asm functions instead of bcmp()'ing the symbol name.
- Use FBSDID in db_trace.c.
PR: 93226
Based on a patch by: Antoine Brodin <antoine.brodin@laposte.net>
Ok'ed by: jhb
|
#
154419 |
|
15-Jan-2006 |
kris |
Correct typos (s/OFERFLOW/OVERFLOW/).
Reviewed by: jhb
|
#
145153 |
|
16-Apr-2005 |
marius |
- MFi386: sys/i386/i386/intr_machdep.c rev. 1.11
  Don't use atomic ops to increment interrupt stats. On sparc64 this reduces the delay until tick interrupts are serviced by 1/10th on average. In turn this reduces the clock drift caused by these delays so there's less drift which has to be compensated in tick_hardclock(). This includes switching from atomically incrementing the global cnt.v_intr to the asm equivalent of PCPU_LAZY_INC(cnt.v_intr) in exception.S.
- Correct some comments to match the registers actually used.
- Correct some format specifiers, interrupt levels passed in are u_int.
- Use FBSDID.
Ok'ed by: jhb
|
#
143073 |
|
02-Mar-2005 |
marius |
Remove the transition aid for the change of the sparc64 default system call vector which was added in rev. 1.52. This change was done way before sparc64 switched to a 64-bit time_t so all binaries are expected to have been recompiled by now.
|
#
117658 |
|
15-Jul-2003 |
jmg |
Add support for interrupt counting on sparc64. This copies part of the code from i386. The code has a slight bogon in that interrupts are counted twice: once on the ithread dispatch and once on the dispatch for the vector.
vmstat -i and systat -vm now contain interrupt counts.
Reviewed by: jake
|
#
116589 |
|
19-Jun-2003 |
jake |
Avoid using v8 opcodes; use ba instead of b for unconditional branches.
|
#
114257 |
|
29-Apr-2003 |
jake |
Allow fast instruction and data access mmu miss traps to be handled by user trap handlers.
|
#
114188 |
|
28-Apr-2003 |
jake |
- Fix placement of cvs ids in previous commit to match .S files in libc.
- gcc uses 32 byte alignment for functions regardless of profiling, so follow suit.
|
#
114085 |
|
26-Apr-2003 |
obrien |
I was wrong, the ENTRY bits in asm.h did have a purpose -- for userland. Restore the bits and remove them from asmacros.h. *.S will now be asm.h consumers.
Approved by: jake
|
#
113024 |
|
03-Apr-2003 |
jake |
Add support for saving and restoring kernel floating point state. The state will be saved if we context switch as a result of an interrupt which occurred while using the floating point registers in the kernel (which actually can't happen right now). This allows fp disabled traps in the kernel, which normally shouldn't happen, so make sure the trapping code is what we expect it is.
|
#
112924 |
|
01-Apr-2003 |
jake |
- Add a flags field to struct pcb. Use this to keep track of whether or not the pcb has floating point registers saved in it.
- Implement get_mcontext and set_mcontext.
|
#
112920 |
|
01-Apr-2003 |
jake |
- Rename pcb_fpstate to pcb_ufp (user floating point), and change it to a simple array of 64 ints.
- Use a critical section when saving floating point state in cpu_fork instead of sched_lock.
|
#
111032 |
|
17-Feb-2003 |
julian |
Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case. I should have listened to the other mind.
Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@
|
#
109860 |
|
26-Jan-2003 |
jake |
Merge some code paths back together so that we only instantiate 1 copy of the user tlb fault handlers.
|
#
109810 |
|
24-Jan-2003 |
jake |
Moved some (gas) macros up so they can be used in more places.
|
#
108379 |
|
28-Dec-2002 |
jake |
Use the meaningful mnemonics for ancillary state registers now that gas is invoked properly to understand them.
%asr19 -> %gsr
%asr20 -> %set_softint
%asr21 -> %clear_softint
|
#
108377 |
|
28-Dec-2002 |
jake |
- Moved storing %g1-%g5 in the trapframe until after interrupts are enabled.
- Restore %g6 and %g7 for kernel traps if we are returning to prom code. This allows complex traps (ones that call into C code) to be handled from the prom.
|
#
108374 |
|
28-Dec-2002 |
jake |
Pass 0 in %o1 to tl0_trap for all non-interrupt traps. This will be used to pass the pil when tl0_trap also handles interrupts.
|
#
108245 |
|
23-Dec-2002 |
jake |
- Change the way the direct mapped region is implemented to be generally useful for accessing more than 1 page of contiguous physical memory, and to use 4mb tlb entries instead of 8k. This requires that the system only use the direct mapped addresses when they have the same virtual colour as all other mappings of the same page, instead of being able to choose the colour and cachability of the mapping.
- Adapt the physical page copying and zeroing functions to account for not being able to choose the colour or cachability of the direct mapped address. This adds a lot more cases to handle. Basically when a page has a different colour than its direct mapped address we have a choice between bypassing the data cache and using physical addresses directly, which requires a cache flush, or mapping it at the right colour, which requires a tlb flush. For now we choose to map the page and do the tlb flush.
This will allow the direct mapped addresses to be used for more things that don't require normal pmap handling, including mapping the vm_page structures, the message buffer, temporary mappings for crash dumps, and will provide greater benefit for implementing uma_small_alloc, due to the much greater tlb coverage.
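The colour constraint described above can be sketched in C as follows. The page shift of 13 matches the 8k pages mentioned in the log, but the number of colours and the assumption that a direct mapped address inherits the colour bits of its physical address are illustrative guesses; `colour_of()` and `direct_map_usable()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	13	/* 8k pages */
#define NCOLOURS	2	/* hypothetical number of cache colours */

static_assert((NCOLOURS & (NCOLOURS - 1)) == 0,
    "colour count must be a power of two");

/* Colour of an address: the page-number bits that index the cache. */
static unsigned int
colour_of(uint64_t addr)
{
	return ((unsigned int)((addr >> PAGE_SHIFT) & (NCOLOURS - 1)));
}

/*
 * Assuming the direct mapped address keeps the colour bits of the
 * physical address, it may only be used when that colour matches
 * the colour of the page's other virtual mappings.
 */
static bool
direct_map_usable(uint64_t pa, uint64_t va)
{
	return (colour_of(pa) == colour_of(va));
}
```

When the check fails, the commit's choice is to map the page at the right colour and pay for a tlb flush.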
|
#
108195 |
|
23-Dec-2002 |
jake |
- Fix a bug where the faulting address for an mmu miss could sometimes be clobbered due to some debug code. This was harmless and only caused superfluous soft faults.
- Update some comments.
|
#
106050 |
|
27-Oct-2002 |
jake |
Don peril sensitive sun glasses and change the default system call vector for sparc64 from trap #9 to trap #65. This is one of the ABI "blessed" system call vectors and is different from any other system that we might want to emulate, making the emulation easier by reducing the number of code paths that need to be shared. Compatibility with old applications is provided with COMPAT_FREEBSD4. Add defines for a few special traps that we may need to implement for compatibility with 32bit applications, and add comments on which vectors are used for what in other systems, and which are available. Pass magic flags to trap() for deprecated or unimplemented system call vectors so they will deliver SIGSYS instead of SIGILL.
This piggy backs nicely with the recent sigaction(2) system call number change, and provided the rules are followed for upgrading past it, this change should not be noticed.
|
#
105995 |
|
26-Oct-2002 |
jake |
Remove an unused macro.
|
#
105733 |
|
22-Oct-2002 |
jake |
- Expand struct trapframe to 256 bytes, make all fields fixed width and the same size. Add some fields that previously overlapped with something else or were missing.
- Make struct regs and struct mcontext (minus floating point) the same as struct trapframe so converting between them is easy (null).
- Add space for saving floating point state to struct mcontext. This requires that it be 64 byte aligned.
- Add assertions that none of these structures change size, as they are part of the ABI.
- Remove some dead code in sendsig().
- Save and restore %gsr in struct trapframe. Remember to restore %fsr.
- Add some comments to exception.S.
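Size assertions of the kind mentioned above can be expressed at compile time. The C11 sketch below uses hypothetical structures, `trapframe_sketch` and `mcontext_sketch`, whose field names and grouping are invented for illustration; only the 256-byte size and the 64-byte alignment requirement come from the log entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 32 fixed-width 64-bit slots, 32 * 8 = 256 bytes. */
struct trapframe_sketch {
	uint64_t tf_global[8];
	uint64_t tf_out[8];
	uint64_t tf_special[16];
};

/* Hypothetical context: the frame plus 64-byte aligned fp save space. */
struct mcontext_sketch {
	struct trapframe_sketch mc_tf;
	_Alignas(64) uint32_t mc_fp[64];
};

/* Compilation fails if either ABI-visible property changes. */
static_assert(sizeof(struct trapframe_sketch) == 256,
    "trapframe size is part of the ABI");
static_assert(_Alignof(struct mcontext_sketch) == 64,
    "mcontext must be 64 byte aligned");
```

A compile-time check catches an accidental field addition before any binary depending on the layout is built.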
|
#
105012 |
|
12-Oct-2002 |
jake |
Removed unused tl0_syscall.
|
#
104075 |
|
28-Sep-2002 |
jake |
Renamed intr_enqueue to intr_vector and intr_dequeue to intr_fast, to better reflect how they are called.
|
#
104074 |
|
27-Sep-2002 |
jake |
Moved most interrupt related code to a new file, interrupt.S.
|
#
103922 |
|
24-Sep-2002 |
jake |
Removed debug code.
|
#
103921 |
|
24-Sep-2002 |
jake |
Pass the function to call (trap or syscall) to tl0_trap and tl1_trap in %o2.
|
#
103919 |
|
24-Sep-2002 |
jake |
Rearrange tl1_trap slightly, also save and restore the out registers so that instruction emulation is possible in kernel mode.
|
#
103916 |
|
24-Sep-2002 |
jake |
Allocate stack space for the trapframe along with the normal register frame in the save instruction, rather than doing a separate sub.
|
#
103897 |
|
24-Sep-2002 |
jake |
Split user trap processing out into a separate routine so that traps which never result in user traps don't have to plow through it.
|
#
103784 |
|
22-Sep-2002 |
jake |
Call trap directly for exceptional cases that need more processing on return to usermode, rather than branching back to a label before the original call.
|
#
102040 |
|
18-Aug-2002 |
jake |
Add pmap support for user mappings of multiple page sizes (super pages). This supports all hardware page sizes (8K, 64K, 512K, 4MB), but only 8k pages are actually used as of yet.
|
#
101899 |
|
15-Aug-2002 |
jake |
Fix some confusion regarding traps that use mmu globals but don't really have any reason to; force alternate globals instead, which is what we want.
|
#
101653 |
|
10-Aug-2002 |
jake |
Auto size available kernel virtual address space based on physical memory size. This avoids blowing out kva in kmeminit() on large memory machines (4 gigs or more).
Reviewed by: tmm
|
#
100771 |
|
27-Jul-2002 |
jake |
Implement a direct mapped address region, like alpha and ia64. This basically maps all of physical memory 1:1 to a range of virtual addresses outside of normal kva. The advantage of doing this instead of accessing physical addresses directly is that memory accesses will go through the data cache, and will participate in the normal cache coherency algorithm for invalidating lines in our own and in other cpus' data caches. So we don't have to flush the cache manually or send IPIs to do so on other cpus. Also, since the mappings never change, we don't have to flush them from the tlb manually. This makes pmap_copy_page and pmap_zero_page MP safe, allowing the idle zero proc to run outside of giant.
Inspired by: ia64
|
#
97265 |
|
25-May-2002 |
jake |
Convert the interrupt queue from an array to a linked list. Implement intr_dequeue in asm so that it can easily be modified to do light weight context switching.
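A linked-list request queue of this general shape can be sketched in C as below. This is not the kernel's implementation (the dequeue side was written in asm); the structures, the free list and the LIFO ordering are assumptions made to keep the example short, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct intr_request {
	struct intr_request *ir_next;
	unsigned int ir_vec;
};

struct intr_queue {
	struct intr_request *iq_head;	/* active requests */
	struct intr_request *iq_free;	/* unused requests */
};

/* Put every request of a caller-supplied pool on the free list. */
static void
iq_init(struct intr_queue *q, struct intr_request *pool, size_t n)
{
	q->iq_head = NULL;
	q->iq_free = NULL;
	for (size_t i = 0; i < n; i++) {
		pool[i].ir_next = q->iq_free;
		q->iq_free = &pool[i];
	}
}

/* Move one request from the free list to the active list. */
static int
iq_enqueue(struct intr_queue *q, unsigned int vec)
{
	struct intr_request *ir = q->iq_free;

	if (ir == NULL)
		return (-1);		/* no free request left */
	q->iq_free = ir->ir_next;
	ir->ir_vec = vec;
	ir->ir_next = q->iq_head;
	q->iq_head = ir;
	return (0);
}

/* Pop the most recent active request and recycle it. */
static int
iq_dequeue(struct intr_queue *q, unsigned int *vec)
{
	struct intr_request *ir = q->iq_head;

	if (ir == NULL)
		return (-1);		/* nothing pending */
	q->iq_head = ir->ir_next;
	*vec = ir->ir_vec;
	ir->ir_next = q->iq_free;
	q->iq_free = ir;
	return (0);
}
```

Compared with a fixed array, unlinking a request only touches a couple of pointers, which is the kind of operation that is easy to hand-code in asm.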
|
#
97263 |
|
24-May-2002 |
jake |
Try to handle "double faults" occurring at more trap levels (ie 4 :)).
|
#
96207 |
|
08-May-2002 |
jake |
Make a macro for the guts of tl0_immu_miss, like dmmu_miss and prot. Rearrange things slightly so that the contents of the tag access register are read and restored outside of the macros. The intention is to pass the page size to look up as an argument to the macros.
|
#
93389 |
|
29-Mar-2002 |
jake |
Remove abuse of intr_disable/restore in MI code by moving the loop in ast() back into the calling MD code. The MD code must ensure no races between checking the astpending flag and returning to usermode.
Submitted by: peter (ia64 bits)
Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64
|
#
93001 |
|
23-Mar-2002 |
jake |
Backout intrusive ktr traces in tlb fault handlers which have served their purpose.
|
#
92201 |
|
13-Mar-2002 |
jake |
Fix a bug where the wrong number of windows were copied for a failed fill on return to user mode. We may not have frame pointers setup for more than 1 on return from exec.
|
#
92200 |
|
13-Mar-2002 |
jake |
White space.
|
#
91531 |
|
01-Mar-2002 |
jake |
Use a better trace class for ktr traces in the tlb fault handlers, which are rather loud.
|
#
91316 |
|
26-Feb-2002 |
jake |
Apparently gcc3.1 is now using deprecated v8 instructions in v9 code due to them being faster in certain cases. Therefore we need to save and restore the v8 %y register around traps in kernel mode as well as traps in usermode.
Tested by: obrien, tmm
|
#
91246 |
|
25-Feb-2002 |
jake |
Implement a nested window state. This avoids attempting to spill a user window to the user stack while in a nested kernel trap. We do this for entry to the kernel from user mode, but if we get an interrupt in kernel mode while there are still user windows in the cpu, and we attempt to spill to the user stack, we may take too many nested traps and overflow the trap stack, causing a red state exception. This is needed by upcoming changes to allow the user tsb to not be locked in the tlb.
Reviewed by: tmm
|
#
91224 |
|
25-Feb-2002 |
jake |
Modify the tte format to not include the tlb context number and to store the virtual page number in a much more convenient way; all in one piece. This greatly simplifies the comparison for a matching tte, and allows the fault handlers to be much simpler due to not having to load weird masks. Rewrite the tlb fault handlers to account for the new format. These are also written to allow faults on the user tsb inside of the fault handlers; the kernel fault handler must be aware of this and not clobber the other's registers. The faults do not yet occur due to other support that is needed (and still under my desk).
Bug fixes from: tmm
|
#
91158 |
|
23-Feb-2002 |
jake |
1. Setup the user stack pointer before returning to a user trap handler. If we don't do this here there's a 1 instruction race where an interrupt could come in and crash the user process due to having no stack.
2. Pass %fsr to the user trap handler in %l4. Since %fsr can only be loaded from or stored to memory, we need to do some contortions and temporarily save it to the alternate global stack.
3. Reload the pcb and pcpu registers for traps in kernel mode, for sanity.
Submitted by: tmm (1, 2)
|
#
89050 |
|
08-Jan-2002 |
jake |
Setup the normal global pcb register as well on entry from user land. Call critical_enter/critical_exit around (fast) interrupt handlers. All non-threaded interrupts are fast, and the threaded interrupt scheduler is itself a fast interrupt. Assert that an interrupt handler we are about to call is non-zero. Be paranoid about restoring the users global registers. Do it as the last thing before switching to alternate globals (when we magically get our preloaded registers back), and do it with interrupts disabled. Any kind of kernel trap when the globals are not setup properly is bad news. Don't save and restore the kernel g6, it invariably points to the current pcb now.
|
#
89049 |
|
08-Jan-2002 |
jake |
Adapt the vectored interrupt handler for receiving ipis. If the second data word in an interrupt packet is non-zero, it points to code to execute to handle the ipi, so jump to it instead of enqueueing the packet. It is unclear if we will need queued ipis. Interrupt g7 now points to pcpu, instead of to the per-cpu interrupt queue itself, so use that instead. Interrupt g6 is no longer reserved.
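The dispatch rule described above (a non-zero second data word selects a direct handler, otherwise the packet is queued) can be sketched in C. In the real interrupt packet the second word is a raw code address handled in asm; this sketch substitutes a typed function pointer, and `struct intr_packet`, `dispatch_packet()`, the counters and the idea of passing the first word as an argument are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*ipi_handler_t)(uint64_t arg);

/* Hypothetical view of an interrupt packet's first two data words. */
struct intr_packet {
	uint64_t ip_data0;		/* vector number or handler argument */
	ipi_handler_t ip_handler;	/* NULL means: ordinary interrupt */
};

static int enqueued;		/* packets that took the queue path */
static uint64_t last_ipi_arg;	/* argument seen by the sample handler */

static void
enqueue_vector(uint64_t vec)
{
	(void)vec;
	enqueued++;
}

static void
sample_ipi(uint64_t arg)
{
	last_ipi_arg = arg;
}

/* Run the handler directly for ipis, queue everything else. */
static void
dispatch_packet(const struct intr_packet *p)
{
	assert(p != NULL);
	if (p->ip_handler != NULL)
		p->ip_handler(p->ip_data0);
	else
		enqueue_vector(p->ip_data0);
}
```

Jumping straight to the handler skips the queue entirely, which is why the entry notes that queued ipis may not be needed.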
|
#
89048 |
|
08-Jan-2002 |
jake |
Use the per-cpu panic stack in the case of a fault with a bad kernel stack.
|
#
89047 |
|
08-Jan-2002 |
jake |
Remove ATOMIC_INC_INT macro which has moved elsewhere.
|
#
88785 |
|
01-Jan-2002 |
jake |
Add some more info to traces. Fix a potential race in setting up the per-cpu pointer if the special restore on return to user mode fails and we need to trap back into the kernel to fault in more stack. Remove debug code.
|
#
88784 |
|
01-Jan-2002 |
jake |
Ensure that the syscall trap vector is properly aligned.
|
#
88782 |
|
01-Jan-2002 |
jake |
Implement user trap delivery as specified by the sparc abi. This provides an efficient way for the kernel to bounce certain mundane traps back to userland for handling there. A user trap handler returns directly to the trapping user code, rather than going through the kernel again. Only a handful of instructions are actually executed in kernel mode. Implement sysarch(SPARC_UTRAP_INSTALL). Add code to handle sharing of the user trap table across forks and unsharing at exec.
This can be used to implement efficient tracking of floating point register usage in userland, e.g. by a thread library, and to handle alignment fault fixups and instruction emulation in userland, for which the code may need to be different for 32bit and 64bit binaries.
|
#
88781 |
|
01-Jan-2002 |
jake |
Add a panic stack, which is used as a known good stack when there is something wrong with the kernel stack. Add code to check the kernel stack pointer in various important places and try hard not to go down in flames if it's wrong.
|
#
88780 |
|
01-Jan-2002 |
jake |
Add a soft trap for restoring the fpu registers from the pcb.
|
#
88779 |
|
01-Jan-2002 |
jake |
Fix long lines in the trap table due to the abi specified trap types having overly long names.
|
#
88644 |
|
29-Dec-2001 |
jake |
Add .register directives for gcc3. Add macros to atomically increment an integer variable in the data section and to atomically set a bit in a tte. Note that the latter does not return the new value. Rewrite RESUME_SPILLFILL_MAGIC to use more sensical calculations, and to preserve all alternate globals religiously. Must now be called on alternate globals. Defer switching to the kernel stack until inside the syscall, trap, interrupt wrappers. Splitting the windows is all that's really urgent. Adapt to new trap types. Add %xcc where appropriate in order to not use v8 opcodes inadvertently (which work fine). Modify the low level tlb fault handlers to operate on a tsb made up of ttes, not sttes. This effectively makes the tsb twice as large. After atomically updating tte bits in memory, also set the bit in the register that holds the data which will be loaded into the tlb. The macro returns the old value. Use the preloaded mmu global which holds the address of the current user tsb. Add back a low level protection fault handler instead of just punting into the vm system. This effectively saves a soft fault per COW fault. Add a trace to intr_enqueue. Pass arguments to the trap, interrupt, syscall wrappers in the out registers instead of some on the stack, some in registers. Use the preloaded alternate global pcb register.
|
#
87702 |
|
11-Dec-2001 |
jhb |
Overhaul the per-CPU support a bit:
- The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD fields.
- The globaldata_register() function was renamed to pcpu_init() and now initializes the MI fields of a struct pcpu in addition to registering it with the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list.
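The reason a field-list macro gives the flatter PCPU_GET(my_md_field) spelling can be sketched as follows. This is a minimal sketch, not the kernel headers: all names carry a _SKETCH suffix or are otherwise hypothetical, and a single static structure stands in for locating the current CPU's data.

```c
/*
 * Hypothetical MD field list, in the spirit of the macro that
 * machine/pcpu.h is described as providing.
 */
#define PCPU_MD_FIELDS_SKETCH						\
	int	pc_my_md_field

struct pcpu_sketch {
	int	pc_cpuid;		/* MI field */
	PCPU_MD_FIELDS_SKETCH;		/* MD fields expand inline here */
};

/* Stand-in for "this CPU's pcpu"; a real kernel finds it per CPU. */
static struct pcpu_sketch cpu0_pcpu_sketch;

/* MI and MD fields are reached the same way, with no "md." prefix. */
#define PCPU_GET_SKETCH(member)		(cpu0_pcpu_sketch.pc_ ## member)
#define PCPU_SET_SKETCH(member, val)	(cpu0_pcpu_sketch.pc_ ## member = (val))
```

Had the MD data lived in a nested struct mdpcpu, every MD access would need the longer md.md_ prefix shown in the commit message.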
Tested on: alpha, i386 Reviewed by: peter, jake
|
#
86519 |
|
18-Nov-2001 |
jake |
1. Fix a bug where the offsets of the alignment and mmu fault recovery code in the window trap vectors were mixed up. All this did is cause unnecessary traps and look weird in traces. Superfluous traps happen a lot in normal operation, so we are rather good at recovering from them.
2. Store the arguments for a ktr trace in the right place.
3. Use a generic trap vector for breakpoints. It should not be special.
4. Save the frame pointer in the trap frame for kernel traps if DDB is compiled in, otherwise we don't save the out registers for kernel traps and stack traces can't go through nested traps.
5. Apply the same fix to the return from kernel mode trap code as for user mode traps. Ensure that the window we're returning to is the same one that we restore to by fiddling the cwp in the saved tstate. This requires that we transfer the values loaded from the trap frame into alternate globals before restore-ing, but doing so is not very expensive and not worth worrying about. Not changing the saved cwp can result in the register values magically changing on return from traps if we happen to have slept and the windows don't work out exactly the same. Fix the trace just before the retry to account for different register usage.
6. Use a SET macro for loading address constants rather than a variation of set and setx. set only works for 32 bit constants, while setx works for 64 bit constants as well, but produces bloated code when unnecessary. Gas always generates the canonical 2 register, 6 instruction form, even when it could be optimized; set uses 1 register and 2 instructions. At the moment we assume that the kernel binary is below 4GB so set is always sufficient, but the macro allows it to be configured. Note that this has nothing to do with 32 vs. 64 bit address space, it only applies to addresses of symbols which are known at compile/link time.
Submitted by: tmm (6)
|
#
85585 |
|
27-Oct-2001 |
jake |
Handle instruction access mmu miss faults in kernel mode. These can only be generated by non-preloaded klds.
|
#
85243 |
|
20-Oct-2001 |
jake |
Fix a bug in the kernel entry window handling where the wrong register was used. This resulted in bogus bad window traps (invalid wstate).
Add a trace to sfsr traps (alignment among other things).
Use KTR_TRAP instead of KTR_CT1.
Use the right registers when storing the values of various mmu registers into the trap frame. This fixes a bug where sometimes the context number reported by a fault would be garbage. Sometimes it would be zero for faults on user address space so the kernel would wrongly think that it was a fault on kernel address space and fail.
Use the preloaded registers in the vectored interrupt trap instead of reading pointers from memory. Remove traces due to register pressure and excess verbosity. We can probably still sneak in one trace. Remove some debug code.
Go back to using the tsb register during kernel page table lookups. This is the best way to not have to have the address of the kernel tsb be a compile time constant. We lie and say we have a 1 page tsb when really it's much larger. This way the hardware provides bits 13-22 of the virtual address (the lower 9 bits of the virtual page number) in the form of the address of the tte corresponding to the fault address in the (1 page) kernel tsb. With some clever arithmetic we can then get bits 22 and up from the tte tag and add them to the tte address in order to index massive tsbs (basically unlimited).
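The arithmetic for extending the hardware-computed pointer can be sketched in C. This assumes 8K pages and 16-byte ttes, so a one-page tsb holds 512 entries (the low 9 bits of the virtual page number); the function names are hypothetical, and the real handlers do this in assembly using the tag rather than the raw virtual address.

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH	13	/* assumed 8K pages */
#define TTE_SHIFT_SKETCH	4	/* assumed 16-byte tte */
#define HW_TSB_BITS_SKETCH	9	/* 8192 / 16 = 512 entries per page */

/* Offset the hardware would compute when told the tsb is one page. */
static inline uint64_t
hw_tte_offset_sketch(uint64_t va)
{
	uint64_t vpn = va >> PAGE_SHIFT_SKETCH;

	return ((vpn & ((UINT64_C(1) << HW_TSB_BITS_SKETCH) - 1)) <<
	    TTE_SHIFT_SKETCH);
}

/*
 * Extend that offset to a tsb of 2^total_bits entries by adding in the
 * virtual page number bits above the low 9 (total_bits >= 9 assumed).
 */
static inline uint64_t
big_tte_offset_sketch(uint64_t va, unsigned total_bits)
{
	uint64_t upper = (va >> (PAGE_SHIFT_SKETCH + HW_TSB_BITS_SKETCH)) &
	    ((UINT64_C(1) << (total_bits - HW_TSB_BITS_SKETCH)) - 1);

	return (hw_tte_offset_sketch(va) +
	    (upper << (HW_TSB_BITS_SKETCH + TTE_SHIFT_SKETCH)));
}

/* Reference: index the large tsb directly by virtual page number. */
static inline uint64_t
direct_tte_offset_sketch(uint64_t va, unsigned total_bits)
{
	uint64_t vpn = va >> PAGE_SHIFT_SKETCH;

	return ((vpn & ((UINT64_C(1) << total_bits) - 1)) << TTE_SHIFT_SKETCH);
}
```

The two-step form and the direct form give the same offset, which is why the hardware's one-page answer is still useful for a much larger table.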
Add traps for physical address hardware watchpoints.
Don't try to pass the window state from the trap table entry point all the way down to the common trap code. It's too easy to clobber and reading it again doesn't cost much.
Fix up some traces.
Fiddle the cwp bits on return from the kernel to user mode so that the window we are returning to is always the same as the one we restore to in the trap code. Strictly speaking this is not necessary, it only affects return from fork and exec, but setting up the windows right would require hard coding the right cwp values in cpu_fork and setregs, basically hard coding the number of frames between syscall and tl0_ret. The result of getting it wrong is usually a spill to an invalid stack pointer; either 0 or pointing into kernel space. This should also alleviate the need to context switch the cwp.
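"Fiddling the cwp bits" amounts to replacing one field of the saved trap state. A minimal sketch, assuming the SPARC V9 layout in which CWP occupies the low 5 bits of TSTATE; the helper name is hypothetical, and choosing which window number to pass in is the part the real return path does in assembly.

```c
#include <stdint.h>

/* CWP field of TSTATE: low 5 bits (SPARC V9 layout, assumed here). */
#define TSTATE_CWP_MASK_SKETCH	UINT64_C(0x1f)

/*
 * Return a copy of the saved tstate whose cwp field is replaced by the
 * window the return path will actually restore to.  All other fields
 * are left untouched.
 */
static inline uint64_t
tstate_with_cwp_sketch(uint64_t tstate, unsigned cwp)
{
	return ((tstate & ~TSTATE_CWP_MASK_SKETCH) |
	    ((uint64_t)cwp & TSTATE_CWP_MASK_SKETCH));
}
```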
Transfer the trap state from locals to alternate globals in the trap return code so that we can do a restore and rotate the windows before reloading the trap registers. If the restore fails we'll trap back into the kernel, so there's no point in loading the trap registers beforehand. It is crucial that the window trap recovery code not clobber the alternate globals.
|
#
84186 |
|
30-Sep-2001 |
jake |
Split the low level trap code into trap, interrupt and syscall; it's easier, and hopefully this code is done changing radically.
Don't use the mmu tlb register to address the kernel page table, nor the 8k pointer register. The hardware will do some of the page table lookup by storing the base address in an internal register and calculating the address of the tte in the table. However it is limited to a 1 meg tsb, which only maps 512 megs. The kernel page table only has one level, so it's easy to just do it by hand, which has the advantage of supporting arbitrary amounts of kvm and only costs a few more instructions.
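The "1 meg tsb maps 512 megs" figure follows from simple arithmetic, sketched below. The 8K page size and 16-byte entry size are assumptions chosen because they reproduce the numbers quoted above; the names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE_SKETCH	UINT64_C(8192)	/* assumed 8K pages */
#define TTE_SIZE_SKETCH		UINT64_C(16)	/* assumed entry size */

/*
 * Amount of virtual memory a single-level tsb of the given size can
 * map: one page per entry.
 */
static inline uint64_t
tsb_mapped_bytes_sketch(uint64_t tsb_bytes)
{
	return ((tsb_bytes / TTE_SIZE_SKETCH) * PAGE_SIZE_SKETCH);
}
```

With these assumptions a 1 MB table holds 65536 entries and covers 512 MB, so mapping more kvm needs a larger table than the hardware pointer logic supports.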
Increase kvm to 1 gig now that it's easy to do so and so we don't waste most of a 4 meg page.
Fix some traces. Fix more proc locking.
Call tsb_stte_promote if we get a soft fault on a mapping in the upper levels of the tsb. If there is an invalid or unreferenced mapping in the primary tsb, it will be replaced.
Immediately fail for faults occurring in {f,s}uswintr.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2
Note: ALL MODULES MUST BE RECOMPILED
Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry John! (your next MFC will be a doozy!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
82906 |
|
03-Sep-2001 |
jake |
Implement a slightly different window spill/fill algorithm for dealing with user windows in kernel mode. We split the windows using %otherwin, but instead of spilling user window directly to the pcb, we attempt to spill to user space. If this fails because a stack page is not resident (or the stack is smashed), the fault handler at tl 2 will detect the situation and resume at tl 1 again where recovery code can spill to the pcb. Any windows that have been saved to the pcb will be copied out to the user stack on return from kernel mode.
Add a first stab at 32 bit window handling. This uses much of the same recovery code as above because the alignment of the stack pointer is used to detect 32 bit code. Attempting to spill a 32 bit window to a 64 bit stack, or vice versa, will cause an alignment fault. The recovery code then changes the window state to vector to a 32 bit spill/fill handler and retries the faulting instruction.
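Why a mismatched spill takes an alignment fault can be sketched in C. This assumes the SPARC V9 ABI convention that a 64 bit stack pointer is biased by 2047 (and is therefore odd), while a 32 bit stack pointer is unbiased and aligned; the helper names are hypothetical and the real detection is done by the trap hardware, not by explicit tests like these.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_BIAS_SKETCH	2047	/* V9 64 bit ABI stack bias, assumed */

/* A 64 bit spill stores 8-byte registers at %sp + bias. */
static inline bool
spill64_would_fault_sketch(uint64_t sp)
{
	return (((sp + STACK_BIAS_SKETCH) & 7) != 0);
}

/* A 32 bit spill stores 4-byte registers at %sp directly. */
static inline bool
spill32_would_fault_sketch(uint64_t sp)
{
	return ((sp & 3) != 0);
}

/* The low bit alone distinguishes the two stack conventions. */
static inline bool
sp_looks_64bit_sketch(uint64_t sp)
{
	return ((sp & 1) != 0);
}
```

For example, a biased pointer such as 0x7ffef801 passes the 64 bit check but fails the 32 bit one, and an aligned pointer such as 0xeffff000 does the reverse, which is what lets the recovery code pick the other handler and retry.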
Add ktr traces in useful places during trap processing.
Adjust comments to reflect new code and add many more.
|
#
82005 |
|
20-Aug-2001 |
jake |
Add support for splitting the register windows on entry to the kernel from usermode. The remaining user windows are spilled to the pcb as necessary. The user land window fault handlers fill directly from the pcb on return. Add system call entry points.
Submitted by: tmm
|
#
81380 |
|
10-Aug-2001 |
jake |
1. Add code to handle traps and interrupts from user mode.
2. Add spill and fill handlers for spills to the user stack on entry to the kernel.
3. Add code to handle instruction mmu misses from user mode.
4. Add code to handle level interrupts from kernel mode and vectored interrupt traps from either.
5. Save the pil in the trapframe on entry from kernel mode and restore it on return.
Submitted by: tmm (1, 2)
|
#
81337 |
|
09-Aug-2001 |
obrien |
The author isn't a [UC] Regents. Correct the copyright language.
|
#
81180 |
|
06-Aug-2001 |
jake |
Add trap handlers for dmmu faults from user mode, and for faults from accessing user address space in kernel mode.
|
#
81135 |
|
04-Aug-2001 |
tmm |
Add floating point context switching code for sparc64.
Reviewed by: jake
|
#
80709 |
|
31-Jul-2001 |
jake |
Flesh out the sparc64 port considerably. This contains:
- mostly complete kernel pmap support, and tested but currently turned off userland pmap support
- low level assembly language trap, context switching and support code
- fully implemented atomic.h and supporting cpufunc.h
- some support for kernel debugging with ddb
- various header tweaks and filling out of machine dependent structures
|