#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
223700 |
|
30-Jun-2011 |
marcel |
Change the management of nested faults by switching to physical addressing while reading or writing the trap frame. It's not possible to guarantee that the one translation cache entry that we depend on is not going to get purged by the CPU. We already know that global shootdowns (ptc.g and/or ptc.ga) can (and will) cause multiple TC entries to get purged and we initialize tried to handle that by serializing kernel entry with these operations. However, we need to serialize kernel exit as well.
But even if we can serialize, it appears that CPU threads within a core can affect each other's TC entries beyond the global shootdown. This would mean serializing any and all translatation cache updates with the threads in a core with the kernel entry and exit of any thread in that core. This is just too painful and complicated.
Since we already properly coded for the 2 nested faults that we can get, all we need to do is use those to obtain the physical address of the trap frame, switch to physical mode and in that way eliminate any further faults. The trap frame is already aligned to 1KB boundaries to make sure we don't cross the page boundary, this is safe to do.
We still need to serialize ptc.g or ptc.ga across CPUs because the platform can only have 1 such operation outstanding at the same time. We can now use a regular (spin) lock for this.
Also, it has been observed that we can get a nested TLB faults for region 7 virtual addresses. This was unexpected. For now, we enhance the nested TLB fault handler to deal with those as well, but it needs to be understood.
|
#
221893 |
|
14-May-2011 |
marcel |
Sharpening the saw: o Clobber the register that holds the restart token immediately after crossing the restart point. This prevents false positives (i.e. a nested exception that we don't know can happen and that is being treated as one we know by virtue of a lingering restart token). o Now that the bootstrap kernel stack is free, switch onto it and call trap() for nested traps that we don't know about. In trap we panic() so that we can analyze the condition.
|
#
221271 |
|
30-Apr-2011 |
marcel |
Stop linking against a direct-mapped virtual address and instead use the PBVM. This eliminates the implied hardcoding of the physical address at which the kernel needs to be loaded. Using the PBVM makes it possible to load the kernel irrespective of the physical memory organization and allows us to replicate kernel text on NUMA machines.
While here, reduce the direct-mapped page size to the kernel's page size so that we can support memory attributes better.
|
#
219758 |
|
18-Mar-2011 |
marcel |
o Move the IVT and supporting functions to the front of the text segment so that it's always mapped by the loader. o Change the alternate fault handlers to account for PBVM. Since currently the region is handled by the VHPT, no alternate faults will be generated for it.
|
#
209085 |
|
11-Jun-2010 |
marcel |
The ptc.g operation for the Mckinley and Madison processors has the side-effect of purging more than the requested translation. While this is not a problem in general, it invalidates the assumption made during constructing the trapframe on entry into the kernel in SMP configurations. The assumption is that only the first store to the stack will possibly cause a TLB miss. Since the ptc.g purges the translation caches of all CPUs in the coherency domain, a ptc.g executed on one CPU can cause a purge on another CPU that is currently running the critical code that saves the state to the trapframe. This can cause an unexpected TLB miss and with interrupt collection disabled this means an unexpected data nested TLB fault.
A data nested TLB fault will not save any context, nor provide a way for software to determine what caused the TLB miss nor where it occured. Careful construction of the kernel entry and exit code allows us to handle a TLB miss in precisely orchastrated points and thereby avoiding the need to wire the kernel stack, but the unexpected TLB miss caused by the ptc.g instructution resulted in an unrecoverable condition and resulting in machine checks.
The solution to this problem is to synchronize the kernel entry on all CPUs with the use of the ptc.g instruction on a single CPU by implementing a bare-bones readers-writer lock that allows N readers (= N CPUs entering the kernel) and 1 writer (= execution of the ptc.g instruction on some CPU). This solution wins over a rendez-vous approach by not interrupting CPUs with an IPI.
This problem has not been observed on the Montecito.
PR: ia64/147772 MFC after: 6 days
|
#
205433 |
|
22-Mar-2010 |
marcel |
Fix interrupt handling by extending the critical region so that preemption doesn't happen until after all pending interrupt have been services. While here again, simplify the EOI handling by doing it after we call the XIV-specific handlers, rather than in each of them. The original thought was that we may want to do an EOI first and the actual IPI handling next, but that's mostly a micro-optimization.
|
#
205234 |
|
16-Mar-2010 |
marcel |
Revamp the interrupt code based on the previous commit: o Introduce XIV, eXternal Interrupt Vector, to differentiate from the interrupts vectors that are offsets in the IVT (Interrupt Vector Table). There's a vector for external interrupts, which are based on the XIVs.
o Keep track of allocated and reserved XIVs so that we can assign XIVs without hardcoding anything. When XIVs are allocated, an interrupt handler and a class is specified for the XIV. Classes are: 1. architecture-defined: XIV 15 is returned when no external interrupt are pending, 2. platform-defined: SAL reports which XIV is used to wakeup an AP (typically 0xFF, but it's 0x12 for the Altix 350). 3. inter-processor interrupts: allocated for SMP support and non-redirectable. 4. device interrupts (i.e. IRQs): allocated when devices are discovered and are redirectable.
o Rewrite the central interrupt handler to call the per-XIV interrupt handler and rename it to ia64_handle_intr(). Move the per-XIV handler implementation to the file where we have the XIV allocation/reservation. Clock interrupt handling is moved to clock.c. IPI handling is moved to mp_machdep.c.
o Drop support for the Intel 8259A because it was broken. When XIV 0 is received, the CPU should initiate an INTA cycle to obtain the interrupt vector of the 8259-based interrupt. In these cases the interrupt controller we should be talking to WRT to masking on signalling EOI is the 8259 and not the I/O SAPIC. This requires adriver for the Intel 8259A which isn't available for ia64. Thus stop pretending to support ExtINTs and instead panic() so that if we come across hardware that has an Intel 8259A, so have something real to work with.
o With XIVs for IPIs dynamically allocatedi and also based on priority, define the IPI_* symbols as variables rather than constants. The variable holds the XIV allocated for the IPI.
o IPI_STOP_HARD delivers a NMI if possible. Otherwise the XIV assigned to IPI_STOP is delivered.
|
#
205014 |
|
11-Mar-2010 |
nwhitehorn |
Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms.
Reviewed by: kib, jhb
|
#
204184 |
|
21-Feb-2010 |
marcel |
Prefer I-units and M-units for nop instructions. This works around McKinley flaws. It also avoids using the F-unit in the kernel for no reason.
|
#
200240 |
|
07-Dec-2009 |
marcel |
In exception_save, write-back ar.rnat after switching the backing- store. Writing to ar.bspstore is defined to leave ar.rnat undefined.
PR: ia64/120315 MFC after: 3 days
|
#
199574 |
|
20-Nov-2009 |
marcel |
No need to include opt_kstack_pages.h, because KSTACK_PAGES is already defined through genassym.c
|
#
199566 |
|
20-Nov-2009 |
marcel |
Add a seatbelt to the Nested TLB Fault handler to give us a chance to panic when we have an unexpected TLB fault while interrupt collection is disabled. Use a token rather than the actual address of the restart point to avoid the need for the movl instruction. The token is arbitrary. For the drummers: it's based on a single paradiddle.
|
#
199502 |
|
18-Nov-2009 |
marcel |
opt_* headers are included using the quoted form.
|
#
172692 |
|
16-Oct-2007 |
marcel |
Set PTE_ACCESSED in the PTE and before inserting it in the VHPT. This avoids back-to-back faults for all TLB misses. This can be improved further in the future by also setting PTE_DIRTY for TLB misses for write accesses.
MFC after: 1 week
|
#
171739 |
|
06-Aug-2007 |
marcel |
Keep interrupts disabled while handling external interrupts. There's no advantage in allowing nested external interrupts. In fact, it leads to a potential stack overrun.
While here, put the interrupt vector in the trapframe, so as to compensate for the 36 cycle latency of reading cr.ivr.
Further simplify assembly code by dealing with ASTs from C.
Approved by: re (blanket)
|
#
171666 |
|
30-Jul-2007 |
marcel |
o Switch to physical addressing before dereferencing the VHPT bucket pointer. The virtual mapping may not be present in the translation cache. This will result in a nested TLB fault at a place we don't handle (and don't want to handle). o Make sure there's a stop after the rfi instruction, otherwise its behaviour is undefined. o Make sure we switch back to virtual addressing before doing a rfi. Behaviour is undefined otherwise.
Approved by: re (blanket)
|
#
171665 |
|
30-Jul-2007 |
marcel |
Add option EXCEPTION_TRACING, which enables KTR-like functionality for processor interruptions. This is especially useful to track unexpected nested TLB faults.
Approved by: re (blanket)
|
#
170026 |
|
27-May-2007 |
marcel |
Have the processor defer all faults and exceptions for control speculative loads. This at least makes control speculative loads work. In the future we should analyze which faults/exceptions we want to handle rather than defer to avoid having to call the recovery code when it's not strictly necessary.
|
#
169760 |
|
19-May-2007 |
marcel |
Add a level of indirection to the kernel PTE table. The old scheme allowed for 1024 PTE pages, each containing 256 PTEs. This yielded 2GB of KVA. This is not enough to boot a kernel on a 16GB box and in general too low for a 64-bit machine. By adding a level of indirection we now have 1024 2nd-level directory pages, each capable of supporting 2GB of KVA. This brings the grand total to 2TB of KVA.
|
#
148807 |
|
06-Aug-2005 |
marcel |
Improve SMP support: o Allocate a VHPT per CPU. The VHPT is a hash table that the CPU uses to look up translations it can't find in the TLB. As such, the VHPT serves as a level 1 cache (the TLB being a level 0 cache) and best results are obtained when it's not shared between CPUs. The collision chain (i.e. the hash bucket) is shared between CPUs, as all buckets together constitute our collection of PTEs. To achieve this, the collision chain does not point to the first PTE in the list anymore, but to a hash bucket head structure. The head structure contains the pointer to the first PTE in the list, as well as a mutex to lock the bucket. Thus, each bucket is locked independently of each other. With at least 1024 buckets in the VHPT, this provides for sufficiently finei-grained locking to make the ssolution scalable to large SMP machines. o Add synchronisation to the lazy FP context switching. We do this with a seperate per-thread lock. On SMP machines the lazy high FP context switching without synchronisation caused inconsistent state, which resulted in a panic. Since the use of the high FP registers is not common, it's possible that races exist. The ia64 package build has proven to be a good stress test, so this will get plenty of exercise in the near future. o Don't use the local ID of the processor we want to send the IPI to as the argument to ipi_send(). use the struct pcpu pointer instead. The reason for this is that IPI delivery is unreliable. It has been observed that sending an IPI to a CPU causes it to receive a stray external interrupt. As such, we need a way to make the delivery reliable. The intended solution is to queue requests in the target CPU's per-CPU structure and use a single IPI to inform the CPU that there's a new entry in the queue. If that IPI gets lost, the CPU can check it's queue at any convenient time (such as for each clock interrupt). This also allows us to send requests to a CPU without interrupting it, if such would be beneficial.
With these changes SMP is almost working. There are still some random process crashes and the machine can hang due to having the IPI lost that deals with the high FP context switch.
The overhead of introducing the hash bucket head structure results in a performance degradation of about 1% for UP (extra pointer indirection). This is surprisingly small and is offset by gaining reasonably/good scalable SMP support.
|
#
135783 |
|
25-Sep-2004 |
marcel |
Move the IA-32 trap handling from trap() to ia32_trap(). Move the ia32_syscall() function along with it to ia32_trap.c. When COMPAT_IA32 is not defined, we'll raise SIGEMT instead.
|
#
135590 |
|
22-Sep-2004 |
marcel |
Redefine a PTE as a 64-bit integral type instead of a struct of bit-fields. Unify the PTE defines accordingly and update all uses.
|
#
134502 |
|
29-Aug-2004 |
marcel |
s/ENTRY/ENTRY_NOPROFILE/g for particular functions that do not follow the C calling convention or are otherwise not regular functions. This allows us to boot a profiling kernel.
|
#
121635 |
|
28-Oct-2003 |
marcel |
When switching the RSE to use the kernel stack as backing store, keep the RNAT bit index constant. The net effect of this is that there's no discontinuity WRT NaT collections which greatly simplifies certain operations. The cost of this is that there can be up to 504 bytes of unused stack between the true base of the kernel stack and the start of the RSE backing store. The cost of adjusting the backing store pointer to keep the RNAT bit index constant, for each kernel entry, is negligible.
The primary reasons for this change are: 1. Asynchronuous contexts in KSE processes have the disadvantage of having to copy the dirty registers from the kernel stack onto the user stack. The implementation we had so far copied the registers one at a time without calculating NaT collection values. A process that used speculation would not work. Now that the RNAT bit index is constant, we can block-copy the registers from the kernel stack to the user stack without having to worry about NaT collections. They will be in the right place on the user stack. 2. The ndirty field in the trapframe is now also usable in userland. This was previously not the case because ndirty also includes the space occupied by NaT collections. The value could be off by 8, depending on the discontinuity. Now that the RNAT bit index is contants, we have exactly the same number of NaT collection points on the kernel stack as we would have had on the user stack if we didn't switch backing stores. 3. Debuggers and other applications that use ptrace(2) can now copy the dirty registers from the kernel stack (using ptrace(2)) and copy them whereever they want them (onto the user stack of the inferior as might be the case for gdb) without having to worry about NaT collections in the same way the kernel doesn't have to worry about them.
There's a second order effect caused by the randomization of the base of the backing store, for it depends on the number of dirty registers the processor happened to have at the time of entry into the kernel. The second order effect is that the RSE will have a better cache utilization as compared to having the backing store always aligned at page boundaries. This has not been measured and may be in practice only minimally beneficial, if at all measurable.
|
#
119787 |
|
05-Sep-2003 |
marcel |
Fix a place where I forgot to change the code that checks whether we return to kernel or userland. This triggered a panic in a KSE application when TDF_USTATCLOCK was set in the case userland was interrupted, but we never called ast() on our way out. As such, we called ast() at some other time. Unfortunately, TDF_USTATCLOCK handling assumes running in the interrupt thread. This was not the case anymore.
To avoid making the same mistake later, interrupt() now returns to its caller whether we interrupted userland or not. This avoids that we have to duplicate the check in assembly, where it's bound to fall off the scope. Now we simply check the return value and call ast() if appropriate.
Run into this: davidxu
|
#
118563 |
|
06-Aug-2003 |
marcel |
o In revision 1.45 of exception.S we changed exception_restore to unconditionally restore ar.k7 (kernel memory stack) and ar.k6 (kernel register stack). I don't know what I was smoking then, but if you unconditionally restore ar.k6, you also want to compute its value unconditionally. By having the computation predicated and dependent on whether we return to user mode, we would end up writing junk (= invalid value for ar.bspstore) if we would return to kernel mode. But the whole point of the unconditional restoration was that there is a grey area where we still need to have ar.k6 restored. If we restore with a junk value, we would end up wedging the machine on the next interrupt. So, unconditionally calculate the value we unconditionally write to ar.k6.
o The previous braino was found while making the following change: We used to clear the lower 9 bits of the value we write to ar.k6. The meaning being that we know that the kernel register stack is at least 512 byte aligned and simply clearing the lower 9 bits allows us to return to a context of which we don't have dirty registers on the kernel stack, even though the context that entered the kernel does have dirty registers on the kernel stack. By masking-off the lower bits, we correctly obtain the base of the register stack without having to worry that we didn't actually reached the base while unwinding it. The change is to mask off the lower 13 bits, knowing that the kernel register stack is always 8KB aligned. The advantage is that we don't have to worry anymore if there's more than 512 bytes of dirty registers on the kernel stack. A situation that frequently occurs. In exec_setregs() in machdep.c:1.147 or older, we had to deal with that situation by copying the active portion of the register stack down in multiples of 512 bytes. Now that we mask off the lower 13 bits we don't have to do that at all. Contemporary IPF processors have a register file that can hold up to 96 stacked registers (=784 bytes [incl. 2 NaT collections]). With no indication that register files grow beyond a couple of hundred registers, we should not have to worry about it anymore... and yes, 640KB is enough for everybody :-) This change helps setcontext(2) and cpu_set_upcall_kse() in that they can return to completely different contexts without having to mess with the kernel stack. Of course exec_setregs() doesn't need to do that anymore as well.
|
#
118450 |
|
04-Aug-2003 |
marcel |
Fix logic bug in the previous commit. Any region less than 5 is a user space region. Hence, we need to test if 5 is greater than the region; not greater equal. This bug caused us to call ast() while interrupting kernel mode.
|
#
118402 |
|
03-Aug-2003 |
marcel |
Fix handling of external interrupts: we weren't calling ast() when interrupting user mode. The net effect of this bug is that a clock interrupt does not cause rescheduling and processes are not preempted. It only takes a "while (1);" to render the machine useless.
This bug was introduced by the context changes and EPC syscall code. Handling of ASTs was moved to C for clarity and ease of maintenance, but was not added for the external interrupt case.
This needs to be revisited. We now have calls to do_ast() in trap(), break_syscall() and ivt_External_Interrupt(). A single call in exception_restore covers these 3 places without duplication. This is where we handled ASTs prior to the overhaul, except that the meat has been moved to do_ast(), a C function. This was the goal to begin with.
Pointy hat: marcel
|
#
117436 |
|
11-Jul-2003 |
marcel |
Remove a gratuitous align directive after the endp directive for IVT entries.
|
#
117161 |
|
02-Jul-2003 |
ru |
The .s files were repo-copied to .S files.
Approved by: marcel Repocopied by: joe
|
#
115344 |
|
27-May-2003 |
marcel |
A flushrs must be the first in an instruction group.
Approved by: re@ (blanket)
|
#
115291 |
|
24-May-2003 |
marcel |
Unconditionally restore ar.k7 (memory stack) and ar.k6 (register stack) when returning from an interrupt. Both registers are used on interrupt to switch to the right kernel stack, but other than that they are not used. This means we only have to make sure they contain proper values while in user mode. As such, we conditionally restored these registers based on whether we returned to userland or not. A nice property of conditionally restoring ar.k6 and ar.k7 is that it introduces two invariants: ar.k6 always points to the bottom of the kernel stack and ar.k7 always points to the top of the kernel stack (immediately below the PCB we have there).
However, the EPC syscall path introduces an irregularity: there's no "thin red line" between user and kernel. There's a grey area that's a couple of instructions wide. Any interruption in that grey area is bound to see an inconsistent state. One such state is that we're in kernel space for all practical purposes, but we still need to have ar.k6 and ar.k7 restored as if we're in userland.
Thus: restore ar.k6 and ar.k7 unconditionally at the cost of losing a valuable invariant. Both registers now hold the extend of the usable portion of the kernel stack at any interrupt nesting, which when in userland mean the bottom and the top of the kstack.
|
#
115274 |
|
23-May-2003 |
marcel |
Fix a (new) source of instability:
When interrupting a kernel context, we don't need to switch stacks (memory nor register). As such, we were also not restoring the register stack pointer (ar.bspstore). This, however, fails to be valid in 1 situation: when we interrupt a register stack switch as is being done in restorectx(). The problem is that restorectx() needs to have ar.bsp == ar.bspstore before it can assign the new value to ar.bspstore. This is achieved by doing a loadrs prior to assigning to ar.bspstore. If we take an interrupt in between the loadrs and the assignment and we don't make sure we restore the ar.bspstore prior to returning from the interrupt, we switch stacks with possibly non-zero dirty registers, which means that the new frame pointer (ar.bsp) will be invalid.
So, instead of jumping over the restoration of the register frame pointer and related registers, we conditionalize it based on whether we return to kernel context or user context. A future performance tweak is possible by only restoring ar.bspstore when returning to kernel mode *and* when the RSE is in enforced lazy mode. One cannot assume ar.bsp == ar.bspstore if the RSE is not in enforced lazy mode anyway.
While here (well, not quite) don't unconditionally assign to ar.bspstore in exception_save. Only do that when we actually switch stacks. It can only harm us to do it unconditionally.
Approved by: re@ (blanket)
|
#
115179 |
|
20-May-2003 |
marcel |
o Fix a definite bogon: the dirty bity fault, instruction access failt and data access fault install the PTE in question into the VHPT table. However, a post-increment was missing and we wrote the raw PTE data into the pagesize/access key field. This leaves a corrupt VHPT entry. o While here, remove the explicit cache purge. Insertion into the translation implicitly purges any overlapping entries. o Make sure there's a cycle break between the itc and the rfi. o Whitespace fixes.
|
#
115084 |
|
16-May-2003 |
marcel |
Revamp of the syscall path, exception and context handling. The prime objectives are: o Implement a syscall path based on the epc inststruction (see sys/ia64/ia64/syscall.s). o Revisit the places were we need to save and restore registers and define those contexts in terms of the register sets (see sys/ia64/include/_regset.h).
Secundairy objectives: o Remove the requirement to use contigmalloc for kernel stacks. o Better handling of the high FP registers for SMP systems. o Switch to the new cpu_switch() and cpu_throw() semantics. o Add a good unwinder to reconstruct contexts for the rare cases we need to (see sys/contrib/ia64/libuwx)
Many files are affected by this change. Functionally it boils down to: o The EPC syscall doesn't preserve registers it does not need to preserve and places the arguments differently on the stack. This affects libc and truss. o The address of the kernel page directory (kptdir) had to be unstaticized for use by the nested TLB fault handler. The name has been changed to ia64_kptdir to avoid conflicts. The renaming affects libkvm. o The trapframe only contains the special registers and the scratch registers. For syscalls using the EPC syscall path no scratch registers are saved. This affects all places where the trapframe is accessed. Most notably the unaligned access handler, the signal delivery code and the debugger. o Context switching only partly saves the special registers and the preserved registers. This affects cpu_switch() and triggered the move to the new semantics, which additionally affects cpu_throw(). o The high FP registers are either in the PCB or on some CPU. context switching for them is done lazily. This affects trap(). o The mcontext has room for all registers, but not all of them have to be defined in all cases. This mostly affects signal delivery code now. The *context syscalls are as of yet still unimplemented.
Many details went into the removal of the requirement to use contigmalloc for kernel stacks. The details are mostly CPU specific and limited to exception_save() and exception_restore(). The few places where we create, destroy or switch stacks were mostly simplified by not having to construct physical addresses and additionally saving the virtual addresses for later use.
Besides more efficient context saving and restoring, which of course yields a noticable speedup, this also fixes the dreaded SMP bootup problem as a side-effect. The details of which are still not fully understood.
This change includes all the necessary backward compatibility code to have it handle older userland binaries that use the break instruction for syscalls. Support for break-based syscalls has been pessimized in favor of a clean implementation. Due to the overall better performance of the kernel, this will still be notived as an improvement if it's noticed at all.
Approved by: re@ (jhb)
|
#
113181 |
|
06-Apr-2003 |
marcel |
Remove the 32KB VHPT section from the kernel image. We don't really use it because we allocate a VHPT based on the size of the physical memory and even if the allocated VHPT is 32KB, we don't use the in- image section for it. Since the VHPT must be naturally aligned, we save 48K on average (due to alignment). Consequently, we start off with the VHPT disabled (it is assumed the VHPT is disabled because the EFI loader runs without memory address translation and thus has no need to setup the VHPT). It's probably a good idea to explicitly disable the VHPT if we make the use of the VHPT optional.
|
#
113160 |
|
06-Apr-2003 |
marcel |
Also set the access bit in the PTE when we get a data dirty bit fault. This avoids an immediate access bit fault when we serviced the dirty bit fault in case the access bit is unset. This typically happens for newly allocated memory that's being zeroed and thus very common.
|
#
111036 |
|
17-Feb-2003 |
julian |
Fix missed patch in last commit
|
#
111032 |
|
17-Feb-2003 |
julian |
Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case.. I should have listenned to the other mind.
Submitted by: parts by davidxu@ Reviewed by: jeff@ mini@
|
#
106197 |
|
30-Oct-2002 |
marcel |
Don't pass the return address to exception_save in register b0. Use a true scratch register. This change and future re-allocations will eventually result in code that we can unwind to to get the preserved registers of the process. This of course means that we cannot trash them while saving the process context.
While re-allocating, remove the register aliases. Abstraction is in this case disadvanteous.
|
#
105499 |
|
20-Oct-2002 |
marcel |
Define IVT_ENTRY and IVT_END as special versions of ENTRY and END for defining vectors. As a result, each vector will be a global function with unwind directives to notify the unwinder that we're in an interrupt handler. In the debugger this will show up something like:
Debugger(0xe000000000a211d8, 0xe000000000748960) at Debugger+0x31 panic(0xe000000000a36858, 0xe0000000021d32d0, 0xe000000000ae42e8, ... trap(0x14, 0x100000, 0xe0000000021d32d0, 0x0, 0xa0000000002095f0, ... ivt_Data_TLB(0x14, 0x100000, 0xe0000000021d32d0) at ivt_Data_TLB+0x1f0
|
#
105010 |
|
12-Oct-2002 |
marcel |
Plug two holes where we returned to userland without restoring the predicate registers. Even though the ITLB and DTLB interrupts happen often enough, this bug didn't do much harm. The reason is that the interrupt handlers only modify p1 and since this is a preserved (callee-saved) register it is hardly used in code generated by the compiler. Compilers use scratch registers by default. Changing the interrupt handlers to use p6 (ie a scratch register) proved that the bug was in fact fatal.
|
#
95768 |
|
30-Apr-2002 |
marcel |
Add ar.lc and ar.ec to the trapframe. These are not saved for syscalls, only for exceptions.
While adding this to exception_save and exception_restore, it was hard to find a good place to put the instructions. The code sequence was sufficiently arbitrarily ordered that the density was low (roughly 67%). No explicit bundling was used. Thus, I rewrote the functions to optimize for density (close to 80% now), and added explicit bundles and nop instructions. The immediate operand on the nop instruction has been incremented with each instance, to make debugging a bit easier when looking at recurring patterns. Redundant stops have been removed as much as possible. Future optimizations can focus more on performance. A well-placed lfetch can make all the difference here!
Also, the FRAME_Fxx defines in frame.h were mostly bogus. FRAME_F10 to FRAME_F15 were copied from FRAME_F9 and still had the same index. We don't use them yet, so nothing was broken.
|
#
94365 |
|
10-Apr-2002 |
dfr |
Call ast() from the syscall exit path as well as for full exception restores.
|
#
93389 |
|
29-Mar-2002 |
jake |
Remove abuse of intr_disable/restore in MI code by moving the loop in ast() back into the calling MD code. The MD code must ensure no races between checking the astpening flag and returning to usermode.
Submitted by: peter (ia64 bits) Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64
|
#
92249 |
|
13-Mar-2002 |
dfr |
Don't restore r13 when returning to kernel mode. We may have migrated to a different cpu since the exception_save and r13 needs to point at the current cpu's pcpu structure.
|
#
88685 |
|
30-Dec-2001 |
marcel |
Add missing predicate in interruption_Data_TLB. Without this predicate we never used the VHPT entry we found.
While here, normalize the compares.
|
#
87702 |
|
11-Dec-2001 |
jhb |
Overhaul the per-CPU support a bit:
- The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list.
Tested on: alpha, i386 Reviewed by: peter, jake
|
#
86951 |
|
27-Nov-2001 |
dfr |
Minor tweaks to the TLB handling code - avoid movl instructions and add itc.x instructions to attempt to avoid the little flurry of TLB exceptions for handling access, dirty etc.
|
#
86290 |
|
12-Nov-2001 |
marcel |
Invoke trap() for the alt. ITLB and alt. DTLB interrrupts when the region is not 6 or 7. This changes the behaviour from inserting a bogus region 6 mapping to a kernel panic.
|
#
86239 |
|
10-Nov-2001 |
marcel |
Avoid using the .align directive to skip to the next vector offset. It doesn't help us catch overflowing vector entries at compile time. Instead use the .org directive. The last entry in the IVT doesn't strictly need to be limited to 256 bytes, but doing so allows the the VHPT to be placed immediately following the IVT without wasting any space due to alignment.
|
#
85866 |
|
02-Nov-2001 |
dfr |
Call ast() from exception_restore when we are restoring to user mode.
|
#
85785 |
|
31-Oct-2001 |
dfr |
Experiment with rewriting the syscall() wrapper using explicit bundling and trying to reduce stalls from reading certain high latency registers. This should be faster than the old syscall code. Its certainly a lot smaller.
|
#
85682 |
|
29-Oct-2001 |
dfr |
Various fixes to make stack traces using the unwind tables work properly.
|
#
85285 |
|
21-Oct-2001 |
dfr |
We need to save a bit more information in the partial syscall trapframe in case we need to take a signal.
|
#
84632 |
|
07-Oct-2001 |
dfr |
* Use srlz.i to serialise changes to psr.ic * Don't enable psr.i at the same time as psr.dt and psr.ic
These changes improve stability considerably.
|
#
84556 |
|
05-Oct-2001 |
dfr |
Fix some dependency violations (don't know why gas didn't catch this).
|
#
84115 |
|
29-Sep-2001 |
dfr |
Use PAGE_SHIFT instead of a hardcoded constant for log2(PAGE_SIZE).
|
#
83912 |
|
24-Sep-2001 |
dfr |
Make the Alternate {I,D} TLB vector code actually work for virtual addresses greater than 256M (the page size for region 6 and 7).
|
#
83906 |
|
24-Sep-2001 |
dfr |
Include <machine/pte.h> instead of <machine/pmap.h>
|
#
83833 |
|
22-Sep-2001 |
dfr |
Remove a redundant stop.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
75912 |
|
24-Apr-2001 |
dfr |
Don't trash the user's pr on syscalls.
|
#
75665 |
|
18-Apr-2001 |
dfr |
Record the right value for tf_ndirty for kernel interruptions so that we can examine the interrupted register stack frame in DDB.
|
#
70210 |
|
20-Dec-2000 |
marcel |
Resolve RAW dependency violation between tbit and adds.
|
#
67522 |
|
24-Oct-2000 |
dfr |
* Various fixes to breakage introduced by the atomic and mutex reorgs. * Fixes to the signal delivery code. Not quite right yet.
I would have preferred to wait until I have signal delivery actually working but the current kernel in CVS doesn't build.
|
#
67323 |
|
19-Oct-2000 |
dfr |
* Disable interrupts when restoring a trapframe. * Make sure we reset ar.k6 (used to hold the kernel stack pointer when we are returning to user mode after a syscall.
|
#
67212 |
|
16-Oct-2000 |
dfr |
Do a full exception_restore after an execve syscall to ensure that the new program gets the right values for its arguments etc.
|
#
67199 |
|
16-Oct-2000 |
dfr |
* Correct some of my misunderstandings about how best to switch to the kernel backing store. * Implement syscalls via break instructions. * Fix backing store copying in cpu_fork() so that the child gets the right register values.
This thing is actually starting to work now. This set of changes takes me up to the second execve (the one which runs the first shell). Next stop single-user mode :-).
|
#
67032 |
|
12-Oct-2000 |
dfr |
Implement a rudimentary interrupt handling system which should be good enough for clock interrupts in SKI.
|
#
67020 |
|
12-Oct-2000 |
dfr |
* Fix exception handling so that it actually works. We can now handle exceptions from both kernel and user mode. * Fix context switching so that we can switch back to a proc which we switched away from (we were saving the state in the wrong place). * Implement lazy switching of the high-fp state. This needs to be looked at again for SMP to cope with the case of a process migrating from one processor to another while it has the high-fp state. * Make setregs() work properly. I still think this should be called cpu_exec() or something. * Various other minor fixes.
With this lot, we can execve() /sbin/init and we get all the way up to its first syscall. At that point, we stop because syscall handling is not done yet.
|
#
66633 |
|
04-Oct-2000 |
dfr |
Next round of fixes to the ia64 code. This includes simulated clock and disk drivers along with a load of fixes to context switching, fork handling and a load of other stuff I can't remember now. This takes us as far as start_init() before it dies. I guess now I will have to finish off the VM system and syscall handling :-).
|
#
66486 |
|
30-Sep-2000 |
dfr |
Next round of ia64 work, including fixes to context switching, implementing cpu_fork(), copy*str(), bcopy(), copy{in,out}(). With these changes, my test kernel reaches the mountroot prompt.
|
#
66463 |
|
29-Sep-2000 |
dfr |
Implement dirty and access bit exceptions.
|
#
66460 |
|
29-Sep-2000 |
dfr |
Use write-back instead of write-combining for region 7.
|
#
66458 |
|
29-Sep-2000 |
dfr |
This is the first snapshot of the FreeBSD/ia64 kernel. This kernel will not work on any real hardware (or fully work on any simulator). Much more needs to happen before this is actually functional but its nice to see the FreeBSD copyright message appear in the ia64 simulator.
|