#
256281 |
|
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
255289 |
|
06-Sep-2013 |
glebius |
On those machines, where sf_bufs do not represent any real object, make sf_buf_alloc()/sf_buf_free() inlines, to save two calls to an absolutely empty functions.
Reviewed by: alc, kib, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
231177 |
|
08-Feb-2012 |
marcel |
Rev. 228360 moved the call to cpu_set_upcall() to happen before td_proc gets initialized in td (=newtd). Use td0 instead.
MFC after: 3 days
|
#
209026 |
|
11-Jun-2010 |
marcel |
Bump MAX_BPAGES from 256 to 1024. It seems that a few drivers, bge(4) in particular, do not handle deferred DMA map load operations at all. Any error, and especially EINPROGRESS, is treated as a hard error and typically abort the current operation. The fact that the busdma code queues the load operation for when resources (i.e. bounce buffers in this particular case) are available makes this especially problematic. Bounce buffering, unlike what the PR synopsis would suggest, works fine.
While on the subject, properly implement swi_vm().
PR: 147502 MFC after: 1 week
|
#
204905 |
|
09-Mar-2010 |
marcel |
Remove inclusion of <i386/include/psl.h> While here move inclusion of <sys/lock.h> in a better place.
|
#
199135 |
|
10-Nov-2009 |
kib |
Extract the code that records syscall results in the frame into MD function cpu_set_syscall_retval().
Suggested by: marcel Reviewed by: marcel, davidxu PowerPC, ARM, ia64 changes: marcel Sparc64 tested and reviewed by: marius, also sunv reviewed MIPS tested by: gonzo MFC after: 1 month
|
#
198733 |
|
31-Oct-2009 |
marcel |
Reimplement the lazy FP context switching: o Move all code into a single file for easier maintenance. o Use a single global lock to avoid having to handle either multiple locks or race conditions. o Make sure to disable the high FP registers after saving or dropping them. o use msleep() to wait for the other CPU to save the high FP registers.
This change fixes the high FP inconsistency panics.
A single global lock typically serializes too much, which may be noticable when a lot of threads use the high FP registers, but in that case it's probably better to switch the high FP context synchronuously. Put differently: cpu_switch() should switch the high FP registers if the incoming and outgoing threads both use the high FP registers.
|
#
194524 |
|
20-Jun-2009 |
marcel |
Drop the high FP state of an exiting thread in cpu_thread_exit() and not in cpu_exit(). The latter is called after td_md.md_highfp_mtx has been destroyed, which results in a race condition when another thread wants to use the high FP registers on the CPU that still has the high FP registers in question.
|
#
173615 |
|
14-Nov-2007 |
marcel |
o Rename cpu_thread_setup() to cpu_thread_alloc() to better communicate that it relates to (is called by) thread_alloc() o Add cpu_thread_free() which is called from thread_free() to counter-act cpu_thread_alloc().
i386: Have cpu_thread_free() call cpu_thread_clean() to preserve behaviour. ia64: Have cpu_thread_free() call mtx_destroy() for the mutex initialized in cpu_thread_alloc().
PR: ia64/118024
|
#
170305 |
|
04-Jun-2007 |
jeff |
- Change comments and asserts to reflect the removal of the global scheduler lock.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
#
158651 |
|
16-May-2006 |
phk |
Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.
|
#
149915 |
|
09-Sep-2005 |
marcel |
Change the High FP lock from a sleep lock to a spin lock. We can take the lock from interrupt context, which causes an implicit lock order reversal. We've been using the lock carefully enough that making it a spin lock should not be harmful.
|
#
148807 |
|
06-Aug-2005 |
marcel |
Improve SMP support: o Allocate a VHPT per CPU. The VHPT is a hash table that the CPU uses to look up translations it can't find in the TLB. As such, the VHPT serves as a level 1 cache (the TLB being a level 0 cache) and best results are obtained when it's not shared between CPUs. The collision chain (i.e. the hash bucket) is shared between CPUs, as all buckets together constitute our collection of PTEs. To achieve this, the collision chain does not point to the first PTE in the list anymore, but to a hash bucket head structure. The head structure contains the pointer to the first PTE in the list, as well as a mutex to lock the bucket. Thus, each bucket is locked independently of each other. With at least 1024 buckets in the VHPT, this provides for sufficiently finei-grained locking to make the ssolution scalable to large SMP machines. o Add synchronisation to the lazy FP context switching. We do this with a seperate per-thread lock. On SMP machines the lazy high FP context switching without synchronisation caused inconsistent state, which resulted in a panic. Since the use of the high FP registers is not common, it's possible that races exist. The ia64 package build has proven to be a good stress test, so this will get plenty of exercise in the near future. o Don't use the local ID of the processor we want to send the IPI to as the argument to ipi_send(). use the struct pcpu pointer instead. The reason for this is that IPI delivery is unreliable. It has been observed that sending an IPI to a CPU causes it to receive a stray external interrupt. As such, we need a way to make the delivery reliable. The intended solution is to queue requests in the target CPU's per-CPU structure and use a single IPI to inform the CPU that there's a new entry in the queue. If that IPI gets lost, the CPU can check it's queue at any convenient time (such as for each clock interrupt). This also allows us to send requests to a CPU without interrupting it, if such would be beneficial.
With these changes SMP is almost working. There are still some random process crashes and the machine can hang due to having the IPI lost that deals with the high FP context switch.
The overhead of introducing the hash bucket head structure results in a performance degradation of about 1% for UP (extra pointer indirection). This is surprisingly small and is offset by gaining reasonably/good scalable SMP support.
|
#
147889 |
|
10-Jul-2005 |
davidxu |
Validate if the value written into {FS,GS}.base is a canonical address, writting non-canonical address can cause kernel a panic, by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring only canonical values get written to the registers.
Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com > Approved by: re (scottl)
|
#
145433 |
|
23-Apr-2005 |
davidxu |
Change cpu_set_kse_upcall to more generic style, so we can reuse it in other codes. Add cpu_set_user_tls, use it to tweak user register and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall, but since cpu_set_kse_upcall is also used by M:N threads which may not need this feature, so I wrote a separated cpu_set_user_tls.
|
#
144637 |
|
04-Apr-2005 |
jhb |
Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any affect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case.
Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch.
This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example).
Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more
|
#
140256 |
|
14-Jan-2005 |
jhb |
- Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer the last action of kern_exit(). Instead, it is a MD callout to cleanup per-process state during exit. - Add notes of concern to Alpha and ia64 about the possible need to drop fp state in cpu_thread_exit() rather than in cpu_exit() since it is per-thread state rather than per-process.
|
#
139790 |
|
06-Jan-2005 |
imp |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
#
138129 |
|
27-Nov-2004 |
das |
Don't include sys/user.h merely for its side-effect of recursively including other headers.
|
#
129750 |
|
26-May-2004 |
tmm |
Retire cpu_sched_exit(); it is not used any more.
|
#
128393 |
|
18-Apr-2004 |
alc |
MFamd64 Simplify the sf_buf implementation. In short, make it a veneer over the direct virtual-to-physical mapping.
|
#
127788 |
|
03-Apr-2004 |
alc |
In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it should not. Add a new parameter so that the caller can specify which is the case.
Reported by: dillon
|
#
127494 |
|
27-Mar-2004 |
marcel |
MFi386: correctly calculate the top-of-stack when a kthread is created with a larger kernel stack. Remove inclusion of opt_kstack_pages.h now that it's unused.
|
#
127086 |
|
16-Mar-2004 |
alc |
Refactor the existing machine-dependent sf_buf_free() into a machine- dependent function by the same name and a machine-independent function, sf_buf_mext(). Aside from the virtue of making more of the code machine- independent, this change also makes the interface more logical. Before, sf_buf_free() did more than simply undo an sf_buf_alloc(); it also unwired and if necessary freed the page. That is now the purpose of sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used as a general-purpose emphemeral map cache.
|
#
123929 |
|
28-Dec-2003 |
silby |
Track three new sendfile-related statistics: - The number of times sendfile had to do disk I/O - The number of times sfbuf allocation failed - The number of times sfbuf allocation had to wait
|
#
123920 |
|
27-Dec-2003 |
silby |
Move the declaration of sfbufspeak and sfbufsused to mbuf.h, and use imax instead of max, as sfbufspeak and sfbufsused are signed.
Submitted by: bde
|
#
123884 |
|
27-Dec-2003 |
silby |
Track current and peak sfbuf usage, export the values via sysctl.
|
#
122821 |
|
16-Nov-2003 |
alc |
- Remove unnecessary synchronization from sf_buf_init(). (There is only one active CPU when sf_buf_init() is performed.)
|
#
122780 |
|
16-Nov-2003 |
alc |
- Modify alpha's sf_buf implementation to use the direct virtual-to- physical mapping. - Move the sf_buf API to its own header file; make struct sf_buf's definition machine dependent. In this commit, we remove an unnecessary field from struct sf_buf on the alpha, amd64, and ia64. Ultimately, we may eliminate struct sf_buf on those architecures except as an opaque pointer that references a vm page.
|
#
122373 |
|
09-Nov-2003 |
marcel |
When a thread is being swapped-out, save the high FP registers. We have a pointer in the PCPU to the PCB of the thread that currently has its high FP registers loaded.
|
#
121635 |
|
28-Oct-2003 |
marcel |
When switching the RSE to use the kernel stack as backing store, keep the RNAT bit index constant. The net effect of this is that there's no discontinuity WRT NaT collections which greatly simplifies certain operations. The cost of this is that there can be up to 504 bytes of unused stack between the true base of the kernel stack and the start of the RSE backing store. The cost of adjusting the backing store pointer to keep the RNAT bit index constant, for each kernel entry, is negligible.
The primary reasons for this change are: 1. Asynchronuous contexts in KSE processes have the disadvantage of having to copy the dirty registers from the kernel stack onto the user stack. The implementation we had so far copied the registers one at a time without calculating NaT collection values. A process that used speculation would not work. Now that the RNAT bit index is constant, we can block-copy the registers from the kernel stack to the user stack without having to worry about NaT collections. They will be in the right place on the user stack. 2. The ndirty field in the trapframe is now also usable in userland. This was previously not the case because ndirty also includes the space occupied by NaT collections. The value could be off by 8, depending on the discontinuity. Now that the RNAT bit index is contants, we have exactly the same number of NaT collection points on the kernel stack as we would have had on the user stack if we didn't switch backing stores. 3. Debuggers and other applications that use ptrace(2) can now copy the dirty registers from the kernel stack (using ptrace(2)) and copy them whereever they want them (onto the user stack of the inferior as might be the case for gdb) without having to worry about NaT collections in the same way the kernel doesn't have to worry about them.
There's a second order effect caused by the randomization of the base of the backing store, for it depends on the number of dirty registers the processor happened to have at the time of entry into the kernel. The second order effect is that the RSE will have a better cache utilization as compared to having the backing store always aligned at page boundaries. This has not been measured and may be in practice only minimally beneficial, if at all measurable.
|
#
120683 |
|
03-Oct-2003 |
marcel |
Swap the syscall caller frame info (i.e. the return pointer and frame marker) and the syscall stub frame info in the trap frame. Previously we stored the stub frame info in (rp,pfs) and the caller frame info in (iip,cfm). This ends up being suboptimal for the following reasons: 1. When we create a new context, such as for an execve(2), we had to set the (rp,pfs) pair for the entry point when using the syscall path out of the kernel but we need to set the (iip,cfm) pair when we take the interrupt way out. This is mostly just an inconsistency from the kernel's point of view, but an ugly irregularity from gdb(1)'s point of view. 2. The getcontext(2) and setcontext(2) syscalls had to swap the (rp,pfs) and (iip,cfm) pairs to make the context compatible with one created purely in userland.
Swapping the (rp,pfs) and (iip,cfm) pairs is visible to signal handlers that actually peek at the mcontext_t and to gdb(1). Since this change is made for gdb(1) and we don't care about signal handlers that peek at the mcontext_t because we're still a tier 2 platform, this ABI breakage is academic at this moment in time.
Note that there was no real reason to save the caller frame info in (iip,cfm) and the stub frame info in (rp,pfs).
|
#
119624 |
|
31-Aug-2003 |
marcel |
Use direct mapped KVA for the sf_buf allocator, as made possible by the previous commit. While here, fix a typo, reformat comments and fix a long line.
Tested with: ftpd
|
#
119563 |
|
29-Aug-2003 |
alc |
Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy sockets into machine-dependent files. The rationale for this migration is illustrated by the modified amd64 allocator. It uses the amd64's direct map to avoid emphemeral mappings in the kernel's address space. On an SMP, the emphemeral mappings result in an IPI for TLB shootdown for each transmitted page. Yuck.
Maintainers of other 64-bit platforms with direct maps should be able to use the amd64 allocator as a reference implementation.
|
#
119004 |
|
16-Aug-2003 |
marcel |
In vm_thread_swap{in|out}(), remove the alpha specific conditional compilation and replace it with a call to cpu_thread_swap{in|out}(). This allows us to add similar code on ia64 without cluttering the code even more.
|
#
118990 |
|
16-Aug-2003 |
marcel |
Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to cpu.h. This affects db_command.c and kern_shutdown.c.
ia64: move all MD prototypes from cpu.h to md_var.h. This affects madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory(). It's not used (vm_machdep.c).
alpha: the MD prototypes have been left in cpu.h with a comment that they should be there. Moving them is left for later. It was expected that the impact would be significant enough to be done in a seperate commit.
powerpc: MD prototypes left in cpu.h. Comment added.
Suggested by: bde Tested with: make universe (pc98 incomplete)
|
#
118739 |
|
10-Aug-2003 |
marcel |
o move cpu_reset() from vm_machdep.c to machdep.c. o reorder cpu_boot(), cpu_halt() and identifycpu().
No functional change.
|
#
118588 |
|
07-Aug-2003 |
marcel |
o Fix cut-n-paste whitespace corruption in previous commit o For trap-based upcalls the argument (the kse_mailbox) to the UTS must be written onto the kernel stack, not the user stack. While here, deal with the fact that we may be at a NaT collection point.
|
#
118566 |
|
06-Aug-2003 |
marcel |
In cpu_set_upcall_kse(), create the upcall according to the entry path into the kernel. Normally it's due to a syscall, but one can also be created as the result of a clock interrupt (for example). This now even more looks like exec_setregs().
While here, add an assert that we don't expect more than 8KB of dirty registers on the kernel stack.
|
#
118239 |
|
30-Jul-2003 |
peter |
Deal with 'options KSTACK_PAGES' being a global option.
|
#
116971 |
|
28-Jun-2003 |
marcel |
Implement cpu_set_upcall_kse(). Elementary testing shows that this function behaves correctly in principle, but is not expected to be 100% complete. In any case, with this commit we have KSE ported enough to start runtime testing with threaded applications and fix whatever bugs or omissions we encounter. Yay!
|
#
116188 |
|
11-Jun-2003 |
peter |
GC unused cpu_wait() function
|
#
115858 |
|
04-Jun-2003 |
marcel |
Change the second (and last) argument of cpu_set_upcall(). Previously we were passing in a void* representing the PCB of the parent thread. Now we pass a pointer to the parent thread itself. The prime reason for this change is to allow cpu_set_upcall() to copy (parts of) the trapframe instead of having it done in MI code in each caller of cpu_set_upcall(). Copying the trapframe cannot always be done with a simply bcopy() or may not always be optimal that way. On ia64 specifically the trapframe contains information that is specific to an entry into the kernel and can only be used by the corresponding exit from the kernel. A trapframe copied verbatim from another frame is in most cases useless without some additional normalization.
Note that this change removes the assignment to td->td_frame in some implementations of cpu_set_upcall(). The assignment is redundant. A previous call to cpu_thread_setup() already did the exact same assignment. An added benefit of removing the redundant assignment is that we can now change td_pcb without nasty side-effects.
This change officially marks the ability on ia64 for 1:1 threading.
Not tested on: amd64, powerpc Compile & boot tested on: alpha, sparc64 Functionally tested on: i386, ia64
|
#
115651 |
|
01-Jun-2003 |
marcel |
Improve on cpu_set_upcall: o Use pcb and tf for the new pcb and the new trapframe and use pcb0 for the old (current) pcb. The mix of pcb, pcb2 and tf was slightly confusing. o Don't define td->td_frame here. It has already been set previously by cpu_thread_setup. Add a KASSERT to make sure pcb and tf are both non-NULL. o Make sure the number of dirty registers is 0 for the new thread. There are no user registers on the backing store because we heven't enter userland yet.
|
#
115605 |
|
01-Jun-2003 |
marcel |
Implement cpu_thread_setup(). This is mostly the same as on i386, except for the fact that trapframes have a size recorded in it that we set here too. We need this for proper thread setup.
Pointed out by: mtm
|
#
115570 |
|
31-May-2003 |
marcel |
Implement cpu_set_upcall(). Required by libthr and used by thr_create(2). This implementation is so far only compile tested. But since this is also the last of the functions required to support libthr, we're now functionally complete (for some weird definition of functionally; and complete). Runtime testing can commence.
|
#
115084 |
|
16-May-2003 |
marcel |
Revamp of the syscall path, exception and context handling. The prime objectives are: o Implement a syscall path based on the epc inststruction (see sys/ia64/ia64/syscall.s). o Revisit the places were we need to save and restore registers and define those contexts in terms of the register sets (see sys/ia64/include/_regset.h).
Secundairy objectives: o Remove the requirement to use contigmalloc for kernel stacks. o Better handling of the high FP registers for SMP systems. o Switch to the new cpu_switch() and cpu_throw() semantics. o Add a good unwinder to reconstruct contexts for the rare cases we need to (see sys/contrib/ia64/libuwx)
Many files are affected by this change. Functionally it boils down to: o The EPC syscall doesn't preserve registers it does not need to preserve and places the arguments differently on the stack. This affects libc and truss. o The address of the kernel page directory (kptdir) had to be unstaticized for use by the nested TLB fault handler. The name has been changed to ia64_kptdir to avoid conflicts. The renaming affects libkvm. o The trapframe only contains the special registers and the scratch registers. For syscalls using the EPC syscall path no scratch registers are saved. This affects all places where the trapframe is accessed. Most notably the unaligned access handler, the signal delivery code and the debugger. o Context switching only partly saves the special registers and the preserved registers. This affects cpu_switch() and triggered the move to the new semantics, which additionally affects cpu_throw(). o The high FP registers are either in the PCB or on some CPU. context switching for them is done lazily. This affects trap(). o The mcontext has room for all registers, but not all of them have to be defined in all cases. This mostly affects signal delivery code now. The *context syscalls are as of yet still unimplemented.
Many details went into the removal of the requirement to use contigmalloc for kernel stacks. The details are mostly CPU specific and limited to exception_save() and exception_restore(). The few places where we create, destroy or switch stacks were mostly simplified by not having to construct physical addresses and additionally saving the virtual addresses for later use.
Besides more efficient context saving and restoring, which of course yields a noticable speedup, this also fixes the dreaded SMP bootup problem as a side-effect. The details of which are still not fully understood.
This change includes all the necessary backward compatibility code to have it handle older userland binaries that use the break instruction for syscalls. Support for break-based syscalls has been pessimized in favor of a clean implementation. Due to the overall better performance of the kernel, this will still be notived as an improvement if it's noticed at all.
Approved by: re@ (jhb)
|
#
111028 |
|
17-Feb-2003 |
jeff |
- Split the struct kse into struct upcall and struct kse. struct kse will soon be visible only to schedulers. This greatly simplifies much the KSE code.
Submitted by: davidxu
|
#
110190 |
|
01-Feb-2003 |
julian |
Reversion of commit by Davidxu plus fixes since applied.
I'm not convinced there is anything major wrong with the patch but them's the rules..
I am using my "David's mentor" hat to revert this as he's offline for a while.
|
#
109877 |
|
26-Jan-2003 |
davidxu |
Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone.
A thread owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and takes those contexts back to userland. Any thread without upcall structure has to export their contexts and exit at user boundary.
Any thread running in user mode owns an upcall structure, when it enters kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in kernel, a new UPCALL thread is created and the upcall structure is transfered to the new UPCALL thread. if the kse mailbox's current thread pointer is NULL, then when a thread is blocked in kernel, no UPCALL thread will be created.
Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit, when all upcalls in ksegrp are removed, the group is atomatically shutdown. An upcall owner thread also exits when process is in exiting state. when an owner thread exits, the upcall it owns is also removed.
KSE is a pure scheduler entity. it represents a virtual cpu. when a thread is running, it always has a KSE associated with it. scheduler is free to assign a KSE to thread according thread priority, if thread priority is changed, KSE can be moved from one thread to another.
When a ksegrp is created, there is always N KSEs created in the group. the N is the number of physical cpu in the current system. This makes it is possible that even an userland UTS is single CPU safe, threads in kernel still can execute on different cpu in parallel. Userland calls kse_create to add more upcall structures into ksegrp to increase concurrent in userland itself, kernel is not restricted by number of upcalls userland provides.
The code hasn't been tested under SMP by author due to lack of hardware.
Reviewed by: julian
|
#
109342 |
|
15-Jan-2003 |
dillon |
Merge all the various copies of vm_fault_quick() into a single portable copy.
|
#
109340 |
|
15-Jan-2003 |
dillon |
Merge all the various copies of vmapbuf() and vunmapbuf() into a single portable copy. Note that pmap_extract() must be used instead of pmap_kextract().
This is precursor work to a reorganization of vmapbuf() to close remaining user/kernel races (which can lead to a panic).
|
#
107719 |
|
10-Dec-2002 |
julian |
Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage during the context switch. Rearrange thread cleanups to avoid problems with Giant. Clean threads when freed or when recycled.
Approved by: re (jhb)
|
#
107481 |
|
01-Dec-2002 |
alc |
MFi386 Hold the page queues lock around vm_page_unhold() in vunmapbuf().
Approved by: re (blanket)
|
#
107180 |
|
22-Nov-2002 |
mux |
Under certain circumstances, we were calling kmem_free() from i386 cpu_thread_exit(). This resulted in a panic with WITNESS since we need to hold Giant to call kmem_free(), and we weren't helding it anymore in cpu_thread_exit(). We now do this from a new MD function, cpu_thread_dtor(), called by thread_dtor().
Approved by: re@ Suggested by: jhb
|
#
106189 |
|
30-Oct-2002 |
marcel |
Rewrite cpu_switch(). The most notable change is the fact that we now have f16-f31 as part of the context. The PCB has been reorganized to better match how we save and restore the (preserved) registers. This commit also moves the context restoriation to its own function (named pcb_restore), as we did with pcb_save.
Only minimal effort has been put in writing optimal assembly. The expectation is that there will be more rounds of changes.
|
#
104426 |
|
03-Oct-2002 |
peter |
Update stubs for post-kseIII.
|
#
103049 |
|
06-Sep-2002 |
peter |
Zap the implementations of the i386-aout specific cpu_coredump function. Most of the non-i386 platforms had rather broken implementations anyway.
|
#
102161 |
|
20-Aug-2002 |
rwatson |
Correct one more errant whitespace nit that crept in during changes in the arguments to vn_rdwr(). Hopefully the last.
|
#
101945 |
|
15-Aug-2002 |
rwatson |
Correct a minor whitespace nit that sneaked in with my previous commit.
|
#
101941 |
|
15-Aug-2002 |
rwatson |
In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to cliarfy which credential is used for what:
- Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c.
For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics:
- badfo_readwrite() unchanged - kqueue_read/write() unchanged pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred
Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics.
Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED.
These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations.
Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
99081 |
|
29-Jun-2002 |
julian |
Add KSE stubs to MD parts of ia64 code. Dfr will fill these out when we decide to enable KSEs on ia64 (probably not immediatly)
|
#
98765 |
|
24-Jun-2002 |
jake |
Add an MD callout like cpu_exit, but which is called after sched_lock is obtained, when all other scheduling activity is suspended. This is needed on sparc64 to deactivate the vmspace of the exiting process on all cpus. Otherwise if another unrelated process gets the exact same vmspace structure allocated to it (same address), its address space will not be activated properly. This seems to fix some spontaneous signal 11 problems with smp on sparc64.
|
#
96059 |
|
05-May-2002 |
marcel |
Remove definition of struct ia64_fdesc. It's been moved to md_var.h
|
#
95202 |
|
21-Apr-2002 |
dfr |
Setup the child's return values correctly when forking an IA-32 process.
|
#
93761 |
|
04-Apr-2002 |
alc |
o Kill the MD grow_stack(). Call the MI vm_map_growstack() in its place. o Eliminate the use of useracc() and grow_stack() from sendsig().
Reviewed by: peter
|
#
92843 |
|
20-Mar-2002 |
alfred |
Remove __P.
Reviewd by: peter
|
#
92286 |
|
14-Mar-2002 |
dfr |
* Initialise pcb_pmap for new threads. * Add support for forking new threads from &thread0 as well as curthread.
|
#
92020 |
|
10-Mar-2002 |
dfr |
Add an implementation of cpu_throw() and make restorectx() simply branch to the tail of cpu_switch.
|
#
90361 |
|
07-Feb-2002 |
julian |
Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embeded there but remove all teh code that assumes that in preparation for the next commit which will actually move it out.
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
|
#
88727 |
|
30-Dec-2001 |
marcel |
Revert previous definition of cpu_throw(). Non-MP configurations were broken as well.
|
#
88689 |
|
30-Dec-2001 |
marcel |
o Remove temporary implementation of cpu_throw in vm_machdep.c and instead make it an alternate entry-point of cpu_switch() in swtch.s o Add SMP support to cpu_switch().
|
#
84732 |
|
09-Oct-2001 |
dfr |
Clarify a comment.
Requested by: jhb
|
#
84381 |
|
02-Oct-2001 |
mjacob |
Fix problem where a user buffer outside of the area being tested will be corrupted.
PR: 29194 Obtained from: Tor.Egge@fast.no MFC after: 2 weeks
|
#
84117 |
|
29-Sep-2001 |
dfr |
Call cpu_boot from cpu_reset.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
83276 |
|
10-Sep-2001 |
peter |
Rip some well duplicated code out of cpu_wait() and cpu_exit() and move it to the MI area. KSE touched cpu_wait() which had the same change replicated five ways for each platform. Now it can just do it once. The only MD parts seemed to be dealing with fpu state cleanup and things like vm86 cleanup on x86. The rest was identical.
XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional stub in place.
Reviewed by: jake, tmm, dillon
|
#
83223 |
|
08-Sep-2001 |
peter |
Missing part of dillon's coredump commit. cpu_coredump() was still passing IO_NODELOCKED to vn_rdwr(), this would cause operations on the unlocked core vnode and softupdates nastiness if an a.out binary cored.
|
#
82938 |
|
04-Sep-2001 |
peter |
Nuke #if 0'ed "setredzone()" stub. We never used it, and probably never will. I've implemented an optional redzone as part of the KSE upage breakup.
|
#
82788 |
|
02-Sep-2001 |
peter |
Sync with i386 / alpha. Whitespace unindent / style prep for kse.
|
#
79265 |
|
04-Jul-2001 |
dillon |
Move vm_page_zero_idle() from machine-dependant sections to a machine-independant source file, vm/vm_zeroidle.c. It was exactly the same for all platforms and updating them all was getting annoying.
|
#
79263 |
|
04-Jul-2001 |
dillon |
Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc). Also removed some spl's and added some VM mutexes, but they are not actually used yet, so this commit does not really make any operational changes to the system.
vm_page.c relates to vm_page_t manipulation, including high level deactivation, activation, etc... vm_pageq.c relates to finding free pages and aquiring exclusive access to a page queue (exclusivity part not yet implemented). And the world still builds... :-)
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
79123 |
|
03-Jul-2001 |
jhb |
Allow Giant to be recursed when a process terminates.
|
#
78962 |
|
29-Jun-2001 |
jhb |
Add a new MI pointer to the process' trapframe p_frame instead of using various differently named pointers buried under p_md.
Reviewed by: jake (in principle)
|
#
77448 |
|
29-May-2001 |
jhb |
- Catch up to the VM mutex changes. - Sort includes in a few places.
|
#
75701 |
|
19-Apr-2001 |
dfr |
Don't unwrap the function descriptor used as the callout argument to fork_exit(). The MI version of fork_exit() needs a real function descriptor, not a simple function pointer.
|
#
73922 |
|
07-Mar-2001 |
jhb |
Use the proc lock to protect p_pptr when waking up our parent in cpu_exit() and remove the mpfixme() message that is now fixed.
|
#
72906 |
|
22-Feb-2001 |
jhb |
Rename switch_trampoline() to fork_trampoline() on the alpha and ia64.
Suggested by: dfr
|
#
72905 |
|
22-Feb-2001 |
jhb |
Don't set the sched_lock lesting level for new processes as it is no longer used.
|
#
72904 |
|
22-Feb-2001 |
jhb |
Catch comments up to child_return() -> fork_return() as well.
|
#
72899 |
|
22-Feb-2001 |
jhb |
Use the MI fork_return() fork trampoline callout function for child processes instead of the MD child_return().
|
#
72200 |
|
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
#
70317 |
|
23-Dec-2000 |
jake |
Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock.
linprocfs not locked pending response from informal maintainer.
Reviewed by: jhb, -smp@
|
#
68762 |
|
15-Nov-2000 |
jhb |
Don't perform an mi_switch() when we release Giant during cpu_exit(). We are about to call cpu_switch() anyways.
Found by: witness
|
#
67551 |
|
25-Oct-2000 |
jhb |
- Overhaul the software interrupt code to use interrupt threads for each type of software interrupt. Roughly, what used to be a bit in spending now maps to a swi thread. Each thread can have multiple handlers, just like a hardware interrupt thread. - Instead of using a bitmask of pending interrupts, we schedule the specific software interrupt thread to run, so spending, NSWI, and the shandlers array are no longer needed. We can now have an arbitrary number of software interrupt threads. When you register a software interrupt thread via sinthand_add(), you get back a struct intrhand that you pass to sched_swi() when you wish to schedule your swi thread to run. - Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit more intuitive. Also, prefix all the members of struct intrhand with 'ih_'. - Make swi_net() a MI function since there is now no point in it being MD.
Submitted by: cp
|
#
67210 |
|
16-Oct-2000 |
dfr |
Clear ar.pfs for the child process in cpu_fork - switch_trampoline doesn't want a stack frame.
|
#
67199 |
|
16-Oct-2000 |
dfr |
* Correct some of my misunderstandings about how best to switch to the kernel backing store. * Implement syscalls via break instructions. * Fix backing store copying in cpu_fork() so that the child gets the right register values.
This thing is actually starting to work now. This set of changes takes me up to the second execve (the one which runs the first shell). Next stop single-user mode :-).
|
#
67020 |
|
12-Oct-2000 |
dfr |
* Fix exception handling so that it actually works. We can now handle exceptions from both kernel and user mode. * Fix context switching so that we can switch back to a proc which we switched away from (we were saving the state in the wrong place). * Implement lazy switching of the high-fp state. This needs to be looked at again for SMP to cope with the case of a process migrating from one processor to another while it has the high-fp state. * Make setregs() work properly. I still think this should be called cpu_exec() or something. * Various other minor fixes.
With this lot, we can execve() /sbin/init and we get all the way up to its first syscall. At that point, we stop because syscall handling is not done yet.
|
#
66937 |
|
10-Oct-2000 |
dfr |
* Add rudimentary DDB support (no kgdb, no backtrace, no single step). * Track recent changes to SWI code. * Allocate RIDs for pmaps (untested). * Implement assembler version of cpu_switch - its cleaner that way.
|
#
66633 |
|
04-Oct-2000 |
dfr |
Next round of fixes to the ia64 code. This includes simulated clock and disk drivers along with a load of fixes to context switching, fork handling and a load of other stuff I can't remember now. This takes us as far as start_init() before it dies. I guess now I will have to finish off the VM system and syscall handling :-).
|
#
66486 |
|
30-Sep-2000 |
dfr |
Next round of ia64 work, including fixes to context switching, implementing cpu_fork(), copy*str(), bcopy(), copy{in,out}(). With these changes, my test kernel reaches the mountroot prompt.
|
#
66458 |
|
29-Sep-2000 |
dfr |
This is the first snapshot of the FreeBSD/ia64 kernel. This kernel will not work on any real hardware (or fully work on any simulator). Much more needs to happen before this is actually functional but its nice to see the FreeBSD copyright message appear in the ia64 simulator.
|