#
272461 |
|
02-Oct-2014 |
gjb |
Copy stable/10@r272459 to releng/10.1 as part of the 10.1-RELEASE process.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
271999 |
|
22-Sep-2014 |
jhb |
MFC 270850,271053,271192,271717: Save and restore FPU state across suspend and resume on i386. - Create a separate structure for per-CPU state saved across suspend and resume that is a superset of a pcb. - Store the FPU state for suspend and resume in the new structure (for amd64, this moves it out of the PCB) - On both i386 and amd64, all of the FPU suspend/resume handling is now done in C.
Approved by: re (hrs)
|
#
256281 |
|
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
#
255217 |
|
04-Sep-2013 |
kib |
Tidy up some loose ends in the PCID code:
- Restore the pre-PCID TLB shootdown handlers for whole address space and single page invalidation asm code, and assign the IPI handler to them when PCID is not supported or disabled. Old handlers have linear control flow. But, still use the common return sequence.
- Stop using pcpu for INVPCID descriptors in the invlrg handler. It is enough to allocate descriptors on the stack. As result, two SWAPGS instructions are shaved off from the code for Haswell+.
- Fix the reverted condition in invlrng for checking of the PCID support [1], also in invlrng check that pmap is kernel pmap before performing other tests. For the kernel pmap, which provides global mappings, the INVLPG must be used for invalidation always.
- Save the pre-computed pmap' %CR3 register in the struct pmap. This allows to remove several checks for pm_pcid validity when %CR3 is reloaded [2].
Noted by: gibbs [1] Discussed with: alc [2] Tested by: pho, flo Sponsored by: The FreeBSD Foundation
|
#
255060 |
|
30-Aug-2013 |
kib |
Implement support for the process-context identifiers ('PCID') on Intel CPUs. The feature tags TLB entries with the Id of the address space and allows to avoid TLB invalidation on the context switch, it is available only in the long mode. In the microbenchmarks, using the PCID decreased latency of the context switches by ~30% on SandyBridge class desktop CPUs, measured with the lat_ctx program from lmbench.
If available, use INVPCID instruction when a TLB entry in non-current address space needs to be invalidated. The instruction is typically available on the Haswell.
If needed, the use of PCID can be turned off with the vm.pmap.pcid_enabled loader tunable set to 0. The state of the feature is reported by the vm.pmap.pcid_enabled sysctl. The sysctl vm.pmap.pcid_save_cnt reports the number of context switches which avoided invalidating the TLB; compare with the total number of context switches, available as sysctl vm.stats.sys.v_swtch.
Sponsored by: The FreeBSD Foundation Reviewed by: alc Tested by: pho, bf
|
#
251720 |
|
13-Jun-2013 |
neel |
Remove unused macros PTESHIFT, PDESHIFT, PDPESHIFT and PML4ESHIFT.
Reviewed by: alc
|
#
250423 |
|
09-May-2013 |
dchagin |
Retire write-only PCB_GS32BIT pcb flag on amd64.
|
#
236772 |
|
08-Jun-2012 |
iwasaki |
Add x86/acpica/acpi_wakeup.c for amd64 and i386. Difference of suspend/resume procedures are minimized among them.
common: - Add global cpuset suspended_cpus to indicate APs are suspended/resumed. - Remove acpi_waketag and acpi_wakemap from acpivar.h (no longer used). - Add some variables in acpi_wakecode.S in order to minimize the difference among amd64 and i386. - Disable load_cr3() because now CR3 is restored in resumectx().
amd64: - Add suspend/resume related members (such as MSR) in PCB. - Modify savectx() for above new PCB members. - Merge acpi_switch.S into cpu_switch.S as resumectx().
i386: - Merge(and remove) suspendctx() into savectx() in order to match with amd64 code.
Reviewed by: attilio@, acpi@
|
#
230426 |
|
21-Jan-2012 |
kib |
Add support for the extended FPU states on amd64, both for native 64bit and 32bit ABIs. As a side-effect, it enables AVX on capable CPUs.
In particular:
- Query the CPU support for XSAVE, list of the supported extensions and the required size of FPU save area. The hw.use_xsave tunable is provided for disabling XSAVE, and hw.xsave_mask may be used to select the enabled extensions.
- Remove the FPU save area from PCB and dynamically allocate the (run-time sized) user save area on the top of the kernel stack, right above the PCB. Reorganize the thread0 PCB initialization to postpone it after BSP is queried for save area size.
- The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as well. FPU state is only useful for suspend, where it is saved in dynamically allocated suspfpusave area.
- Use XSAVE and XRSTOR to save/restore FPU state, if supported and enabled.
- Define new mcontext_t flag _MC_HASFPXSTATE, indicating that mcontext_t has a valid pointer to out-of-struct extended FPU state. Signal handlers are supplied with stack-allocated fpu state. The sigreturn(2) and setcontext(2) syscall honour the flag, allowing the signal handlers to inspect and manipilate extended state in the interrupted context.
- The getcontext(2) never returns extended state, since there is no place in the fixed-sized mcontext_t to place variable-sized save area. And, since mcontext_t is embedded into ucontext_t, makes it impossible to fix in a reasonable way. Instead of extending getcontext(2) syscall, provide a sysarch(2) facility to query extended FPU state.
- Add ptrace(2) support for getting and setting extended state; while there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries.
- Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to consumers, making it opaque. Internally, struct fpu_kern_ctx now contains a space for the extended state. Convert in-kernel consumers of fpu_kern KPI both on i386 and amd64.
First version of the support for AVX was submitted by Tim Bird <tim.bird am sony com> on behalf of Sony. This version was written from scratch.
Tested by: pho (previous version), Yamagi Burmeister <lists yamagi org> MFC after: 1 month
|
#
225475 |
|
11-Sep-2011 |
kib |
Perform amd64-specific microoptimizations for native syscall entry sequence. The effect is ~1% on the microbenchmark.
In particular, do not restore registers which are preserved by the C calling sequence. Align the jump target. Avoid unneeded memory accesses by calculating some data in syscall entry trampoline.
Reviewed by: jhb Approved by: re (bz) MFC after: 2 weeks
|
#
224187 |
|
18-Jul-2011 |
attilio |
- Remove the eintrcnt/eintrnames usage and introduce the concept of sintrcnt/sintrnames which are symbols containing the size of the 2 tables. - For amd64/i386 remove the storage of intr* stuff from assembly files. This area can be widely improved by applying the same to other architectures and likely finding an unified approach among them and move the whole code to be MI. More work in this area is expected to happen fairly soon.
No MFC is previewed for this patch.
Tested by: pluknet Reviewed by: jhb Approved by: re (kib)
|
#
221032 |
|
25-Apr-2011 |
rmacklem |
Fix the experimental NFS client so that it does not bogusly set the f_flags field of "struct statfs". This had the interesting effect of making the NFSv4 mounts "disappear" after r221014, since NFSMNT_NFSV4 and MNT_IGNORE became the same bit. Move the files used for a diskless NFS root from sys/nfsclient to sys/nfs in preparation for them to be used by both NFS clients. Also, move the declaration of the three global data structures from sys/nfsclient/nfs_vfsops.c to sys/nfs/nfs_diskless.c so that they are defined when either client uses them.
Reviewed by: jhb MFC after: 2 weeks
|
#
216634 |
|
21-Dec-2010 |
jkim |
Improve PCB flags handling and make it more robust. Add two new functions for manipulating pcb_flags. These inline functions are very similar to atomic_set_char(9) and atomic_clear_char(9) but without unnecessary LOCK prefix for SMP. Add comments about the rationale[1]. Use these functions wherever possible. Although there are some places where it is not strictly necessary (e.g., a PCB is copied to create a new PCB), it is done across the board for sake of consistency. Turn pcb_full_iret into a PCB flag as it is safe now. Move rarely used fields before pcb_flags and reduce size of pcb_flags to one byte. Fix some style(9) nits in pcb.h while I am in the neighborhood.
Reviewed by: kib Submitted by: kib[1] MFC after: 2 months
|
#
216253 |
|
07-Dec-2010 |
kib |
Retire write-only PCB_FULLCTX pcb flag on amd64.
Reminded by: Petr Salinger <Petr.Salinger seznam cz> Tested by: pho MFC after: 1 week
|
#
214631 |
|
01-Nov-2010 |
jhb |
Move <machine/apicreg.h> to <x86/apicreg.h>.
|
#
210780 |
|
02-Aug-2010 |
jkim |
Rearrange struct pcb. r177532 (CVS r1.64 of pcb.h) moved pcb_flags to make better use of cache lines by placing it before pcb_save (now pcb_user_save), which is moved to the end of pcb since r210777.
|
#
210777 |
|
02-Aug-2010 |
jkim |
- Merge savectx2() with savectx() and struct xpcb with struct pcb. [1] savectx() is only used for panic dump (dumppcb) and kdb (stoppcbs). Thus, saving additional information does not hurt and it may be even beneficial. Unfortunately, struct pcb has grown larger to accommodate more data. Move 512-byte long pcb_user_save to the end of struct pcb while I am here. - savectx() now saves FPU state unconditionally and copy it to the PCB of FPU thread if necessary. This gives panic dump and kdb a chance to take a look at the current FPU state even if the FPU is "supposedly" not used. - Resuming CPU now unconditionally reinitializes FPU. If the saved FPU state was irrelevant, it could be in an unknown state.
Suggested by: bde [1]
|
#
210614 |
|
29-Jul-2010 |
jkim |
Rename PCB_USER_FPU to PCB_USERFPU not to clash with a macro from fpu.h.
|
#
210514 |
|
26-Jul-2010 |
jkim |
Re-implement FPU suspend/resume for amd64. This removes superfluous uses of critical_enter(9) and critical_exit(9) by fpugetregs() and fpusetregs(). Also, we do not touch PCB flags any more.
MFC after: 1 month
|
#
195486 |
|
09-Jul-2009 |
kib |
Restore the segment registers and segment base MSRs for amd64 syscall return path only when neither thread was context switched while executing syscall code nor syscall explicitely modified LDT or MSRs.
Save segment registers in trap handlers before interrupts are enabled, to not allow context switches to happen before registers are saved. Use separated byte in pcb for indication of fast/full return, since pcb_flags are not synchronized with context switches.
The change puts back syscall microbenchmark numbers that were slowed down after commit of the support for LDT on amd64.
Reviewed by: jeff Tested (and tested, and tested ...) by: pho Approved by: re (kensmith)
|
#
195228 |
|
01-Jul-2009 |
dfr |
Don't include rpcv2.h - it has been removed.
Submitted by: ed@ Approved by: re
|
#
190629 |
|
01-Apr-2009 |
jkim |
Garbage collect unused MSR_GSBASE since r190620.
The only consumer was exception.S and specialreg.h is directly included now. Note no md5 changes were observed for all assym.s consumers with this.
|
#
190626 |
|
01-Apr-2009 |
jkim |
Garbage collect unused stack segment since r190620.
|
#
190620 |
|
01-Apr-2009 |
kib |
Save and restore segment registers on amd64 when entering and leaving the kernel on amd64. Fill and read segment registers for mcontext and signals. Handle traps caused by restoration of the invalidated selectors.
Implement user-mode creation and manipulation of the process-specific LDT descriptors for amd64, see sysarch(2).
Implement support for TSS i/o port access permission bitmap for amd64.
Context-switch LDT and TSS. Do not save and restore segment registers on the context switch, that is handled by kernel enter/leave trampolines now. Remove segment restore code from the signal trampolines for freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.
Implement amd64-specific compat shims for sysarch.
Linuxolator (temporary ?) switched to use gsbase for thread_area pointer.
TODO: Currently, gdb is not adapted to show segment registers from struct reg. Also, no machine-depended ptrace command is added to set segment registers for debugged process.
In collaboration with: pho Discussed with: peter Reviewed by: jhb Linuxolator tested by: dchagin
|
#
189903 |
|
16-Mar-2009 |
jkim |
Initial suspend/resume support for amd64.
This code is heavily inspired by Takanori Watanabe's experimental SMP patch for i386 and large portion was shamelessly cut and pasted from Peter Wemm's AP boot code.
|
#
185991 |
|
12-Dec-2008 |
jkoshy |
Expose symbol `PMC_FN_USER_CALLCHAIN' to assembler code.
|
#
182868 |
|
08-Sep-2008 |
kib |
The pcb_gs32p should be per-cpu, not per-thread pointer. This is location in GDT where the segment descriptor from pcb_gs32sd is copied, and the location is in GDT local to CPU.
Noted and reviewed by: peter MFC after: 1 week
|
#
180992 |
|
30-Jul-2008 |
kib |
Bring back the save/restore of the %ds, %es, %fs and %gs registers for the 32bit images on amd64.
Change the semantic of the PCB_32BIT pcb flag to request the context switch code to operate on the segment registers. Its previous meaning of saving or restoring the %gs base offset is assigned to the new PCB_GS32BIT flag.
FreeBSD 32bit image activator sets the PCB_32BIT flag, while Linux 32bit emulation sets PCB_32BIT | PCB_GS32BIT.
Reviewed by: peter MFC after: 2 weeks
|
#
179049 |
|
16-May-2008 |
attilio |
Removed unused assembly offsets for structures digging.
|
#
177533 |
|
23-Mar-2008 |
peter |
Export TDP_KTHREAD to asm files.
|
#
173855 |
|
23-Nov-2007 |
jkoshy |
MFP4: Add assembly language symbols used by hwpmc(4)'s callchain capture.
|
#
172212 |
|
17-Sep-2007 |
peter |
Fix an undefined symbol that as/ld neglected to flag as a problem. It was used in assembler code in such a way that no unresolved relocation records were generated, so ld didn't flag the problem. You can see this with an 'nm' of the kernel. There will be 'U MAXCPU' on SMP systems.
The impact of this is that the intrcount/intrnames arrays do not have the intended amount of space reserved. This could lead to interesting problems due to the arrays being present in the middle of kernel code. An overflow would be rather interesting as executable code would be used as per-cpu incrementing interrupt counters.
This fixes it for now by exporting MAXCPU to the assembler. A better fix might be to define these data structures in C - they're only referenced in the kernel from C code these days anyway.
Approved by: re (kensmith)
|
#
172207 |
|
17-Sep-2007 |
jeff |
- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM.
Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)
|
#
170368 |
|
06-Jun-2007 |
davidxu |
Backout experimental adaptive-spin umtx code.
|
#
170309 |
|
04-Jun-2007 |
jeff |
- Expose td_lock to assembly so it may be used in cpu_switch().
|
#
168035 |
|
29-Mar-2007 |
jkim |
MFP4: Linux set_thread_area syscall (aka TLS) support for amd64.
Initial version was submitted by Divacky Roman and mostly rewritten by me.
Tested by: emulation
|
#
165369 |
|
20-Dec-2006 |
davidxu |
Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allow us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stablize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance.
Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count.
Tested on: Athlon64 X2 3800+, Dual Xeon 5130
|
#
164760 |
|
30-Nov-2006 |
jb |
Turn console printf buffering into a kernel option and only on by default for sun4v where it is absolutely required.
This change moves the buffer from struct pcpu to the stack to avoid using the critical section which created a LOR in a couple of cases due to interaction with the tty code and kqueue. The LOR can't be fixed with the critical section and the pcpu buffer can't be used without the critical section.
Putting the buffer on the stack was my initial solution, but it was pointed out that the stress on the stack might cause problems depending on the call path. We don't have a way of creating tests for those possible cases, so it's best to leave this as an option for the time being. In time we may get enough data to enable this option more generally.
|
#
163858 |
|
01-Nov-2006 |
jb |
Add a cnputs() function to write a string to the console with a lock to prevent interspersed strings written from different CPUs at the same time.
To avoid putting a buffer on the stack or having to malloc one, space is incorporated in the per-cpu structure. The buffer size if 128 bytes; chosen because it's the next power of 2 size up from 80 characters.
String writes to the console are buffered up the end of the line or until the buffer fills. Then the buffer is flushed to all console devices.
Existing low level console output via cnputc() is unaffected by this change. ithread calls to log() are also unaffected to avoid blocking those threads.
A minor change to the behaviour in a panic situation is that console output will still be buffered, but won't be written to a tty as before. This should prevent interspersed panic output as a number of CPUs panic before we end up single threaded running ddb.
Reviewed by: scottl, jhb MFC after: 2 weeks
|
#
150647 |
|
27-Sep-2005 |
peter |
Kill pcb_rflags. It served no purpose.
Reported by: bde
|
#
149526 |
|
27-Aug-2005 |
jkoshy |
- Special-case NMI handling on the AMD64.
On entry or exit from the kernel the 'alltraps' and 'doreti' code used taken by normal traps disables interrupts to protect the critical sections where it is setting up %gs.
This protection is insufficient in the presence of NMIs since NMIs can be taken even when the processor has disabled normal interrupts. Thus the NMI handler needs to actually read MSR_GBASE on entry to the kernel to determine whether a swap of %gs using 'swapgs' is needed. However, reads of MSRs are expensive and integrating this check into the 'alltraps'/'doreti' path would penalize normal interrupts.
- Teach DDB about the 'nmi_calltrap' symbol.
Reviewed by: bde, peter (older versions of this change)
|
#
137917 |
|
20-Nov-2004 |
das |
Remove references to U area and garbage collect includes.
Reviewed by: arch@
|
#
129309 |
|
16-May-2004 |
peter |
Checkpoint some of what I was starting to tinker with for having some different context support for 32 vs 64 bit processes. This simply omits the save/restore of the segment selector registers for non 32 bit processes. This avoids the rdmsr/rwmsr juggling when restoring %gs clobbers the kernel msr that holds the gsbase.
However, I suspect it might be better to conditionally do this at user<->kernel transition where we wouldn't need to do the juggling in the first place. Or have per-thread extended context save/restore hooks.
|
#
127914 |
|
05-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
Approved by: core
|
#
125178 |
|
28-Jan-2004 |
peter |
Export PCB_DR* symbols
|
#
122940 |
|
21-Nov-2003 |
peter |
Cosmetic and/or trivial sync up with i386.
Approved by: re (rwatson)
|
#
122849 |
|
17-Nov-2003 |
peter |
Initial landing of SMP support for FreeBSD/amd64.
- This is heavily derived from John Baldwin's apic/pci cleanup on i386. - I have completely rewritten or drastically cleaned up some other parts. (in particular, bootstrap) - This is still a WIP. It seems that there are some highly bogus bioses on nVidia nForce3-150 boards. I can't stress how broken these boards are. I have a workaround in mind, but right now the Asus SK8N is broken. The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed. - Most of my testing has been with SCHED_ULE. SCHED_4BSD works. - the apic and acpi components are 'standard'. - If you have an nVidia nForce3-150 board, you are stuck with 'device atpic' in addition, because they somehow managed to forget to connect the 8254 timer to the apic, even though its in the same silicon! ARGH! This directly violates the ACPI spec.
|
#
120595 |
|
30-Sep-2003 |
jeff |
- Remove the definition for TD_SWITCHIN as it is not used.
Approved by: peter
|
#
118031 |
|
25-Jul-2003 |
obrien |
Use __FBSDID().
Brought to you by: a boring talk at Ottawa Linux Symposium
|
#
115251 |
|
23-May-2003 |
peter |
Major pmap rework to take advantage of the larger address space on amd64 systems. Of note: - Implement a direct mapped region using 2MB pages. This eliminates the need for temporary mappings when getting ptes. This supports up to 512GB of physical memory for now. This should be enough for a while. - Implement a 4-tier page table system. Most of the infrastructure is there for 128TB of userland virtual address space, but only 512GB is presently enabled due to a mystery bug somewhere. The design of this was heavily inspired by the alpha pmap.c. - The kernel is moved into the negative address space(!). - The kernel has 2GB of KVM available. - Provide a uma memory allocator to use the direct map region to take advantage of the 2MB TLBs. - Fixed some assumptions in the bus_space macros about the ability to fit virtual addresses in an 'int'.
Notable missing things: - pmap_growkernel() should be able to grow to 512GB of KVM by expanding downwards below kernbase. The kernel must be at the top 2GB of the negative address space because of gcc code generation strategies. - need to fix the >512GB user vm code.
Approved by: re (blanket)
|
#
115006 |
|
14-May-2003 |
peter |
Collect the nastiness for preserving the kernel MSR_GSBASE around the load_gs() calls into a single place that is less likely to go wrong.
Eliminate the per-process context switching of MSR_GSBASE, because it should be constant for a single cpu. Instead, save/restore it during the loading of the new %gs selector for the new process.
Approved by: re (amd64/* blanket)
|
#
114987 |
|
14-May-2003 |
peter |
Add BASIC i386 binary support for the amd64 kernel. This is largely stolen from the ia64/ia32 code (indeed there was a repocopy), but I've redone the MD parts and added and fixed a few essential syscalls. It is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic) and p4. The ia64 code has not implemented signal delivery, so I had to do that.
Before you say it, yes, this does need to go in a common place. But we're in a freeze at the moment and I didn't want to risk breaking ia64. I will sort this out after the freeze so that the common code is in a common place.
On the AMD64 side, this required adding segment selector context switch support and some other support infrastructure. The %fs/%gs etc code is hairy because loading %gs will clobber the kernel's current MSR_GSBASE setting. The segment selectors are not used by the kernel, so they're only changed at context switch time or when changing modes. This still needs to be optimized.
Approved by: re (amd64/* blanket)
|
#
114952 |
|
12-May-2003 |
peter |
For the page fault handler, save %cr2 in the outer trap handler so that we do not have to run so long with interrupts disabled. This involved creating tf_addr in the trapframe. Reorganize the trap stubs so that they consistently reserve the stack space and initialize any missing bits.
Approved by: re (amd64 stuff)
|
#
114928 |
|
12-May-2003 |
peter |
Give a %fs and %gs to userland. Use swapgs to obtain the kernel %GS.base value on entry and exit. This isn't as easy as it sounds because when we recursively trap or interrupt, we have to avoid duplicating the swapgs instruction or we end up back with the userland %gs. I implemented this by testing TF_CS to see if we're coming from supervisor mode already, and check for returning to supervisor. To avoid a race with interrupts in the brief period after beginning executing the handler and before the swapgs, convert all trap gates to interrupt gates, and reenable interrupts immediately after the swapgs. I am not happy with this. There are other possible ways to do this that should be investigated. (eg: storing the GS.base MSR value in the trapframe)
Add some sysarch functions to let the userland code get to this.
Approved by: re (blanket amd64/*)
|
#
114918 |
|
11-May-2003 |
peter |
Export PML4SHIFT and PDPSHIFT
Approved by: re (blanket amd64/*)
|
#
114349 |
|
30-Apr-2003 |
peter |
Commit MD parts of a loosely functional AMD64 port. This is based on a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to attempt to get a stable base to start from. There is a lot missing still. Worth noting: - The kernel runs at 1GB in order to cheat with the pmap code. pmap uses a variation of the PAE code in order to avoid having to worry about 4 levels of page tables yet. - It boots in 64 bit "long mode" with a tiny trampoline embedded in the i386 loader. This simplifies locore.s greatly. - There are still quite a few fragments of i386-specific code that have not been translated yet, and some that I cheated and wrote dumb C versions of (bcopy etc). - It has both int 0x80 for syscalls (but using registers for argument passing, as is native on the amd64 ABI), and the 'syscall' instruction for syscalls. int 0x80 preserves all registers, 'syscall' does not. - I have tried to minimize looking at the NetBSD code, except in a couple of places (eg: to find which register they use to replace the trashed %rcx register in the syscall instruction). As a result, there is not a lot of similarity. I did look at NetBSD a few times while debugging to get some ideas about what I might have done wrong in my first attempt.
|
#
113621 |
|
17-Apr-2003 |
jhb |
Remove a couple of unused symbols.
|
#
113339 |
|
10-Apr-2003 |
julian |
Move the _oncpu entry from the KSE to the thread. The entry in the KSE still exists but it's purpose will change a bit when we add the ability to lock a KSE to a cpu.
|
#
111372 |
|
23-Feb-2003 |
jake |
Previous commit missed a 1 that should be NGPTD, and an NPDEPG that should be NPDEPTD. Grumble.
Sponsored by: DARPA, Network Associates Laboratories
|
#
111363 |
|
23-Feb-2003 |
jake |
- Added macros NPGPTD, NBPTD, and NPDEPTD, for dealing with the size of the page directory. - Use these instead of the magic constants 1 or PAGE_SIZE where appropriate. There are still numerous assumptions that the page directory is exactly 1 page.
Sponsored by: DARPA, Network Associates Laboratories
|
#
111299 |
|
23-Feb-2003 |
jake |
- Added macros PDESHIFT and PTESHIFT, use these instead of magic constants in locore. - Removed the macros PTESIZE and PDESIZE, use sizeof instead in C.
Sponsored by: DARPA, Network Associates Laboratories
|
#
111032 |
|
17-Feb-2003 |
julian |
Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case.. I should have listenned to the other mind.
Submitted by: parts by davidxu@ Reviewed by: jeff@ mini@
|
#
110190 |
|
01-Feb-2003 |
julian |
Reversion of commit by Davidxu plus fixes since applied.
I'm not convinced there is anything major wrong with the patch but them's the rules..
I am using my "David's mentor" hat to revert this as he's offline for a while.
|
#
109877 |
|
26-Jan-2003 |
davidxu |
Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone.
A thread owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and takes those contexts back to userland. Any thread without upcall structure has to export their contexts and exit at user boundary.
Any thread running in user mode owns an upcall structure, when it enters kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in kernel, a new UPCALL thread is created and the upcall structure is transfered to the new UPCALL thread. if the kse mailbox's current thread pointer is NULL, then when a thread is blocked in kernel, no UPCALL thread will be created.
Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit, when all upcalls in ksegrp are removed, the group is atomatically shutdown. An upcall owner thread also exits when process is in exiting state. when an owner thread exits, the upcall it owns is also removed.
KSE is a pure scheduler entity. it represents a virtual cpu. when a thread is running, it always has a KSE associated with it. scheduler is free to assign a KSE to thread according thread priority, if thread priority is changed, KSE can be moved from one thread to another.
When a ksegrp is created, there is always N KSEs created in the group. the N is the number of physical cpu in the current system. This makes it is possible that even an userland UTS is single CPU safe, threads in kernel still can execute on different cpu in parallel. Userland calls kse_create to add more upcall structures into ksegrp to increase concurrent in userland itself, kernel is not restricted by number of upcalls userland provides.
The code hasn't been tested under SMP by author due to lack of hardware.
Reviewed by: julian
|
#
107521 |
|
02-Dec-2002 |
deischen |
Align the FPU state in the ucontext and sigcontext to 16 bytes to accomodate the new SSE/XMM floating point save/restore instructions.
This commit is mostly from bde and includes some style nits.
Approved by: re (jhb)
|
#
106542 |
|
06-Nov-2002 |
davidxu |
1.Fix smp race between kernel vm86 BIOS calling and userland vm86 mode code, remove global variable in_vm86call, set vm86 calling flag in PCB flags.
2.Fix vm86 BIOS calling preempted problem by changing vm86_lock mutex type from MTX_DEF to MTX_SPIN. vm86pcb is not remembered in thread struct, when the thread calling vm86 BIOS is preempted by interrupt thread, and later switching back to the thread would cause incorrect context be loaded into CPU registers, this leads to kernel crash.
|
#
105950 |
|
25-Oct-2002 |
peter |
Split 4.x and 5.x signal handling so that we can keep 4.x signal handling clean and functional as 5.x evolves. This allows some of the nasty bandaids in the 5.x codepaths to be unwound.
Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an anti-foot-shooting measure in place, 5.x folks need this for a while) and finish encapsulating the older stuff under COMPAT_43. Since the ancient stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *' to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn is supposed to take), add a compile time check to prevent foot shooting there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.
Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago). Approved by: re
|
#
103407 |
|
16-Sep-2002 |
mini |
Add kernel support needed for the KSE-aware libpthread: - Maintain fpu state across signals. - Use ucontext_t's to store KSE thread state. - Synthesize state for the UTS upon each upcall, rather than saving and copying a trapframe. - Save and restore FPU state properly in ucontext_t's.
Reviewed by: deischen, julian Approved by: -arch
|
#
99890 |
|
12-Jul-2002 |
dillon |
Re-enable the idle page-zeroing code. Remove all IPIs from the idle page-zeroing code as well as from the general page-zeroing code and use a lazy tlb page invalidation scheme based on a callback made at the end of mi_switch.
A number of people came up with this idea at the same time so credit belongs to Peter, John, and Jake as well.
Two-way SMP buildworld -j 5 tests (second run, after stabilization) 2282.76 real 2515.17 user 704.22 sys before peter's IPI commit 2266.69 real 2467.50 user 633.77 sys after peter's commit 2232.80 real 2468.99 user 615.89 sys after this commit
Reviewed by: peter, jhb Approved by: peter
|
#
99887 |
|
12-Jul-2002 |
jhb |
Set the thread state of the newly chosen to run thread to TDS_RUNNING in choosethread() in MI C code instead of doing it in in assembly in all the various cpu_switch() functions. This fixes problems on ia64 and sparc64.
Reviewed by: julian, peter, benno Tested on: i386, alpha, sparc64
|
#
99742 |
|
10-Jul-2002 |
dillon |
Remove the critmode sysctl - the new method for critical_enter/exit (already the default) is now the only method for i386.
Remove the paraphanalia that supported critmode. Remove td_critnest, clean up the assembly, and clean up (mostly remove) the old junk from cpu_critical_enter() and cpu_critical_exit().
|
#
99072 |
|
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
93264 |
|
27-Mar-2002 |
dillon |
Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt disablement assumptions in kern_fork.c by adding another API call, cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h to <arch>/<arch>/critical.c (stage-2 will clean this up).
Implement interrupt deferral for i386 that allows interrupts to remain enabled inside critical sections. This also fixes an IPI interlock bug, and requires uses of icu_lock to be enclosed in a true interrupt disablement.
This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized, and will move cpu_critical*() into its own header file(s) + other things. This commit may break non-i386 architectures in trivial ways. This should be temporary.
Reviewed by: core Approved by: core
|
#
91328 |
|
26-Feb-2002 |
dillon |
revert last commit temporarily due to whining on the lists.
|
#
91315 |
|
26-Feb-2002 |
dillon |
STAGE-1 of 3 commit - allow (but do not require) interrupts to remain enabled in critical sections and streamline critical_enter() and critical_exit().
This commit allows an architecture to leave interrupts enabled inside critical sections if it so wishes. Architectures that do not wish to do this are not effected by this change.
This commit implements the feature for the I386 architecture and provides a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For now you can turn the sysctl on and off at any time in order to test the architectural changes or track down bugs.
This commit is just the first stage. Some areas of the code, specifically the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will be cleaned up in the STAGE-2 commit when the critical_*() functions are moved entirely into MD files.
The following changes have been made:
* critical_enter() and critical_exit() for I386 now simply increment and decrement curthread->td_critnest. They no longer disable hard interrupts. When critical_exit() decrements the counter to 0 it effectively calls a routine to deal with whatever interrupts were deferred during the time the code was operating in a critical section.
Other architectures are unaffected.
* fork_exit() has been conditionalized to remove MD assumptions for the new code. Old code will still use the old MD assumptions in regards to hard interrupt disablement. In STAGE-2 this will be turned into a subroutine call into MD code rather then hardcoded in MI code.
The new code places the burden of entering the critical section in the trampoline code where it belongs.
* I386: interrupts are now enabled while we are in a critical section. The interrupt vector code has been adjusted to deal with the fact. If it detects that we are in a critical section it currently defers the interrupt by adding the appropriate bit to an interrupt mask.
* In order to accomplish the deferral, icu_lock is required. This is i386-specific. Thus icu_lock can only be obtained by mainline i386 code while interrupts are hard disabled. This change has been made.
* Because interrupts may or may not be hard disabled during a context switch, cpu_switch() can no longer simply assume that PSL_I will be in a consistent state. Therefore, it now saves and restores eflags.
* FAST INTERRUPT PROVISION. Fast interrupts are currently deferred. The intention is to eventually allow them to operate either while we are in a critical section or, if we are able to restrict the use of sched_lock, while we are not holding the sched_lock.
* ICU and APIC vector assembly for I386 cleaned up. The ICU code has been cleaned up to match the APIC code in regards to format and macro availability. Additionally, the code has been adjusted to deal with deferred interrupts.
* Deferred interrupts use a per-cpu boolean int_pending, and masks ipending, spending, and fpending. Being per-cpu variables it is not currently necessary to lock; bus cycles modifying them.
Note that the same mechanism will enable preemption to be incorporated as a true software interrupt without having to further hack up the critical nesting code.
* Note: the old critical_enter() code in kern/kern_switch.c is currently #ifdef to be compatible with both the old and new methodology. In STAGE-2 it will be moved entirely to MD code.
Performance issues:
One of the purposes of this commit is to enhance critical section performance, specifically to greatly reduce bus overhead to allow the critical section code to be used to protect per-cpu caches. These caches, such as Jeff's slab allocator work, can potentially operate very quickly making the effective savings of the new critical section code's performance very significant.
The second purpose of this commit is to allow architectures to enable certain interrupts while in a critical section. Specifically, the intention is to eventually allow certain FAST interrupts to operate rather then defer.
The third purpose of this commit is to begin to clean up the critical_enter()/critical_exit()/cpu_critical_enter()/ cpu_critical_exit() API which currently has serious cross pollution in MI code (in fork_exit() and ast() for example).
The fourth purpose of this commit is to provide a framework that allows kernel-preempting software interrupts to be implemented cleanly. This is currently used for two forward interrupts in I386. Other architectures will have the choice of using this infrastructure or building the functionality directly into critical_enter()/ critical_exit().
Finally, this commit is designed to greatly improve the flexibility of various architectures to manage critical section handling, software interrupts, preemption, and other highly integrated architecture-specific details.
|
#
90776 |
|
17-Feb-2002 |
deischen |
Use struct __ucontext in prototypes and associated functions instead of ucontext_t. Forward declare struct __ucontext in <sys/signal.h> and remove reliance on <sys/ucontext.h> being included.
While I'm here, also hide osigcontext types from userland; suggested by bde.
Namespace pollution noticed by: Kevin Day <toasty@shell.dragondata.com>
|
#
90344 |
|
07-Feb-2002 |
phk |
GC the PC_SWITCH* symbols which are not used in assembly anymore.
|
#
88088 |
|
17-Dec-2001 |
jhb |
Modify the critical section API as follows: - The MD functions critical_enter/exit are renamed to start with a cpu_ prefix. - MI wrapper functions critical_enter/exit maintain a per-thread nesting count and a per-thread critical section saved state set when entering a critical section while at nesting level 0 and restored when exiting to nesting level 0. This moves the saved state out of spin mutexes so that interlocking spin mutexes works properly. - Most low-level MD code that used critical_enter/exit now use cpu_critical_enter/exit. MI code such as device drivers and spin mutexes use the MI wrappers. Note that since the MI wrappers store the state in the current thread, they do not have any return values or arguments. - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
|
#
87702 |
|
11-Dec-2001 |
jhb |
Overhaul the per-CPU support a bit:
- The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list.
Tested on: alpha, i386 Reviewed by: peter, jake
|
#
85449 |
|
24-Oct-2001 |
jhb |
Split the per-process Local Descriptor Table out of the PCB and into struct mdproc.
Submitted by: Andrew R. Reiter <arr@watson.org> Silence on: -current
|
#
84615 |
|
07-Oct-2001 |
nyan |
Rewrite the pc98 bus_space stuff.
The type of bus_space_tag_t is now a pointer to bus_space_tag structure, and the bus_space_tag structure saves pointers to functions for direct access and relocate access.
Added bsh_bam member to the bus_space_handle structure, it saves access method either direct access or relocate access which is called by bus_space_* functions.
Added the mecia device support. If the bs_da and bs_ra in bus tag are set NEPC_io_space_tag and NEPC_mem_space_tag respectively, new bus_space stuff changes the register of mecia automatically for 16bit access.
Obtained from: NetBSD/pc98
|
#
83651 |
|
18-Sep-2001 |
peter |
Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
82309 |
|
25-Aug-2001 |
peter |
Optionize UPAGES for the i386. As part of this I split some of the low level implementation stuff out of machine/globaldata.h to avoid exposing UPAGES to lots more places. The end result is that we can double the kernel stack size with 'options UPAGES=4' etc.
This is mainly being done for the benefit of a MFC to RELENG_4 at some point. -current doesn't really need this so much since each interrupt runs on its own kstack.
|
#
79609 |
|
12-Jul-2001 |
peter |
Activate SSE/SIMD. This is the extra context switching support that we are required to do if we let user processes use the extra 128 bit registers etc.
This is the base part of the diff I got from: http://www.issei.org/issei/FreeBSD/sse.html I believe this is by: Mr. SUZUKI Issei <issei@issei.org> SMP support apparently by: Takekazu KATO <kato@chino.it.okayama-u.ac.jp> Test code by: NAKAMURA Kazushi <kaz@kobe1995.net>, see http://kobe1995.net/~kaz/FreeBSD/SSE.en.html
I have fixed a couple of style(9) deviations. I have some followup commits to fix a couple of non-style things.
|
#
76117 |
|
29-Apr-2001 |
grog |
Revert consequences of changes to mount.h, part 2.
Requested by: bde
|
#
75858 |
|
23-Apr-2001 |
grog |
Correct #includes to work with fixed sys/mount.h.
|
#
75256 |
|
06-Apr-2001 |
jhb |
Axe the per-cpu variable witness_spin_check as it was replaced by the per-cpu spinlocks list.
|
#
74902 |
|
28-Mar-2001 |
jhb |
Catch up to the mtx_saveintr -> mtx_savecrit change.
|
#
72930 |
|
22-Feb-2001 |
peter |
Activate USER_LDT by default. The new thread libraries are going to depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago.
Reviewed by: jhb, imp
|
#
72746 |
|
20-Feb-2001 |
jhb |
- Don't call clear_resched() in userret(), instead, clear the resched flag in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request. - While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch(). - Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.
|
#
72376 |
|
11-Feb-2001 |
jake |
Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
|
#
72276 |
|
10-Feb-2001 |
jhb |
- Make astpending and need_resched process attributes rather than CPU attributes. This is needed for AST's to be properly posted in a preemptive kernel. They are backed by two new flags in p_sflag: PS_ASTPENDING and PS_NEEDRESCHED. They are still accesssed by their old macros: aston(), astoff(), etc. For completeness, an astpending() macro has been added to check for a pending AST, and clear_resched() has been added to clear need_resched(). - Rename syscall2() on the x86 back to syscall() to be consistent with other architectures.
|
#
71817 |
|
30-Jan-2001 |
peter |
Remove unused GD_CPU_LOCKID, GD_OTHER_CPUS, PS_IDLESTACK and PS_IDLESTACK_TOP
|
#
71337 |
|
21-Jan-2001 |
jake |
Make intr_nesting_level per-process, rather than per-cpu. Setup interrupt threads to run with it always >= 1, so that malloc can detect M_WAITOK from "interrupt" context. This is also necessary in order to context switch from sched_ithd() directly.
Reviewed By: peter
|
#
71318 |
|
21-Jan-2001 |
jake |
Remove the per-cpu pages used for copy and zero-ing pages of memory for SMP; just use the same ones as UP. These weren't used without holding Giant anyway, and the routines that use them would have to be protected from pre-emption to avoid migrating cpus.
|
#
71294 |
|
20-Jan-2001 |
jake |
Rename the ASSYM MTX_RECURSE to MTX_RECURSECNT in order to not conflict with the flag of the same name.
|
#
71292 |
|
20-Jan-2001 |
jake |
Simplify the i386 asm MTX_{ENTER,EXIT} macros to just call the appropriate function, rather than doing a horse-and-buggy acquire. They now take the mutex type as an arg and can be used with sleep as well as spin mutexes.
|
#
70952 |
|
12-Jan-2001 |
jake |
Remove unused per-cpu variables inside_intr and ss_eflags.
|
#
70714 |
|
06-Jan-2001 |
jake |
Use %fs to access per-cpu variables in uni-processor kernels the same as multi-processor kernels. The old way made it difficult for kernel modules to be portable between uni-processor and multi-processor kernels. It is no longer necessary to jump through hoops.
- always load %fs with the private segment on entry to the kernel - change the type of the self referntial pointer from struct privatespace to struct globaldata - make the globaldata symbol have value 0 in all cases, so the symbols in globals.s are always offsets, not aliases for fields in globaldata - define the globaldata space used for uniprocessor kernels in C, rather than assembler - change the assmebly language accessors to use %fs, add a macro PCPU_ADDR(member, reg), which loads the register reg with the address of the per-cpu variable member
|
#
70006 |
|
14-Dec-2000 |
jake |
Use _lapic+offset to access the local apic from assembly language files, rather than the symbols in globals.s. The offsets are generated by genassym.
|
#
69919 |
|
12-Dec-2000 |
jhb |
Add in symbols needed in the WITNESS_ENTER and WITNESS_EXIT macros in i386/include/mutex.h.
|
#
67899 |
|
29-Oct-2000 |
phk |
Remove unneeded <stddef.h> #includes.
|
#
67358 |
|
20-Oct-2000 |
jhb |
- machine/mutex.h -> sys/mutex.h - Catch up to the MI mutex structure due to saveflags,saveipl,savepsr becoming saveintr.
|
#
67011 |
|
12-Oct-2000 |
bde |
Moved the definitions of AST_PENDING and AST_RESCHED to the correct place.
|
#
66716 |
|
06-Oct-2000 |
jhb |
- Change fast interrupts on x86 to push a full interrupt frame and to return through doreti to handle ast's. This is necessary for the clock interrupts to work properly. - Change the clock interrupts on the x86 to be fast instead of threaded. This is needed because both hardclock() and statclock() need to run in the context of the current process, not in a separate thread context. - Kill the prevproc hack as it is no longer needed. - We really need Giant when we call psignal(), but we don't want to block during the clock interrupt. Instead, use two p_flag's in the proc struct to mark the current process as having a pending SIGVTALRM or a SIGPROF and let them be delivered during ast() when hardclock() has finished running. - Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was broken on the x86 if it was turned on since cpl is gone. It's only use was to bogusly run softclock() directly during hardclock() rather than scheduling an SWI. - Remove the COM_LOCK simplelock and replace it with a clock_lock spin mutex. Since the spin mutex already handles disabling/restoring interrupts appropriately, this also lets us axe all the *_intr() fu. - Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use temporary fast interrupts for the APIC trial. - Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending signals in hardclock() that are to be delivered in ast().
Submitted by: jakeb (making statclock safe in a fast interrupt) Submitted by: cp (concept of delaying signals until ast())
|
#
65557 |
|
06-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
#
60041 |
|
05-May-2000 |
phk |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes.
Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
|
#
58717 |
|
28-Mar-2000 |
dillon |
Commit major SMP cleanups and move the BGL (big giant lock) in the syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it).
A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated.
Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion.
This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur.
This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later.
Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch
|
#
58345 |
|
20-Mar-2000 |
phk |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch were machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
|
#
55604 |
|
08-Jan-2000 |
bde |
Compile genassym.c with ordinary ${CFLAGS}. The (small) needs for ${GEN_CFLAGS} and -U_KERNEL became negative when all all the genassym.c's were converted to be cross-built.
Makefile.*: - Cleanups associated with the old genassym. - Fixed deprecated spelling of ${.IMPSRC} as "$<".
|
#
55545 |
|
07-Jan-2000 |
marcel |
Use genassym(1). The definitions of NKPDE and NKPT have been removed because they are already defined in pmap.h, resulting in duplicate definitions.
Reviewed by: bde
|
#
55205 |
|
29-Dec-1999 |
peter |
Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistant with the other BSD's who made this change quite some time ago. More commits to come.
|
#
54188 |
|
06-Dec-1999 |
luoqi |
User ldt sharing.
|
#
52140 |
|
11-Oct-1999 |
luoqi |
Add a per-signal flag to mark handlers registered with osigaction, so we can provide the correct context to each signal handler.
Fix broken sigsuspend(): don't use p_oldsigmask as a flag, use SAS_OLDMASK as we did before the linuxthreads support merge (submitted by bde).
Move ps_sigstk from to p_sigacts to the main proc structure since signal stack should not be shared among threads.
Move SAS_OLDMASK and SAS_ALTSTACK flags from sigacts::ps_flags to proc::p_flag. Move PS_NOCLDSTOP and PS_NOCLDWAIT flags from proc::p_flag to procsig::ps_flag.
Reviewed by: marcel, jdp, bde
|
#
51984 |
|
07-Oct-1999 |
marcel |
Simplification of the signal trampoline and other cleanups.
o Remove unused defines from genassym.c that were needed by the trampoline. o Add load_gs_param function to support.s that catches a fault when %gs is loaded with an invalid descriptor. The function returns EFAULT in that case. o Remove struct trapframe from mcontext_t and replace it with the list of registers. o Modify sendsig and sigreturn accordingly.
This commit contains a patch by bde.
Reviewed by: luoqi, jdp
|
#
51942 |
|
04-Oct-1999 |
marcel |
Re-introduction of sigcontext.
struct sigcontext and ucontext_t/mcontext_t are defined in such a way that both (ie struct sigcontext and ucontext_t) can be passed on to sigreturn. The signal handler is still given a ucontext_t for maximum flexibility.
For backward compatibility sigreturn restores the state for the alternate signal stack from sigcontext.sc_onstack and not from ucontext_t.uc_stack. A good way to determine which value the application has set and thus which value to use, is still open for discussion.
NOTE: This change should only affect those binaries that use sigcontext and/or ucontext_t. In the source tree itself this is only doscmd. Recompilation is required for those applications.
This commit also fixes a lot of style bugs without hopefully adding new ones.
NOTE: struct sigaltstack.ss_size now has type size_t again. For some reason I changed that into unsigned int.
Parts submitted by: bde sigaltstack bug found by: bde
|
#
51792 |
|
29-Sep-1999 |
marcel |
sigset_t change (part 3 of 5) -----------------------------
By introducing a new sigframe so that the signal handler operates on the new siginfo_t and on ucontext_t instead of sigcontext, we now need two version of sendsig and sigreturn.
A flag in struct proc determines whether the process expects an old sigframe or a new sigframe. The signal trampoline handles which sigreturn to call. It does this by testing for a magic cookie in the frame.
The alpha uses osigreturn to implement longjmp. This means that osigreturn is not only used for compatibility with existing binaries. To handle the new sigset_t, setjmp saves it in sc_reserved (see NOTE).
the struct sigframe has been moved from frame.h to sigframe.h to handle the complex header dependencies that was caused by the new sigframe.
NOTE: For the i386, the size of jmp_buf has been increased to hold the new sigset_t. On the alpha this has been prevented by using sc_reserved in sigcontext.
|
#
51065 |
|
07-Sep-1999 |
luoqi |
Save %gs in sigcontext when delivering a signal and restore them upon return (in signal trampoline code). I plan to do the same on -stable, so that we have a consistent interface to userland applications.
Reviewed by: bde
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
50037 |
|
18-Aug-1999 |
peter |
Update for MI switch code, and trim a heap of unused (I believe) entries.
|
#
49207 |
|
29-Jul-1999 |
peter |
GBIOSSTACK_SEL is undefined, but OTOH, BSSSEL apparently isn't used either.
|
#
49204 |
|
29-Jul-1999 |
msmith |
Remove some duplicate definitions, as suggested by Alan Cox.
|
#
48691 |
|
09-Jul-1999 |
jlemon |
Implement support for hardware debug registers on the i386.
Submitted by: Brian Dean <brdean@unx.sas.com>
|
#
48621 |
|
06-Jul-1999 |
cracauer |
Implement SA_SIGINFO for i386. Thanks to Bruce Evans for much more than a review, this was a nice puzzle.
This is supposed to be binary and source compatible with older applications that access the old FreeBSD-style three arguments to a signal handler.
Except those applications that access hidden signal handler arguments bejond the documented third one. If you have applications that do, please let me know so that we take the opportunity to provide the functionality they need in a documented manner.
Also except application that use 'struct sigframe' directly. You need to recompile gdb and doscmd. `make world` is recommended.
Example program that demonstrates how SA_SIGINFO and old-style FreeBSD handlers (with their three args) may be used in the same process is at http://www3.cons.org/tmp/fbsd-siginfo.c
Programs that use the old FreeBSD-style three arguments are easy to change to SA_SIGINFO (although they don't need to, since the old style will still work):
Old args to signal handler: void handler_sn(int sig, int code, struct sigcontext *scp)
New args: void handler_si(int sig, siginfo_t *si, void *third) where: old:code == new:second->si_code old:scp == &(new:si->si_scp) /* Passed by value! */
The latter is also pointed to by new:third, but accessing via si->si_scp is preferred because it is type-save.
FreeBSD implementation notes: - This is just the framework to make the interface POSIX compatible. For now, no additional functionality is provided. This is supposed to happen now, starting with floating point values. - We don't use 'sigcontext_t.si_value' for now (POSIX meant it for realtime-related values). - Documentation will be updated when new functionality is added and the exact arguments passed are determined. The comments in sys/signal.h are meant to be useful.
Reviewed by: BDE
|
#
48308 |
|
28-Jun-1999 |
peter |
Use the same -UKERNEL strategy as the alpha to avoid the inlines etc.
|
#
47678 |
|
01-Jun-1999 |
jlemon |
Unifdef VM86.
Reviewed by: silence on on -current
|
#
47081 |
|
12-May-1999 |
luoqi |
Unbreak VESA on SMP.
|
#
47080 |
|
12-May-1999 |
luoqi |
VM86_FRAMESIZE is now the size of vm86 frame, not the number of 4-byte words.
Requested by: Bruce
|
#
47022 |
|
11-May-1999 |
luoqi |
Do not hardcode size of struct vm86frame.
Submitted by: Jonathan Lemon <jlemon@americantv.com>
|
#
46129 |
|
27-Apr-1999 |
luoqi |
Enable vmspace sharing on SMP. Major changes are, - %fs register is added to trapframe and saved/restored upon kernel entry/exit. - Per-cpu pages are no longer mapped at the same virtual address. - Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages, per-cpu global variables are now accessed through this new selector (%fs). The selectors in gdt table are rearranged for cache line optimization. - fask_vfork is now on as default for both UP and SMP. - Some aio code cleanup.
Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>
|
#
45252 |
|
02-Apr-1999 |
alc |
Put in place the infrastructure for improved UP and SMP TLB management.
In particular, replace the unused field pmap::pm_flag by pmap::pm_active, which is a bit mask representing which processors have the pmap activated. (Thus, it is a simple Boolean on UPs.)
Also, eliminate an unnecessary memory reference from cpu_switch() in swtch.s.
Assisted by: John S. Dyson <dyson@iquest.net> Tested by: Luoqi Chen <luoqi@watermarkgroup.com>, Poul-Henning Kamp <phk@critter.freebsd.dk>
|
#
44327 |
|
28-Feb-1999 |
bde |
Removed all traces of `p_switchtime'. The relevant timestamp is per-cpu, not per-process. Keep it in `switchtime' consistently.
It is now clear that the timestamp is always valid in fork_trampoline() except when the child is running on a previously idle cpu, which can only happen if there are multiple cpus, so don't check or set the timestamp in fork_trampoline except in the (i386) SMP case. Just remove the alpha code for setting it unconditionally, since there is no SMP case for alpha and the code had rotted.
Parts reviewed by: dfr, phk
|
#
44215 |
|
22-Feb-1999 |
bde |
Added a per-cpu variable `switchticks' for use in scheduling.
|
#
40081 |
|
08-Oct-1998 |
msmith |
Fix up the kernel environment and module data pointers in the bootinfo if they are present. If we are told where the end of the loaded kernel image is, believe it.
|
#
39613 |
|
24-Sep-1998 |
bde |
Don't redefine kernel. Makefile.i386 now defines it. Removed some unused includes.
|
#
38422 |
|
18-Aug-1998 |
msmith |
Presently there is only one `currentldt' variable for all cpus in a SMP system. Unexpected things could happen if each cpu has a different ldt setting and one cpu tries to use value of currentldt set by another cpu.
The fix is to move currentldt to the per-cpu area. It includes patches I filed in PR i386/6219 which are also user ldt related.
PR: i386/7591, i386/6219 Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>
|
#
37564 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
Use offsetof() instead of null pointer hacks.
|
#
36441 |
|
28-May-1998 |
phk |
Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.
Clean up (or if antipodic: down) some of the msgbuf stuff.
Use an inline function rather than a macro for timecounter delta.
Maintain process "on-cpu" time as 64 bits of microseconds to avoid needless second rollover overhead.
Avoid calling microuptime the second time in mi_switch() if we do not pass through _idle in cpu_switch()
This should reduce our context-switch overhead a bit, in particular on pre-P5 and SMP systems.
WARNING: Programs which muck about with struct proc in userland will have to be fixed.
Reviewed, but found imperfect by: bde
|
#
36138 |
|
17-May-1998 |
tegge |
Change simple lock handling to not depend upon having a local apic available. The per-cpu variable ss_tpr has been replaced by ss_eflags. This reduced the number of interrupts sent to the wrong CPU, due to the cpu having the global lock being inside a critical region.
Remove some unneeded manipulation of tpr register in mplock.s.
Adjust code in mplock.s to be aware of variables on the stack being destroyed by MPgetlock if GRAB_LOPRIO is defined.
|
#
36135 |
|
17-May-1998 |
tegge |
Add forwarding of roundrobin to other cpus. This gives a more regular update of cpu usage as shown by top when one process is cpu bound (no system calls) while the system is otherwise idle (except for top).
Don't attempt to switch to the BSP in boot(). If the system was idle when an interrupt caused a panic, this won't work. Instead, switch to the BSP in cpu_reset.
Remove some spurious forward_statclock/forward_hardclock warnings.
|
#
36125 |
|
17-May-1998 |
tegge |
For SMP, use prv_PPAGE1/prv_PMAP1 instead of PADDR1/PMAP1. get_ptbase and pmap_pte_quick no longer generates IPIs. This should reduce the number of IPIs during heavy paging.
|
#
35087 |
|
06-Apr-1998 |
peter |
Fix VM86 compiles. a #include "opt_vm86.h" was missing, and the my_tr variable was needed in the non-SMP case.
Submitted by: Jonathan Lemon <jlemon@americantv.com>
|
#
35071 |
|
06-Apr-1998 |
peter |
Generate #defines that the asm code can access for the per-cpu data structures.
|
#
35029 |
|
04-Apr-1998 |
phk |
Time changes mark 2:
* Figure out UTC relative to boottime. Four new functions provide time relative to boottime.
* move "runtime" into struct proc. This helps fix the calcru() problem in SMP.
* kill mono_time.
* add timespec{add|sub|cmp} macros to time.h. (XXX: These may change!)
* nanosleep, select & poll takes long sleeps one day at a time
Reviewed by: bde Tested by: ache and others
|
#
32992 |
|
01-Feb-1998 |
bde |
Declare printf() instead of including <stdio.h>, so that this doesn't depend on anything outside of "sys".
Removed an unused include.
Don't use `extern' in a function declaration.
|
#
30273 |
|
10-Oct-1997 |
peter |
GPROC0_SEL isn't used in any *.s files it seems..
|
#
30265 |
|
10-Oct-1997 |
peter |
Convert the VM86 option from a global option to an option only depended on by the files that use it. Changing the VM86 option now only causes a recompile of a dozen files or so rather than the entire kernel.
|
#
27993 |
|
08-Aug-1997 |
dyson |
VM86 kernel support. Work done by BSDI, Jonathan Lemon <jlemon@americantv.com>, Mike Smith <msmith@gsoft.com.au>, Sean Eric Fagan <sef@kithrup.com>, and probably alot of others. Submitted by: Jnathan Lemon <jlemon@americantv.com>
|
#
26494 |
|
07-Jun-1997 |
bde |
Preserve %fs and %gs across context switches. This has a relatively low cost since it is only done in cpu_switch(), not for every exception. The extra state is kept in the pcb, and handled much like the npx state, with similar deficiencies (the state is not preserved across signal handlers, and error handling loses state).
|
#
25648 |
|
10-May-1997 |
bde |
Cleaned up #includes. Lite2 cleaned up <sys/mount.h> so no kludges are required for NFS now.
Ifdefed SMP #defines.
|
#
25164 |
|
26-Apr-1997 |
peter |
Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to come over the next few days.
The kernel should run pretty much "as before" without the options to activate SMP mode.
There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment.
This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
|
#
24691 |
|
07-Apr-1997 |
peter |
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!)
Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit.
Context switch the tss_esp0 pointer in the common tss. This is a lot simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset.
The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS).
Simplity the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measuably speed up fork().
The following parts came from John Dyson:
Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out.
Move the upages object (upobj) from the vmspace to the proc structure.
Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
|
#
24690 |
|
07-Apr-1997 |
peter |
No longer use an i386tss as the basis of our pcb - it wasn't particularly convenient and makes life difficult for my next commit. We still need an i386tss to point to for the tss slot in the gdt, so we use a common tss shared between all processes.
Note that this is going to break debugging until this series of commits is finished. core dumps will change again too. :-( we really need a more modern core dump format that doesn't depend on the pcb/upages.
This change makes VM86 mode harder, but the following commits will remove a lot of constraints for the VM86 system, including the possibility of extending the pcb for an IO port map etc.
Obtained from: bde
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
18892 |
|
12-Oct-1996 |
bde |
Removed nested include if <sys/socket.h> from <net/if.h> and <net/if_arp.h> and fixed the things that depended on it. The nested include just allowed unportable programs to compile and made my simple #include checking program report that networking code doesn't need to include <sys/socket.h>.
|
#
17371 |
|
31-Jul-1996 |
bde |
Eliminated pcb_inl. It was always 0 because context switches don't occur in interrupt handlers.
|
#
17366 |
|
31-Jul-1996 |
dg |
Converted timer/run queues to 4.4BSD queue style. Removed old and unused sleep(). Implemented wakeup_one() which may be used in the future to combat the "thundering herd" problem for some special cases.
Reviewed by: dyson
|
#
15565 |
|
02-May-1996 |
phk |
Move atdevbase out of locore.s and into machdep.c Macroize locore.s' page table setup even more, now it's almost readable. Rename PG_U to PG_A (so that I can...) Rename PG_u to PG_U. "PG_u" was just too ugly... Remove some unused vars in pmap.c Remove PG_KR and PG_KW Remove SSIZE Remove SINCR Remove BTOPKERNBASE
This concludes my spring cleaning, modulus any bug fixes for messes I have made on the way.
(Funny to be back here in pmap.c, that's where my first significant contribution to 386BSD was... :-)
|
#
15543 |
|
02-May-1996 |
phk |
removed: CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei() ptei() kvtopte() ptetov() ispt() ptetoav() &c &c new: NPDEPG
Major macro cleanup.
|
#
15231 |
|
13-Apr-1996 |
bde |
Generate #define of PCB_SAVEFPU_SIZE for use in savectx().
|
#
14836 |
|
27-Mar-1996 |
bde |
Eliminated dependency on opt_sysvipc.h.
|
#
14595 |
|
12-Mar-1996 |
dg |
Killed some historical #define cruft that we've never used in FreeBSD:
UDOT_SZ SYSPTSIZE USRPTSIZE MSGBUFPTECNT DMMIN DMMAX DMTEXT USRIOSIZE VM_PHYS_SIZE
|
#
13226 |
|
04-Jan-1996 |
wollman |
Convert SYSV IPC to new-style options. (I hope I got everything...) The LKMs will need an extra file, to come later.
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12607 |
|
03-Dec-1995 |
bde |
Completed function declarations and/or added prototypes.
|
#
10092 |
|
17-Aug-1995 |
dg |
Killed some unused stuff inherited from Bill Jolitz. Note that since this changes the size of the pcb struct, gdb will need to be rebuilt or debugging won't work correctly.
Reviewed by: Bruce Evans
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8748 |
|
25-May-1995 |
dg |
Made "NMBCLUSTERS" calculation dynamic and fixed bogus use of "NMBCLUSTERS" in machdep.c (it should use the global nmbclusters). Moved the calculation of nmbclusters into conf/param.c (same place where nmbclusters has always been assigned), and made the calculation include an extra amount based on "maxusers". NMBCLUSTERS can still be overrided in the kernel config file as always, but this change will make that generally unnecessary. This fixes the "bug" reports from people who have misconfigured kernels seeing the network hang when the mbuf cluster pool runs out.
Reviewed by: John Dyson
|
#
6377 |
|
14-Feb-1995 |
phk |
Removed a YF comment.
|
#
6354 |
|
14-Feb-1995 |
phk |
Yves has sent us a ~600 Kb patch, which shuts up gcc entirely for the entire kernel. Unfortunately we didn't send him a copy of the style guide before he did it. I'm trying to find all the benign and downright sound bits and will commit them without any other explanation than "YF fix" if they are merely cosmetic.
Reviewed by: phk Submitted by: yves@dutncp8.tn.tudelft.nl (Yves Fonk)
|
#
5908 |
|
25-Jan-1995 |
bde |
Load the kernel symbol table in the boot loader and not at compile time. (Boot with the -D flag if you want symbols.)
Make it easier to extend `struct bootinfo' without losing either forwards or backwards compatibility.
ddb_aout.c: Get the symbol table from wherever the loader put it. Nuke db_symtab[SYMTAB_SPACE].
boot.c: Enable loading of symbols. Align them on a page boundary. Add printfs about the symbol table sizes. Pass the memory sizes to the kernel. Fix initialization of `unit' (it got moved out of the loop). Fix adding the bss size (it got moved inside an ifdef). Initialize serial port when RB_SERIAL is toggled on. Fix comments. Clean up formatting of recently added code.
io.c: Clean up formatting of recently added code.
netboot/main.c, machdep.c, wd.c: Change names of bootinfo fields.
LINT: Nuke SYMTAB_SPACE. Fix comment about DODUMP.
Makefile.i386: Nuke use of dbsym. Exclude gcc symbols from kernel unless compiling with -g. Remove unused macro. Fix comments and formatting.
genassym.c: Generate defines for some new bootinfo fields. Change names of old ones.
locore.s: Copy only the valid part of the `struct bootinfo' passed by the loader. Reserve space for symbol table, if any.
machdep.c: Check the memory sizes passed by the loader, if any. Don't use them yet.
bootinfo.h: Add a size field so that we can resolve some mismatches between the loader bootinfo and the kernel boot info. The version number is not so good for this because of historical botches and because it's harder to maintain. Add memory size and symbol table fields. Change the names of everything.
Hacks to save a few bytes:
asm.S, boot.c, boot2.S: Replace `ouraddr' by `(BOOTSEG << 4)'.
boot.c: Don't statically initialize `loadflags' to 0. Disable the "REDUNDANT" code that skips the BIOS variables. Eliminate `total'. Combine some more printfs.
boot.h, disk.c, io.c, table.c: Move all statically initialzed data to table.c.
io.c: Don't put the A20 gate bits in a variable.
|
#
5770 |
|
21-Jan-1995 |
bde |
Remove unused definitions of vm statistics counters. Most of the counting is now done in C. There are still about 100 unused definitions for other things.
|
#
4929 |
|
03-Dec-1994 |
bde |
i386/exception.s, Keep track of interrupt nesting level. It is normally 0 for syscalls and traps, but is fudged to 1 for their exit processing in case they metamorphose into an interrupt handler.
i386/genassym.c; Remove support for the obsolete pcb_iml and pcb_cmap2.
Add support for pcb_inl.
i386/swtch.s: Fudge the interrupt nesting level across context switches and in the idle loop so that the work for preemptive context switches gets counted as interrupt time, the work for voluntary context switches gets counted mostly as system time (the part when curproc == 0 gets counted as interrupt time), and only truly idle time gets counted as idle time.
Remove obsolete support (commented out and otherwise) for pcb_iml.
Load curpcb just before curproc instead of just after so that curpcb is always valid if curproc is. A few more changes like this may fix tracing through context switches.
Remove obsolete function swtch_to_inactive().
include/cpu.h: Use the new interrupt nesting level variable to implement a non-fake CLF_INTR() so that accounting for the interrupt state works.
You can use top, iostat or (best) an up to date systat to see interrupt overheads. I see the expected huge interrupt overheads for ISA devices (on a 486DX/33, about 55% for an IDE drive transferring 1250K/sec and the same for a WD8013EBT network card transferring 1100K/sec). The huge interrupt overheads for serial devices are unfortunately normally invisible.
include/pcb.h: Remove the obsolete pcb_iml and pcb_cmap2. Replace them by padding to preserve binary compatibility.
Use part of the new padding for pcb_inl.
isa/icu.s: isa/vector.s: Keep track of interrupt nesting level.
|
#
4600 |
|
18-Nov-1994 |
phk |
Grap the bootinfo structure the bootblock passes us.
|
#
3919 |
|
26-Oct-1994 |
bde |
Fix compiler warnings.
|
#
3889 |
|
26-Oct-1994 |
jkh |
Fix two very minor nits, one of which caused a warning (no return type for main).
|
#
3612 |
|
15-Oct-1994 |
dg |
1) Some of the counters in the vmmeter struct don't fit well into the Mach VM scheme of things, so I've changed them to be more appropriate. page in/ous are now associated with the pager that did them. Nuked v_fault as the only fault of interest that wouldn't be already counted in v_trap is a VM fault, and this is counted seperately. 2) Implemented most of the remaining counters and corrected the counting of some that were done wrong. They are all almost correct now...just a few minor ones left to fix.
|
#
3451 |
|
09-Oct-1994 |
dg |
Got rid of map.h. It's a leftover from the rmap code, and we use rlists. Changed swapmap into swaplist.
|
#
3384 |
|
06-Oct-1994 |
rgrimes |
1. Eliminate unused esym global from locore, our boot code never supported that and when it does it will be done differently.
2. The kernel now does a frame setup on entry so it ``looks'' like a real function call. This will be needed by future boot code and debuggers.
3. Clean up stack offsets to all be in decimal and use %ebp when copying parameters in from the boot code.
4. Implement version 1 of the uniform boot code passing mechanism with support for kernelname passing and nfs_diskless structure passing.
5. Document the 3 different ways the kernel is called depending on what code is calling it.
|
#
3294 |
|
02-Oct-1994 |
rgrimes |
If you are building a kernel without NFS statically linked in you must #define NFS before including <sys/mount.h> to pick up some of the definitions needed for struct diskless. Be sure to undef it after this so you do not effect other code.
This is kinda sick, but it does the job. Problem found by davidg.
|
#
3291 |
|
02-Oct-1994 |
dg |
"idle priority" support. Based on code from Henrik Vestergaard Draboel, but substantially rewritten by me.
|
#
3283 |
|
01-Oct-1994 |
rgrimes |
Add code to generate NFSDISKLESS_SIZE for use in locore for copying the nfs_diskless structure in from the boot code. Reviewed by: davidg
|
#
2689 |
|
12-Sep-1994 |
dg |
Eliminated a whole pile of ancient (we're taking 4.3BSD) VM system related #define constants. Corrected incorrect VM_MAX_KERNEL_ADDRESS.
Reviewed by: John Dyson
|
#
2457 |
|
02-Sep-1994 |
dg |
Converted P_LINK -> P_FORW, P_RLINK -> P_BACK, minor optimization.
|
#
2441 |
|
01-Sep-1994 |
dg |
Realtime priority scheduling support.
Submitted by: Henrik Vestergaard Draboel
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
974 |
|
14-Jan-1994 |
dg |
"New" VM system from John Dyson & myself. For a run-down of the major changes, see the log of any effected file in the sys/vm directory (swap_pager.c for instance).
|
#
757 |
|
13-Nov-1993 |
dg |
First steps in rewriting locore.s, and making info useful when the machine panics.
i386/i386/locore.s: 1) got rid of most .set directives that were being used like #define's, and replaced them with appropriate #define's in the appropriate header files (accessed via genassym). 2) added comments to header inclusions and global definitions, and global variables 3) replaced some hardcoded constants with cpp defines (such as PDESIZE and others) 4) aligned all comments to the same column to make them easier to read 5) moved macro definitions for ENTRY, ALIGN, NOP, etc. to /sys/i386/include/asmacros.h 6) added #ifdef BDE_DEBUGGER around all of Bruce's debugger code 7) added new global '_KERNend' to store last location+1 of kernel 8) cleaned up zeroing of bss so that only bss is zeroed 9) fix zeroing of page tables so that it really does zero them all - not just if they follow the bss. 10) rewrote page table initialization code so that 1) works correctly and 2) write protects the kernel text by default 11) properly initialize the kernel page directory, upages, p0stack PT, and page tables. The previous scheme was more than a bit screwy. 12) change allocation of virtual area of IO hole so that it is fixed at KERNBASE + 0xa0000. The previous scheme put it right after the kernel page tables and then later expected it to be at KERNBASE +0xa0000 13) change multiple bogus settings of user read/write of various areas of kernel VM - including the IO hole; we should never be accessing the IO hole in user mode through the kernel page tables 14) split kernel support routines such as bcopy, bzero, copyin, copyout, etc. into a seperate file 'support.s' 15) split swtch and related routines into a seperate 'swtch.s' 16) split routines related to traps, syscalls, and interrupts into a seperate file 'exception.s' 17) remove some unused global variables from locore that got inserted by Garrett when he pulled them out of some .h files.
i386/isa/icu.s: 1) clean up global variable declarations 2) move in declaration of astpending and netisr
i386/i386/pmap.c: 1) fix calculation of virtual_avail. It previously was calculated to be right in the middle of the kernel page tables - not a good place to start allocating kernel VM. 2) properly allocate kernel page dir/tables etc out of kernel map - previously only took out 2 pages.
i386/i386/machdep.c: 1) modify boot() to print a warning that the system will reboot in PANIC_REBOOT_WAIT_TIME amount of seconds, and let the user abort with a key on the console. The machine will wait for ever if a key is typed before the reboot. The default is 15 seconds, but can be set to 0 to mean don't wait at all, -1 to mean wait forever, or any positive value to wait for that many seconds. 2) print "Rebooting..." just before doing it.
kern/subr_prf.c: 1) remove PANICWAIT as it is deprecated by the change to machdep.c
i386/i386/trap.c: 1) add table of trap type strings and use it to print a real trap/ panic message rather than just a number. Lot's of work to be done here, but this is the first step. Symbolic traceback is in the TODO.
i386/i386/Makefile.i386: 1) add support in to build support.s, exception.s and swtch.s
...and various changes to various header files to make all of the above happen.
|
#
608 |
|
15-Oct-1993 |
rgrimes |
genassym.c: Remove NKMEMCLUSTERS, it is no longer define or used.
locores.s: Fix comment on PTDpde and APTDpde to be pde instead of pte Add new equation for calculating location of Sysmap Remove Bill's old #ifdef garbage for counting up memory, that stuff will never be made to work and was just cluttering up the file.
Add code that places the PTD, page table pages, and kernel stack below the 640k ISA hole if there is room for it, otherwise put this stuff all at 1MB. This fixes the 28K bogusity in the boot blocks, that can now go away!
Fix the caclulation of where first is to be dependent on NKPDE so that we can skip over the above mentioned areas. The 28K thing is now 44K in size due to the increase in kernel virtual memory space, but since we no longer have to worry about that this is no big deal.
Use if NNPX > 0 instead of ifdef NPX for floating point code.
machdep.c Change the calculation of for the buffer cache to be 20% of all memory above 2MB and add back the upper limit of 2/5's of the VM_KMEM_SIZE so that we do not eat ALL of the kernel memory space on large memory machines, note that this will not even come into effect unless you have more than 32MB. The current buffer cache limit is 6.7MB due to this caclulation.
It seems that we where erroniously allocating bufpages pages for buffer_map. buffer_map is UNUSED in this implementation of the buffer cache, but since the map is referenced in several if statements a quick fix was to simply allocate 1 vm page (but no real memory) to it.
pmap.h Remove rcsid, don't want them in the kernel files!
Removed some cruft inside an #ifdef DEBUGx that caused compiler errors if you where compiling this for debug.
Use the #defines for PD_SHIFT and PG_SHIFT in place of constants.
trap.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Remove a now completly invalid check for a maximum virtual address, the virtual address now ends at 0xFFFFFFFF so there is no more MAX!! (Thanks David, I completly missed that one!)
vm_machdep.c Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Replace several 0xFE00000 constants with KERNBASE
|
#
590 |
|
12-Oct-1993 |
rgrimes |
Add Page Table Directory Indexes (NKPDE, KPTDI, PTDPTDI, APTDPTDI) to be used to replace more constants in locore.
|
#
567 |
|
10-Oct-1993 |
rgrimes |
Added PDRSHIFT and KERNSIZE so that the PDR offsets can be calculated in locore.s instead of being constants (3F8, 3FA).
|
#
556 |
|
08-Oct-1993 |
rgrimes |
All: removed patch kit headers and sccsids, add $Id$. This is a general clean up and reallignment with NetBSD-current where possible.
genassym.c: removed extranious include of reg.h removed old FP_* defines that have been ifdefed out since the patch kit removed PCB_SIGC that is not referenced anywhere add trapframe and sigframe defines add KERNBASE define for use in locore.s
locore.s: include npx.h and use NNPX for turning on and off FPU include machine/cputypes.h for the types of cpu (used in cpu_identify) change SYSPDREND to be one higher, this is really the base of the next area, and will be changing again next time I revise the file Reverse the NOP defines, you now get slow NOP's by default, this may be what is casuing us trouble with some systems. If you want the NOPS to be null you now need to have options DUMMY_NOPS. Now get esym from the boot blocks which don't pass it yet, and it is not used, but this will be changing. Move the bit_colors stuff to be in with the rest of Bruces SHOW_A_LOT things for debugging. Added NetBSD's CPU type probe code, we now know what type of CPU we are running on. Adjust kernel pde calcuation to correct for change in SYSPDREND, no longer need the +1.
machdep.c include npx.h and use NNPX for turning on and off FPU include isa.h, map.h(new file), exec.h in preperation for changes that are still in process. Add some of the code for MACHINE_NONCONTIG that will alow us to better map around the BIOS memory area. Now print the version, cpu id, real memory and availiable memory during boot. Correct the calculation of bufpages, the code was mixing pages and bytes, it now does the right things. Removed Bill's hack for limiting the erronous calculation. add the identifycpu print out code from NetBSD. remove the definition of the sigframe struct, it belongs in frame.h put in printf's about syncing disks on a halt/reboot. Change the halted message to be a little easier reading. Clean up of the dump messages, makes the source and the output much more readable. Change 0,0 in several places to have spaces after the commas.
|
#
5 |
|
12-Jun-1993 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r4, which included commits to RCS files with non-trunk default branches.
|
#
4 |
|
12-Jun-1993 |
rgrimes |
Initial import, 0.1 + pk 0.2.4-B1
|