#
362383 |
|
19-Jun-2020 |
kib |
MFC r362130: Control for Special Register Buffer Data Sampling mitigation.
|
#
354764 |
|
16-Nov-2019 |
scottl |
MFC r354759: TSX Asynchronous Abort mitigation for Intel CVE-2019-11135. This CVE has already been announced in FreeBSD SA-19:26.mcu.
Mitigation for TAA involves either turning off TSX or turning on the VERW mitigation used for MDS. Some CPUs will also be self-mitigating for TAA and require no software workaround.
Control knobs are:
machdep.mitigations.taa.enable:
  0 - no software mitigation is enabled
  1 - attempt to disable TSX
  2 - use the VERW mitigation
  3 - automatically select the mitigation based on processor features.
machdep.mitigations.taa.state:
  inactive - no mitigation is active/enabled
  TSX disable - TSX is disabled in the bare metal CPU as well as any virtualized CPUs
  VERW - VERW instruction clears CPU buffers
  not vulnerable - The CPU has identified itself as not being vulnerable
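As a rough illustration of how the two knobs relate, the Python sketch below maps a machdep.mitigations.taa.enable value to the string reported in machdep.mitigations.taa.state. It is a model, not the kernel code: the function name, its boolean parameters, the fallback to "inactive" when a requested mitigation is unavailable, and the ordering used for the automatic case (prefer disabling TSX, then VERW) are all assumptions. The commit message only says that mode 3 selects based on processor features.

```python
# Illustrative model only; names and the auto-selection policy are assumptions.
TAA_NONE, TAA_TSX_DISABLE, TAA_VERW, TAA_AUTO = 0, 1, 2, 3


def taa_state(enable, vulnerable=True, can_disable_tsx=False, has_verw=False):
    """Return the taa.state string for a given taa.enable setting."""
    if not vulnerable:
        # Self-mitigating CPUs need no software workaround.
        return "not vulnerable"
    if enable == TAA_AUTO:
        # Assumed policy: prefer turning TSX off, fall back to VERW.
        if can_disable_tsx:
            enable = TAA_TSX_DISABLE
        elif has_verw:
            enable = TAA_VERW
        else:
            enable = TAA_NONE
    if enable == TAA_TSX_DISABLE and can_disable_tsx:
        return "TSX disable"
    if enable == TAA_VERW and has_verw:
        return "VERW"
    # Mode 0, or a requested mitigation the CPU cannot provide (assumption).
    return "inactive"
```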
Nothing in the base FreeBSD system uses TSX. However, the instructions are straight-forward to add to custom applications and require no kernel support, so the mitigation is provided for users with untrusted applications and tenants.
Reviewed by: emaste, imp, kib, scottph Sponsored by: Intel Differential Revision: 22374
|
#
347700 |
|
16-May-2019 |
markj |
MFC r337715, r337751, r337754, r337758, r337813, r338354, r338687, r339124, r341821: Add support for boot-time Intel microcode loading.
|
#
347568 |
|
14-May-2019 |
kib |
MFC r347566: Mitigations for Microarchitectural Data Sampling.
Reference: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00233.html Security: CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091 Security: FreeBSD-SA-19:07.mds Reviewed by: jhb Tested by: emaste, lwhsu Approved by: so (gtetlow)
|
#
341491 |
|
04-Dec-2018 |
markj |
MFC r341442, r341443: Plug memory disclosures via ptrace(2).
|
#
338691 |
|
14-Sep-2018 |
jhb |
MFC 332454,334009,334122: Various fixes for x86 debug exceptions.
332454: Fix PSL_T inheritance on exec for x86.
The miscellaneous x86 sysent->sv_setregs() implementations tried to migrate PSL_T from the previous program to the new executed one, but they evaluated regs->tf_eflags after the whole regs structure was bzeroed. Make this functional by saving PSL_T value before zeroing.
Note that if the debugger is not attached, executing the first instruction in the new program with PSL_T set results in SIGTRAP, and since all intercepted signals are reset to the default disposition on exec(2), this means that a non-debugged process gets killed immediately if PSL_T is inherited. In particular, since suid images drop P_TRACED, an attempt to set PSL_T for execution of such a program would kill the process.
Another issue with userspace PSL_T handling is that it is reset by trap(). It is reasonable to clear PSL_T when entering SIGTRAP handler, to allow the signal to be handled without recursion or delivery of blocked fault. But it is not reasonable to return back to the normal flow with PSL_T cleared. This is too late to change, I think.
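The ordering bug fixed by r332454 can be modelled in a few lines of Python. This is a simplified sketch: the dictionary standing in for the trap frame and both function names are hypothetical, and the real sv_setregs() implementations set up more state than shown. PSL_T is the x86 trap flag, bit 8 of the flags register.

```python
PSL_T = 0x100  # x86 trap flag (TF), bit 8 of the flags register


def setregs_buggy(regs):
    """Zero the frame first, then try to migrate PSL_T: the flag is already gone."""
    for key in regs:
        regs[key] = 0                   # stands in for bzero() of the frame
    saved = regs["eflags"] & PSL_T      # always 0 at this point
    regs["eflags"] |= saved
    return regs


def setregs_fixed(regs):
    """Save PSL_T before zeroing, then restore it into the fresh frame."""
    saved = regs["eflags"] & PSL_T
    for key in regs:
        regs[key] = 0
    regs["eflags"] |= saved
    return regs
```

With a frame whose flags include PSL_T, the first variant always yields flags of zero, while the second carries the trap flag over to the new image.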
334009: Cleanups related to debug exceptions on x86.
- Add constants for fields in DR6 and the reserved fields in DR7. Use these constants instead of magic numbers in most places that use DR6 and DR7.
- Refer to T_TRCTRAP as "debug exception" rather than a "trace trap" as it is not just for trace exceptions.
- Always read DR6 for debug exceptions and only clear TF in the flags register for user exceptions where DR6.BS is set.
- Clear DR6 before returning from a debug exception handler as recommended by the SDM dating all the way back to the 386. This allows debuggers to determine the cause of each exception. For kernel traps, clear DR6 in the T_TRCTRAP case and pass DR6 by value to other parts of the handler (namely, user_dbreg_trap()). For user traps, wait until after trapsignal to clear DR6 so that userland debuggers can read DR6 via PT_GETDBREGS while the thread is stopped in trapsignal().
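To make the DR6 handling concrete, here is a small Python model. The bit positions are the architectural ones (B0..B3 in bits 0-3, BD in bit 13, BS in bit 14, BT in bit 15), but the constant and function names are illustrative and are not the identifiers added by the commit. Returning 0 for the cleared DR6 value is a simplification.

```python
# Architectural DR6 status bits; the Python names are illustrative only.
DR6_BD = 1 << 13   # debug register access detected
DR6_BS = 1 << 14   # single step
DR6_BT = 1 << 15   # task switch
PSL_T = 0x100      # trap flag in the flags register


def decode_dr6(dr6):
    """List the causes recorded in a DR6 value."""
    causes = [f"B{i}" for i in range(4) if dr6 & (1 << i)]
    if dr6 & DR6_BD:
        causes.append("BD")
    if dr6 & DR6_BS:
        causes.append("BS")
    if dr6 & DR6_BT:
        causes.append("BT")
    return causes


def handle_user_debug_exception(dr6, flags):
    """Return (new_flags, new_dr6) for a user-mode debug exception.

    TF is cleared only when DR6.BS says the exception was a single step,
    and DR6 is cleared before returning so the next exception's cause
    can be determined.
    """
    if dr6 & DR6_BS:
        flags &= ~PSL_T
    return flags, 0
```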
334122: x86: stop unconditionally clearing PSL_T on the trace trap.
We certainly should clear PSL_T when calling the SIGTRAP signal handler, which is already done by all x86 sendsig(9) ABI code. On the other hand, there is no obvious reason why PSL_T needs to be cleared when returning from the signal handler. For instance, Linux allows userspace to set PSL_T and keep tracing enabled for the desired period. There are userspace programs which would use PSL_T if we make it possible, for instance sbcl.
Remember whether PSL_T was set by PT_STEP or PT_SETSTEP by means of the TDB_STEP flag, and only clear it when the flag is set.
|
#
337262 |
|
03-Aug-2018 |
markj |
MFC r336505, r336764: Have preload_delete_name() free pages backing preloaded data.
|
#
336963 |
|
31-Jul-2018 |
kib |
MFC r336683: Extend ranges of the critical sections to ensure that context switch code never sees FPU pcb flags not consistent with the hardware state.
|
#
335570 |
|
22-Jun-2018 |
kib |
MFC r333059 (by tychon): Expand the checks for UCR3 == PMAP_NO_CR3 to enable processes to be excluded from PTI.
|
#
334152 |
|
24-May-2018 |
kib |
MFC r334004: Add Intel Spec Store Bypass Disable control.
This also includes the i386/include/pcpu.h part of the r334018.
Security: CVE-2018-3639 Approved by: re (gjb)
|
#
333720 |
|
17-May-2018 |
kib |
MFC r333228: Implement support for ifuncs in the kernel linker on x86.
MFC r333411: Avoid calls to bzero() before ireloc
Approved by: re (marius)
|
#
333369 |
|
08-May-2018 |
emaste |
MFC r333368: Prepare DB# handler for deferred trigger of watchpoints.
Since pop %ss/mov %ss instructions defer all interrupts and exceptions for the next instruction, it is possible that the userspace watchpoint trap executes on the first instruction of the kernel entry for syscall/bpt.
In this case, DB# should be treated similarly to NMI: on amd64 we must always load GSBASE even if the trap comes from kernel mode, and load the kernel page table root into %cr3. Moreover, the trap must use the dedicated stack, because we are still on the user stack when trapped on syscall entry.
For i386, we must reload %cr3. The syscall instruction is not configured, so there is no issue with executing on user stack when trapping.
Due to some CPU errata it is not always possible to detect that the userspace watchpoint triggered by inspecting %dr6. In trap(), compare the trap %rip with the known unsafe entry points and, if matched, pretend that the watchpoint did not fire at all.
Thank you to the MSRC Incident Response Team, and in particular Greg Lenti and Nate Warfield, for coordinating the response to this issue across multiple vendors.
Thanks to Computer Recycling at The Working Center of Kitchener for making hardware available to allow us to test the patch on additional CPU families.
Reviewed by: jhb Discussed with: Matthew Dillon Tested by: emaste Approved by: re (so blanket) Security: CVE-2018-8897 Security: FreeBSD-SA-18:06.debugreg Sponsored by: The FreeBSD Foundation
|
#
332427 |
|
12-Apr-2018 |
kib |
MFC r332060: Make the INTO instruction operational in 32bit mode.
|
#
331722 |
|
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re)
|
#
330897 |
|
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg
|
#
329462 |
|
17-Feb-2018 |
kib |
MFC r328083,328096,328116,328119,328120,328128,328135,328153,328157, 328166,328177,328199,328202,328205,328468,328470,328624,328625,328627, 328628,329214,329297,329365:
Meltdown mitigation by PTI, PCID optimization of PTI, and kernel use of IBRS for some mitigations of Spectre.
Tested by: emaste, Arshan Khanifar <arshankhanifar@gmail.com> Discussed with: jkim Sponsored by: The FreeBSD Foundation
|
#
328123 |
|
18-Jan-2018 |
kib |
MFC r327818: Move the hardware setup for fast syscalls into a common function.
|
#
327871 |
|
12-Jan-2018 |
kib |
MFC r327597: Make it possible to re-evaluate cpu_features.
|
#
325542 |
|
08-Nov-2017 |
kib |
MFC r325270: Consistently ensure that we do not load MXCSR with reserved bits set.
|
#
323431 |
|
11-Sep-2017 |
kib |
MFC r322762, r322799, r322832, r322833: Make WRFSBASE and WRGSBASE instructions functional.
Bump stable/11 __FreeBSD_version.
|
#
322490 |
|
14-Aug-2017 |
sephe |
MFC 322323 by jkim
Split identify_cpu() into two functions for amd64 as we do for i386. This reduces diff between amd64 and i386. Also, it fixes a regression introduced in r322076, i.e., identify_hypervisor() failed to identify some hypervisors. This function assumes cpu_feature2 is already initialized.
Reported by: dexuan Tested by: dexuan
|
#
322204 |
|
07-Aug-2017 |
jkim |
MFC: r322076
Detect hypervisor early so that we set lower hz on it.
|
#
317004 |
|
16-Apr-2017 |
mmel |
MFC r303261,r315059:
r303261: Add more UEFI/e820 memory types from latest specifications.
r315059: Split overbloated machdep.c to multiple files and do basic cleanup of these fragments.
|
#
314844 |
|
07-Mar-2017 |
kib |
MFC r314429: Initialize pcb_save for thread0.
|
#
310659 |
|
28-Dec-2016 |
kib |
MFC r304957, r304958, r306310 (by bde): Fix vm86 initialization.
MFC r310050: Improve very early trap handling on amd64.
|
#
308433 |
|
07-Nov-2016 |
jhb |
MFC 305836: Remove 'cpu' and 'cpu_class' on amd64.
The 'cpu' and 'cpu_class' variables were always set to the same value on amd64 and are legacy holdovers from i386. Remove them entirely on amd64.
Requested by: kib (MFC)
|
#
306405 |
|
28-Sep-2016 |
kib |
MFC r306092: Rename efi_systbl to efi_systbl_phys.
|
#
306316 |
|
25-Sep-2016 |
kib |
MFC r305942: Consolidate four efi_next_descriptor() definitions.
|
#
306078 |
|
21-Sep-2016 |
kib |
MFC r305939: Remove trailing space.
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
#
298308 |
|
19-Apr-2016 |
pfg |
X86: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep.
Discussed in: freebsd-current
|
#
294930 |
|
27-Jan-2016 |
jhb |
Convert ss_sp in stack_t and sigstack to void *.
POSIX requires these members to be of type void * rather than the char * inherited from 4BSD. NetBSD and OpenBSD both changed their fields to void * back in 1998. No new build failures were reported via an exp-run.
PR: 206503 (exp-run) Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D5092
|
#
293045 |
|
02-Jan-2016 |
ian |
Make the 'env' directive described in config(5) work on all architectures, providing compiled-in static environment data that is used instead of any data passed in from a boot loader.
Previously 'env' worked only on i386 and arm xscale systems, because it required the MD startup code to examine the global envmode variable and decide whether to use static_env or an environment obtained from the boot loader, and set the global kern_envp accordingly. Most startup code wasn't doing so. Making things even more complex, some mips startup code uses an alternate scheme that involves calling init_static_kenv() to pass an empty buffer and its size, then uses a series of kern_setenv() calls to populate that buffer.
Now all MD startup code calls init_static_kenv(), and that routine provides a single point where envmode is checked and the decision is made whether to use the compiled-in static_kenv or the values provided by the MD code.
The routine also continues to serve its original purpose for mips; if a non-zero buffer size is passed the routine installs the empty buffer ready to accept kern_setenv() values. Now if the size is zero, the provided buffer full of existing env data is installed. A NULL pointer can be passed if the boot loader provides no env data; this allows the static env to be installed if envmode is set to do so.
Most of the work here is a near-mechanical change to call the init function instead of directly setting kern_envp. A notable exception is in xen/pv.c; that code was originally installing a buffer full of preformatted env data along with its non-zero size (like mips code does), which would have allowed kern_setenv() calls to wipe out the preformatted data. Now it passes a zero for the size so that the buffer of data it installs is treated as non-writeable.
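The decision made inside init_static_kenv() can be summarised with a small Python model. It is only a sketch of the behaviour described above: the parameter names, the boolean use_static standing in for the envmode check, the returned (environment, writable) pair, and the order in which the cases are tested are all assumptions made for illustration.

```python
def init_static_kenv(buf, size, use_static, static_env):
    """Model of the choice described above; returns (environment, writable).

    buf        -- data handed over by MD startup code, or None
    size       -- non-zero: buf is an empty buffer to be filled later;
                  zero: buf already holds preformatted env data
    use_static -- stands in for "envmode is set to use the static env"
    static_env -- the compiled-in environment
    """
    if use_static:
        # Compiled-in data wins over anything the boot loader passed.
        return static_env, False
    if buf is None:
        # No env data from the loader and no static env requested.
        return None, False
    if size != 0:
        # mips-style: empty buffer, ready to accept kern_setenv() values.
        return buf, True
    # Preformatted data from the loader (or xen/pv.c): treated as read-only.
    return buf, False
```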
|
#
292472 |
|
19-Dec-2015 |
imp |
Save the physical address passed into the kernel of the UEFI system table.
|
#
291948 |
|
07-Dec-2015 |
kib |
Use ANSI C definition.
MFC after: 1 week
|
#
287000 |
|
21-Aug-2015 |
royger |
preload_search_info: make sure mod is set
Add a check to preload_search_info to make sure mod is set. Most of the callers of preload_search_info don't check that the mod parameter is set, which can cause page faults. While at it, remove some now unnecessary checks before calling preload_search_info.
Sponsored by: Citrix Systems R&D Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D3440
|
#
286667 |
|
12-Aug-2015 |
marcel |
Better support memory mapped console devices, such as VGA and EFI frame buffers and memory mapped UARTs.
1. Delay calling cninit() until after pmap_bootstrap(). This makes sure we have PMAP initialized enough to add translations. Keep kdb_init() after cninit() so that we have console when we need to break into the debugger on boot.
2. Unfortunately, the ATPIC code had to be moved as well so as to avoid a spurious trap #30. The reason for which is not known at this time.
3. In pmap_mapdev_attr(), when we need to map a device prior to the VM system being initialized, use virtual_avail as the KVA to map the device at. In particular, avoid using the direct map on amd64 because we can't demote by virtue of not being able to allocate yet. Keep track of the translation. Re-use the translation after the VM has been initialized to not waste KVA and to satisfy the assumption in uart(4) that the handle returned for the low-level console is the same as later returned when the device is probed and attached.
4. In pmap_unmapdev() remove the mapping from the table when called pre-init. Otherwise keep the mapping. During bus probe and attach device resources are mapped and unmapped multiple times, which would have us destroy the mapping used by the low-level console.
5. In pmap_init(), set pmap_initialized to signal that we're not pre-init anymore. On amd64, bring the direct map in sync with the translations created at that time.
6. Implement bus_space_map() and bus_space_unmap() for real: when the tag corresponds to memory space, call the corresponding pmap_mapdev() and pmap_unmapdev() functions to construct an actual handle.
7. In efifb.c and vt_vga.c, remove the crutches and hacks and simply call pmap_mapdev_attr() or bus_space_map() as desired.
Notes:
1. uart(4) already used bus_space_map() during low-level console setup but since serial ports have traditionally been I/O port based, the lack of a proper implementation for said function was not a problem. It has always supported memory mapped UARTs for low-level consoles by setting hw.uart.console accordingly.
2. The use of the direct map on amd64 without setting caching attributes has been a bigger problem than previously thought. This change has the fortunate (and unexpected) side-effect of fixing various EFI frame buffer problems (though not all).
PR: 191564, 194952
Special thanks to:
1. XipLink, Inc -- generously donated an Intel Bay Trail E3800 based eval board (ADLE3800PC).
2. The FreeBSD Foundation, in particular emaste@ -- for UEFI support in general and testing.
3. Everyone who tested the proposed patch for PR 191564.
4. jhb@ and kib@ for being a soundboard and applying a clue bat if so needed.
|
#
286584 |
|
10-Aug-2015 |
kib |
Make kstack_pages a tunable on arm, x86, and powerpc. On i386, the initial thread stack is not adjusted by the tunable, since the stack is allocated too early to get access to the kernel environment. See TD0_KSTACK_PAGES for the thread0 stack sizing on i386.
The tunable was tested on x86 only. From visual inspection, it seems that it might work on arm and powerpc. The arm USPACE_SVC_STACK_TOP and powerpc USPACE macros seem to be already incorrect for threads with a non-default kstack size. I only changed the macros to use a variable instead of a constant, since I cannot test.
On arm64, mips and sparc64, some static data structures are sized by KSTACK_PAGES, so the tunable is disabled.
Sponsored by: The FreeBSD Foundation MFC after: 2 week
|
#
285783 |
|
21-Jul-2015 |
jhb |
Various changes to the registers displayed in DDB for x86.
- Fix segment registers to only display the low 16 bits.
- Remove unused handlers and entries for the debug registers.
- Display xcr0 (if valid) in 'show sysregs'.
- Add '0x' prefix to MSR values to match other values in 'show sysregs'.
- MFamd64: Display various MSRs in 'show sysregs'.
- Add a 'show dbregs' to display the value of debug registers.
- Dynamically size the column width for register values to properly align columns on 64-bit platforms.
- Display %gs for i386 in 'show registers'.
Differential Revision: https://reviews.freebsd.org/D2784 Reviewed by: kib, markj MFC after: 2 weeks
|
#
283479 |
|
24-May-2015 |
dchagin |
The kernel sends signals to processes via the ABI-specific sv_sendsig method. The native ABI does not need signal conversion; only emulators may want this. Usually an emulator implements its own sv_sendsig method. For now, only the ibcs2 emulator lacks its own sv_sendsig implementation and depends on the native sendsig() method. So, remove any extra attempts to convert signal numbers from the native sendsig() methods, except on i386, where ibcs2 lives.
|
#
282684 |
|
09-May-2015 |
kib |
Rewrite the amd64 PCID implementation to follow an algorithm described in Vahalia's "Unix Internals", section 15.12 "Other TLB Consistency Algorithms". The same algorithm is already utilized by the MIPS pmap to handle ASIDs.
The PCID for the address space is now allocated per-cpu during context switch to the thread using pmap, when no PCID on the cpu was ever allocated, or the current PCID is invalidated. If the PCID is reused, bit 63 of %cr3 can be set to avoid TLB flush.
Each cpu has a PCID algorithm generation count, which is saved in the pmap pcpu block when the pcpu PCID is allocated. On invalidation, the pmap generation count is zeroed, which signals the context switch code that the already allocated PCID is no longer valid. The implication is the TLB shootdown for the given cpu/address space, due to the allocation of a new PCID.
The pm_save mask no longer has to be tracked, which (significantly) reduces the targets of the TLB shootdown IPIs. Previously, pm_save was reset only on pmap_invalidate_all(), which made it accumulate the cpuids of all processors on which the thread was scheduled between full TLB shootdowns.
Besides reducing the amount of TLB shootdowns and removing atomics to update pm_save in the context switch code, the algorithm is much simpler than the maintenance of pm_save and selection of the right address space in the shootdown IPI handler.
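The generation-count scheme described above can be sketched as a small Python simulation. This is a model under stated assumptions, not the pmap code: the class and function names are hypothetical, PCID 0 is assumed to be reserved, and bumping the per-cpu generation when the 12-bit PCID space runs out is assumed from the cited algorithm rather than stated in the commit message.

```python
PCID_KERN = 0          # assumption: PCID 0 is reserved and never handed out
PCID_OVERMAX = 0x1000  # PCIDs are 12 bits wide


class Cpu:
    def __init__(self):
        self.gen = 1                    # 0 is reserved to mean "invalidated"
        self.next_pcid = PCID_KERN + 1


class Pmap:
    def __init__(self, ncpus):
        self.pcid = [None] * ncpus      # per-cpu PCID for this address space
        self.gen = [0] * ncpus          # per-cpu generation at allocation time


def pmap_invalidate(pmap, cpuid):
    """Zero the saved generation so the next switch allocates a fresh PCID."""
    pmap.gen[cpuid] = 0


def activate(cpu, cpuid, pmap):
    """Context switch to pmap on cpu; return (pcid, tlb_flush_needed)."""
    if pmap.gen[cpuid] == cpu.gen:
        # PCID still valid: reuse it without flushing (bit 63 of %cr3 set).
        return pmap.pcid[cpuid], False
    if cpu.next_pcid == PCID_OVERMAX:
        # PCID space exhausted: start a new generation (assumed behaviour).
        cpu.gen += 1
        cpu.next_pcid = PCID_KERN + 1
    pmap.pcid[cpuid] = cpu.next_pcid
    cpu.next_pcid += 1
    pmap.gen[cpuid] = cpu.gen
    return pmap.pcid[cpuid], True
```

In this model an invalidation touches only the pmap's per-cpu generation slot, which is why no mask of cpus comparable to pm_save has to be maintained.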
Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
281851 |
|
22-Apr-2015 |
kib |
Move some common code from sys/amd64/amd64/machdep.c and sys/i386/i386/machdep.c to new file sys/x86/x86/cpu_machdep.c. Most of the code is related to the idle handling.
Discussed with: pluknet Sponsored by: The FreeBSD Foundation
|
#
281762 |
|
20-Apr-2015 |
kib |
Remove duplicate definitions of MWAIT_CX hints. Identical defines in specialreg.h are enough.
Discussed with: mav Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
280781 |
|
28-Mar-2015 |
kib |
Make it possible for the signal handler to act on #ss. Load the canonical user data segment selector into %ss when calling the handler.
Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
278001 |
|
31-Jan-2015 |
kib |
Do not qualify the mcontext_t *mcp argument for set_mcontext(9) as const. On x86, even after the machine context is supposedly read into the struct ucontext, lazy FPU state save code might only mark the FPU data as hardware-owned. Later, set_fpcontext() needs to fetch the state from hardware, modifying the *mcp.
The set_mcontext(9) is called from sigreturn(2) and setcontext(2) implementations and old create_thread(2) interface, which throw the *mcp out after the set_mcontext() call.
Reported by: dim Discussed with: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
277735 |
|
26-Jan-2015 |
royger |
amd64: allow base memory segment to start at address different than 0
Current code requires that the first physical memory segment starts at 0, but this is not really needed. We only need to make sure the bootstrap code and page tables for APs are allocated below 4GB.
This patch removes this requirement and allows booting a Dell R710 from UEFI, where the first physical memory segment starts at 0x10000.
Sponsored by: Citrix Systems R&D Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D1417
|
#
277713 |
|
25-Jan-2015 |
jhb |
If the boot-time memory test is enabled, output a dot ('.') for each GB of RAM tested so people watching the console can see that the machine is making progress and not hung.
PR: 196650 Submitted by: Ravi Pokala <rpokala@panasas.com> Suggestions from: Eric van Gyzen <eric@vangyzen.net> MFC after: 2 weeks
|
#
276724 |
|
05-Jan-2015 |
jhb |
On some Intel CPUs with a P-state but not C-state invariant TSC the TSC may also halt in C2 and not just C3 (it seems that in some cases the BIOS advertises its C3 state as a C2 state in _CST). Just play it safe and disable both C2 and C3 states if a user forces the use of the TSC as the timecounter on such CPUs.
PR: 192316 Differential Revision: https://reviews.freebsd.org/D1441 No objection from: jkim MFC after: 1 week
|
#
273174 |
|
16-Oct-2014 |
davide |
Follow up to r225617. In order to maximize the re-usability of kernel code in userland, rename the in-kernel getenv()/setenv() to kern_getenv()/kern_setenv(). This fixes a namespace collision with libc symbols.
Submitted by: kmacy Tested by: make universe
|
#
272310 |
|
30-Sep-2014 |
royger |
msi: add Xen MSI implementation
This patch adds support for MSI interrupts when running on Xen. Apart from adding the Xen related code needed in order to register MSI interrupts this patch also makes the msi_init function a hook in init_ops, so different MSI implementations can have different initialization functions.
Sponsored by: Citrix Systems R&D
xen/interface/physdev.h:
- Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen public interface.
x86/include/init.h:
- Add a hook for setting custom msi_init methods.
amd64/amd64/machdep.c: i386/i386/machdep.c:
- Set the default msi_init hook to point to the native MSI initialization method.
x86/xen/pv.c:
- Set the Xen MSI init hook when running as a Xen guest.
x86/x86/local_apic.c:
- Call the msi_init hook instead of directly calling msi_init.
xen/xen_intr.h: x86/xen/xen_intr.c:
- Introduce support for registering/releasing MSI interrupts with Xen.
- The MSI interrupts will use the same PIC as the IO APIC interrupts.
xen/xen_msi.h: x86/xen/xen_msi.c:
- Introduce a Xen MSI implementation.
x86/xen/xen_nexus.c:
- Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI implementation.
x86/xen/xen_pci.c:
- Introduce a Xen specific PCI bus that inherits from the ACPI PCI bus and overwrites the native MSI methods.
- This is needed because when running under Xen the MSI messages used to configure MSI interrupts on PCI devices are written by Xen itself.
dev/acpica/acpi_pci.c:
- Lower the quality of the ACPI PCI bus so the newly introduced Xen PCI bus can take over when needed.
conf/files.i386: conf/files.amd64:
- Add the newly created files to the build process.
|
#
272098 |
|
25-Sep-2014 |
royger |
ddb: allow specifying the exact address of the symtab and strtab
When the FreeBSD kernel is loaded from Xen, the symtab and strtab are not loaded the same way as by the native boot loader. This patch adds three new global variables to ddb that can be used to specify the exact position and size of those tables, so they can be directly used as parameters to db_add_symbol_table. A new helper is introduced, so callers that used to set ksym_start and ksym_end can use this helper to set the new variables.
It also adds support for loading them from the Xen PVH port, which was previously missing those tables.
Sponsored by: Citrix Systems R&D Reviewed by: kib
ddb/db_main.c:
- Add three new global variables: ksymtab, kstrtab, ksymtab_size that can be used to specify the position and size of the symtab and strtab.
- Use those new variables in db_init in order to call db_add_symbol_table.
- Move the logic in db_init to db_fetch_ksymtab in order to set ksymtab, kstrtab, ksymtab_size from ksym_start and ksym_end.
ddb/ddb.h:
- Add prototype for db_fetch_ksymtab.
- Declare the extern variables ksymtab, kstrtab and ksymtab_size.
x86/xen/pv.c:
- Add support for finding the symtab and strtab when booted as a Xen PVH guest. Since Xen loads the symtab and strtab as NetBSD expects to find them we have to adapt and use the same method.
amd64/amd64/machdep.c: arm/arm/machdep.c: i386/i386/machdep.c: mips/mips/machdep.c: pc98/pc98/machdep.c: powerpc/aim/machdep.c: powerpc/booke/machdep.c: sparc64/sparc64/machdep.c:
- Use the newly introduced db_fetch_ksymtab in order to set ksymtab, kstrtab and ksymtab_size.
|
#
271495 |
|
13-Sep-2014 |
jhb |
Add a sysctl to export the EFI memory map along with a handler in the sysctl(8) binary to format it.
Reviewed by: emaste MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D771
|
#
271149 |
|
04-Sep-2014 |
pfg |
Apply known workarounds for modern MacBooks.
The legacy USB circuit tends to give trouble on MacBook. While the original report covered MacBook, extend the fix preemptively for the newer MacBookPro too.
PR: 191693 Reviewed by: emaste MFC after: 5 days
|
#
271076 |
|
04-Sep-2014 |
jhb |
- Move prototypes for various functions out of C files and into <machine/md_var.h>.
- Move some CPU-related variables out of i386/i386/identcpu.c to initcpu.c to match amd64.
- Move the declaration of has_f00f_hack out of identcpu.c to machdep.c.
- Remove a misleading comment from i386/i386/initcpu.c (locore zeros the BSS before it calls identify_cpu()) and remove explicit zero assignments to reduce the diff with amd64.
|
#
270828 |
|
29-Aug-2014 |
jhb |
- Add a new structure type for the ACPI 3.0 SMAP entry that includes the optional attributes field.
- Add a 'machdep.smap' sysctl that exports the SMAP table of the running system as an array of the ACPI 3.0 structure. (On older systems, the attributes are given a value of zero.) Note that the sysctl only exports the SMAP table if it is available via the metadata passed from the loader to the kernel. If an SMAP is not available, an empty array is returned.
- Add a format handler for the ACPI 3.0 SMAP structure to the sysctl(8) binary to format the SMAP structures in a readable format similar to the format found in boot messages.
MFC after: 2 weeks
|
#
268982 |
|
22-Jul-2014 |
emaste |
Don't pass null kmdp to preload_search_info
On Xen PVH guests kmdp == NULL.
Submitted by: royger MFC after: 3 days Sponsored by: The FreeBSD Foundation
|
#
268471 |
|
09-Jul-2014 |
kib |
For safety, ensure that any consumer of set_regs() and ptrace_set_pc() uses the correct return to userspace via iret.
The signal return and PT_CONTINUE (which in fact uses the signal return path) set the pcb flag already. setcontext(2) enforces an iret return when %rip is incorrect. Due to this, the change is redundant, but it is made to ensure that no path which modifies the context forgets to set PCB_FULL_IRET.
Inspired by: CVE-2014-4699 Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
268158 |
|
02-Jul-2014 |
emaste |
Prefer vt(4) for UEFI boot
The UEFI framebuffer driver vt_efifb requires vt(4), so add a mechanism for the startup routine to set the preferred console. This change is ugly because console init happens very early in the boot, making a cleaner interface difficult. This change is intended only to facilitate the sc(4) / vt(4) transition, and can be reverted once vt(4) is the default.
|
#
267992 |
|
28-Jun-2014 |
hselasky |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
#
267985 |
|
27-Jun-2014 |
gjb |
Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output, such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
   truss: can not get etype: Cannot allocate memory
|
#
267961 |
|
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types, both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function, allowing the sysctl to still be marked as a tunable.
The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating the children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path.
The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection with removing unneeded TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() are not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
#
266093 |
|
14-May-2014 |
neel |
Increase the TSS limit by one byte. The processor requires an additional byte with all bits set to 1 beyond the I/O permission bitmap.
Prior to this change accessing I/O ports [0xFFF8-0xFFFF] would trigger a #GP fault even though the I/O bitmap allowed access to those ports.
For more details see section "I/O Permission Bit Map" in the Intel SDM, Vol 1.
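The arithmetic behind the extra byte can be checked with a short Python model. The function and parameter names are hypothetical. The model relies on the documented behaviour that the processor reads two bytes of the I/O permission bitmap for every port check, so the byte following the one that holds the port's bit must also lie within the TSS limit.

```python
IOPORTS = 0x10000
IOMAP_BYTES = IOPORTS // 8   # 8192 bytes, one permission bit per port


def io_access_allowed(port, bitmap, limit_bytes):
    """Model a one-byte port access check against the I/O permission bitmap.

    bitmap      -- bytes starting at the I/O map base; a 0 bit allows access
    limit_bytes -- how many bitmap bytes lie inside the TSS limit
    Returns False where the processor would raise #GP.
    """
    index = port // 8
    # Two bytes are read for every check, so both must be within the limit.
    if index + 1 >= limit_bytes:
        return False
    word = bitmap[index] | (bitmap[index + 1] << 8)
    return ((word >> (port % 8)) & 1) == 0
```

With an all-zero 8192-byte bitmap and no trailing byte, ports 0xFFF8-0xFFFF live in byte 8191, the second read falls outside the limit, and the check fails even though the bits allow access. Appending one 0xFF byte and raising the limit by one makes the same ports accessible.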
Reviewed by: kib
|
#
265014 |
|
27-Apr-2014 |
emaste |
Report boot method (BIOS/UEFI) via sysctl machdep.bootmethod
Sponsored by: The FreeBSD Foundation
|
#
263822 |
|
27-Mar-2014 |
emaste |
amd64: Parse the EFI memory map if present
With this change (and loader.efi from the projects/uefi branch) we can now boot under qemu using the OVMF UEFI firmware image with the limitation that a serial console is required.
(This is largely r246337 from the projects/uefi branch.)
Sponsored by: The FreeBSD Foundation
|
#
263620 |
|
22-Mar-2014 |
bdrewery |
Rename global cnt to vm_cnt to avoid shadowing.
To reduce the diff, the struct pcpu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
Bump __FreeBSD_version in case some out-of-tree module/code relies on the global cnt variable.
Exp-run revealed no ports using it directly.
No objection from: arch@ Sponsored by: EMC / Isilon Storage Division
|
#
263152 |
|
14-Mar-2014 |
glebius |
Remove AppleTalk support.
AppleTalk was a network transport protocol for Apple Macintosh devices in the 80s and 90s. Starting with Mac OS X in 2000, AppleTalk became a legacy protocol and the primary networking protocol was TCP/IP. The last Mac OS X release to support AppleTalk was in 2009. The same year, routing equipment vendors (namely Cisco) ended their support.
Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
|
#
263140 |
|
14-Mar-2014 |
glebius |
Remove IPX support.
IPX was a network transport protocol in Novell's NetWare network operating system in the late 80s and the 90s. NetWare itself switched to TCP/IP as the default transport in 1998. Later, in this century, the Novell Open Enterprise Server became the successor of Novell NetWare. The last release that claimed to still support IPX was OES 2 in 2007. Routing equipment vendors (e.g. Cisco) discontinued support for IPX in 2011.
Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
|
#
263014 |
|
11-Mar-2014 |
royger |
xen: add a hook to perform AP startup
AP startup on PVH follows the PV method, so we need to add a hook in order to diverge from bare metal.
Approved by: gibbs Sponsored by: Citrix Systems R&D
amd64/amd64/machdep.c: - Add hook for start_all_aps on native (using native_start_all_aps defined in mp_machdep).
amd64/amd64/mp_machdep.c: - Make some variables global because they will also be used by the Xen PVH AP startup code. - Use the start_all_aps hook to start APs. - Rename start_all_aps to native_start_all_aps.
amd64/include/smp.h: - Add declaration for native_start_all_aps.
x86/include/init.h: - Declare start_all_aps hook in init_ops.
x86/xen/pv.c: - Pick external declarations from mp_machdep. - Introduce Xen PV code to start APs on PVH. - Set start_all_aps init hook to use the Xen PVH implementation.
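The hook mechanism described above can be sketched as a table of function pointers. This is only an illustration: the struct, the variable and every function name carrying a _sketch suffix are hypothetical, and the return values merely identify which path ran.

```c
/* Hypothetical, trimmed-down version of the init_ops hook table. */
struct init_ops_sketch {
	int (*start_all_aps)(void);
};

static int native_start_all_aps_sketch(void) { return (1); }  /* bare metal */
static int xen_pv_start_all_aps_sketch(void) { return (2); }  /* Xen PVH */

/* Bare metal installs the native hook by default. */
static struct init_ops_sketch ops_sketch = {
	.start_all_aps = native_start_all_aps_sketch,
};

/* Early Xen PVH setup replaces the hook before the APs are started. */
static void
xen_pvh_install_hooks_sketch(void)
{
	ops_sketch.start_all_aps = xen_pv_start_all_aps_sketch;
}

/* Common code calls through the table and never tests for Xen itself. */
static int
start_aps_sketch(void)
{
	return (ops_sketch.start_all_aps());
}
```

The point of the design is that the SMP startup code stays identical on both platforms; only the table contents differ.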
|
#
263012 |
|
11-Mar-2014 |
royger |
xen: add hook for AP bootstrap memory reservation
This hook will only be implemented for bare metal, Xen doesn't require any bootstrap code since APs are started in long mode with paging enabled.
Approved by: gibbs Sponsored by: Citrix Systems R&D
amd64/amd64/machdep.c: - Set mp_bootaddress hook for bare metal.
x86/include/init.h: - Define mp_bootaddress in init_ops.
|
#
263009 |
|
11-Mar-2014 |
royger |
xen: implement hook to fetch and parse e820 memory map
The e820 memory map is fetched using a hypercall under Xen PVH, so add a hook to init_ops in order to diverge from bare metal and implement a Xen variant.
Approved by: gibbs Sponsored by: Citrix Systems R&D
x86/include/init.h: - Add a parse_memmap hook to init_ops, that will be called to fetch and parse the memory map.
amd64/amd64/machdep.c: - Decouple the fetch and the parse of the memmap, so the parse function can be shared with Xen code. - Move code around in order to implement the parse_memmap hook.
amd64/include/pc/bios.h: - Declare bios_add_smap_entries (implemented in machdep.c).
x86/xen/pv.c: - Implement fetching of e820 memmap when running as a PVH guest by using the XENMEM_memory_map hypercall.
|
#
263008 |
|
11-Mar-2014 |
royger |
xen: implement an early timer for Xen PVH
When running as a PVH guest, there's no emulated i8254, so we need to use the Xen PV timer as the early source for DELAY. This change allows for different implementations of the early DELAY function and implements a Xen variant for it.
Approved by: gibbs Sponsored by: Citrix Systems R&D
dev/xen/timer/timer.c: dev/xen/timer/timer.h: - Implement Xen early delay functions using the PV timer and declare them.
x86/include/init.h: - Add hooks for early clock source initialization and early delay functions.
i386/i386/machdep.c: pc98/pc98/machdep.c: amd64/amd64/machdep.c: - Set early delay hooks to use the i8254 on bare metal. - Use clock_init (that will in turn make use of init_ops) to initialize the early clock source.
amd64/include/clock.h: i386/include/clock.h: - Declare i8254_delay and clock_init.
i386/xen/clock.c: - Rename DELAY to i8254_delay.
x86/isa/clock.c: - Introduce clock_init that will take care of initializing the early clock by making use of the init_ops hooks. - Move non ISA related delay functions to the newly introduced delay file.
x86/x86/delay.c: - Add moved delay related functions. - Implement generic DELAY function that will use the init_ops hooks.
x86/xen/pv.c: - Set PVH hooks for the early delay related functions in init_ops.
conf/files.amd64: conf/files.i386: conf/files.pc98: - Add delay.c to the kernel build.
|
#
263006 |
|
11-Mar-2014 |
royger |
amd64: introduce hook for custom preload metadata parsers
Add hooks to amd64 in order to have diverging implementations, since on Xen PV the metadata is passed to the kernel in a different form.
Approved by: gibbs Sponsored by: Citrix Systems R&D
amd64/amd64/machdep.c: - Define init_ops for native. - Put native code inside of native_parse_preload_data hook. - Call the parse_preload_data in order to fill the metadata info.
x86/include/init.h: - Declare the init_ops struct.
x86/xen/pv.c: - Declare xen_init_ops that contains the Xen PV implementation of init_ops. - Implement the parse_preload_data for Xen PVH, the info is fetched from HYPERVISOR_start_info->cmd_line as provided by Xen.
|
#
261087 |
|
23-Jan-2014 |
jhb |
Move <machine/apicvar.h> to <x86/apicvar.h>.
|
#
259782 |
|
23-Dec-2013 |
jhb |
Add a resume hook for bhyve that runs a function on all CPUs during resume. For Intel CPUs, invoke vmxon for CPUs that were in VMX mode at the time of suspend.
Reviewed by: neel
|
#
259015 |
|
05-Dec-2013 |
jhb |
Fix a typo.
|
#
258541 |
|
25-Nov-2013 |
attilio |
- For kernels compiled only with KDTRACE_HOOKS and without any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invocation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operations, for kernels compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and removing the dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0].
[0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1].
Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip
|
#
258471 |
|
22-Nov-2013 |
emaste |
Don't abort SMAP processing after an entry of length 0
Length 0 is not special and should just be skipped. This is the same behaviour as i386.
Discussed with: jhb@ Sponsored by: The FreeBSD Foundation
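A minimal sketch of the new behaviour, assuming a simplified entry layout. The struct and function names are hypothetical; the only point illustrated is that a zero-length entry is skipped with continue rather than ending the walk.

```c
#include <stddef.h>
#include <stdint.h>

#define SMAP_TYPE_MEMORY_SKETCH 1   /* usable RAM in the BIOS memory map */

struct smap_entry_sketch {
	uint64_t base;
	uint64_t length;
	uint32_t type;
};

/* Sum the usable RAM described by the map. */
static uint64_t
smap_usable_bytes_sketch(const struct smap_entry_sketch *map, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (map[i].length == 0)
			continue;   /* skip it; later entries are still valid */
		if (map[i].type != SMAP_TYPE_MEMORY_SKETCH)
			continue;
		total += map[i].length;
	}
	return (total);
}
```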
|
#
258436 |
|
21-Nov-2013 |
emaste |
Refactor amd64 startup SMAP parsing
Extracted from the projects/uefi branch, this change is a reasonable cleanup and will reduce the diffs to review when bringing in the UEFI work.
Reviewed by: kib@ Sponsored by: The FreeBSD Foundation
|
#
258431 |
|
21-Nov-2013 |
emaste |
Disable amd64 boot time memory test by default
The page presence memory test takes a long time on large memory systems and has little value on contemporary amd64 hardware.
Sponsored by: The FreeBSD Foundation
|
#
258176 |
|
15-Nov-2013 |
gibbs |
Fix accounting for hw.realmem on the i386 and amd64 platforms.
sys/i386/i386/machdep.c: sys/amd64/amd64/machdep.c: The value reported by FreeBSD as "real memory" when booting doesn't match what is later reported by sysctl as hw.realmem. This is due to the fact that the value printed during the boot process is fetched from smbios data (when possible), and accounts for holes in physical memory. On the other hand, the value of hw.realmem is unconditionally set to be one larger than the highest page of the physical address space.
Fix this by setting hw.realmem to the same value printed during boot; this makes hw.realmem honour its name and account properly for the physical memory present in the system.
Submitted by: Roger Pau Monné Reviewed by: gibbs
|
#
258135 |
|
14-Nov-2013 |
emaste |
x86: Allow users to change PSL_RF via ptrace(PT_SETREGS...)
Debuggers may need to change PSL_RF. Note that tf_eflags is already stored in the signal context during signal handling and PSL_RF previously could be modified via sigreturn, so this change should not provide any new ability to userspace.
For background see the thread at: http://lists.freebsd.org/pipermail/freebsd-i386/2007-September/005910.html
Reviewed by: jhb, kib Sponsored by: DARPA, AFRL
|
#
256072 |
|
05-Oct-2013 |
neel |
Merge projects/bhyve_npt_pmap into head.
Make the amd64/pmap code aware of nested page table mappings used by bhyve guests. This allows bhyve to associate each guest with its own vmspace and deal with nested page faults in the context of that vmspace. This also enables features like accessed/dirty bit tracking, swapping to disk and transparent superpage promotions of guest memory.
Guest vmspace: Each bhyve guest has a unique vmspace to represent the physical memory allocated to the guest. Each memory segment allocated by the guest is mapped into the guest's address space via the 'vmspace->vm_map' and is backed by an object of type OBJT_DEFAULT.
pmap types: The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT.
The PT_X86 pmap type is used by the vmspace associated with the host kernel as well as user processes executing on the host. The PT_EPT pmap is used by the vmspace associated with a bhyve guest.
Page Table Entries: The EPT page table entries are mostly similar in functionality to regular page table entries, although there are some differences in terms of what bits are used to express that functionality. For example, the dirty bit is represented by bit 9 in the nested PTE as opposed to bit 6 in the regular x86 PTE. Therefore the bitmask representing the dirty bit is now computed at runtime based on the type of the pmap. Thus PG_M that was previously a macro now becomes a local variable that is initialized at runtime using 'pmap_modified_bit(pmap)'.
An additional wrinkle associated with EPT mappings is that older Intel processors don't have hardware support for tracking accessed/dirty bits in the PTE. This means that the amd64/pmap code needs to emulate these bits to provide proper accounting to the VM subsystem. This is achieved by using the following mapping for EPT entries that need emulation of A/D bits:
        Bit Position   Interpreted By
PG_V    52             software (accessed bit emulation handler)
PG_RW   53             software (dirty bit emulation handler)
PG_A    0              hardware (aka EPT_PG_RD)
PG_M    1              hardware (aka EPT_PG_WR)
The idea to use the mapping listed above for A/D bit emulation came from Alan Cox (alc@).
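The runtime selection of the dirty-bit mask can be sketched as follows. The bit positions are the ones quoted above; the types, the emulate_ad field and the function name are hypothetical stand-ins, not the real pmap_modified_bit() implementation.

```c
#include <stdbool.h>
#include <stdint.h>

enum pmap_type_sketch { PT_X86_SKETCH, PT_EPT_SKETCH };

struct pmap_sketch {
	enum pmap_type_sketch type;
	bool emulate_ad;   /* EPT hardware lacks accessed/dirty tracking */
};

/* Return the PTE mask that means "modified" for this kind of pmap. */
static uint64_t
modified_bit_sketch(const struct pmap_sketch *pmap)
{
	switch (pmap->type) {
	case PT_EPT_SKETCH:
		if (pmap->emulate_ad)
			return (UINT64_C(1) << 1);   /* PG_M aliases EPT_PG_WR */
		return (UINT64_C(1) << 9);           /* nested PTE dirty bit */
	case PT_X86_SKETCH:
	default:
		return (UINT64_C(1) << 6);           /* regular x86 PTE dirty bit */
	}
}
```

Code that used to test a compile-time PG_M constant would instead load the mask into a local variable once per function and test against that.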
The final difference with respect to x86 PTEs is that some EPT implementations do not support superpage mappings. This is recorded in the 'pm_flags' field of the pmap.
TLB invalidation: The amd64/pmap code has a number of ways to do invalidation of mappings that may be cached in the TLB: single page, multiple pages in a range or the entire TLB. All of these funnel into a single EPT invalidation routine called 'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and sends an IPI to the host cpus that are executing the guest's vcpus. On a subsequent entry into the guest it will detect that the EPT has changed and invalidate the mappings from the TLB.
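The generation-number scheme can be modelled in a few lines. This is a single-threaded sketch with hypothetical names; the real code needs atomic updates and an IPI to force running vcpus out of the guest, both of which are only noted in comments here.

```c
#include <stdbool.h>
#include <stdint.h>

struct ept_pmap_sketch {
	uint64_t eptgen;        /* bumped on every invalidation request */
};

struct vcpu_sketch {
	uint64_t seen_eptgen;   /* generation observed at the last entry */
	unsigned flushes;       /* counts simulated TLB invalidations */
};

/* Invalidation side: record that the guest mappings changed. */
static void
ept_invalidate_sketch(struct ept_pmap_sketch *pmap)
{
	pmap->eptgen++;
	/* The real routine also sends an IPI to the CPUs running the vcpus. */
}

/* Guest-entry side: flush only if the generation moved since last entry. */
static bool
vcpu_enter_sketch(struct vcpu_sketch *vcpu, const struct ept_pmap_sketch *pmap)
{
	if (vcpu->seen_eptgen == pmap->eptgen)
		return (false);
	vcpu->seen_eptgen = pmap->eptgen;
	vcpu->flushes++;        /* stands in for the actual invalidation */
	return (true);
}
```

Several invalidation requests between two guest entries collapse into a single flush, which is what makes funnelling every case into one routine cheap.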
Guest memory access: Since the guest memory is no longer wired we need to hold the host physical page that backs the guest physical page before we can access it. The helper functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose.
PCI passthru: Guests with PCI passthru devices will wire the entire guest physical address space. The MMIO BAR associated with the passthru device is backed by a vm_object of type OBJT_SG. An IOMMU domain is created only for guests that have one or more PCI passthru devices attached to them.
Limitations: There isn't a way to map a guest physical page without execute permissions. This is because the amd64/pmap code interprets the guest physical mappings as user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U shares the same bit position as EPT_PG_EXECUTE all guest mappings become automatically executable.
Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews as well as their support and encouragement.
Thanks to John Baldwin for reviewing the use of OBJT_SG as the backing object for pci passthru mmio regions.
Special thanks to Peter Holm for testing the patch on short notice.
Approved by: re Discussed with: grehan Reviewed by: alc, kib Tested by: pho
|
#
255060 |
|
30-Aug-2013 |
kib |
Implement support for the process-context identifiers ('PCID') on Intel CPUs. The feature tags TLB entries with the ID of the address space, which makes it possible to avoid TLB invalidation on context switch; it is available only in long mode. In microbenchmarks, using PCID decreased the latency of context switches by ~30% on SandyBridge class desktop CPUs, measured with the lat_ctx program from lmbench.
If available, use INVPCID instruction when a TLB entry in non-current address space needs to be invalidated. The instruction is typically available on the Haswell.
If needed, the use of PCID can be turned off with the vm.pmap.pcid_enabled loader tunable set to 0. The state of the feature is reported by the vm.pmap.pcid_enabled sysctl. The sysctl vm.pmap.pcid_save_cnt reports the number of context switches which avoided invalidating the TLB; compare with the total number of context switches, available as sysctl vm.stats.sys.v_swtch.
Sponsored by: The FreeBSD Foundation Reviewed by: alc Tested by: pho, bf
|
#
255040 |
|
29-Aug-2013 |
gibbs |
Implement vector callback for PVHVM and unify event channel implementations
Re-structure Xen HVM support so that: - Xen is detected and hypercalls can be performed very early in system startup. - Xen interrupt services are implemented using FreeBSD's native interrupt delivery infrastructure. - the Xen interrupt service implementation is shared between PV and HVM guests. - Xen interrupt handlers can optionally use a filter handler in order to avoid the overhead of dispatch to an interrupt thread. - interrupt load can be distributed among all available CPUs. - the overhead of accessing the emulated local and I/O apics on HVM is removed for event channel port events. - a similar optimization can eventually, and fairly easily, be used to optimize MSI.
Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure, and misc Xen cleanups:
Sponsored by: Spectra Logic Corporation
Unification of PV & HVM interrupt infrastructure, bug fixes, and misc Xen cleanups:
Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D
sys/x86/x86/local_apic.c: sys/amd64/include/apicvar.h: sys/i386/include/apicvar.h: sys/amd64/amd64/apic_vector.S: sys/i386/i386/apic_vector.s: sys/amd64/amd64/machdep.c: sys/i386/i386/machdep.c: sys/i386/xen/exception.s: sys/x86/include/segments.h: Reserve IDT vector 0x93 for the Xen event channel upcall interrupt handler. On Hypervisors that support the direct vector callback feature, we can request that this vector be called directly by an injected HVM interrupt event, instead of a simulated PCI interrupt on the Xen platform PCI device. This avoids all of the overhead of dealing with the emulated I/O APIC and local APIC. It also means that the Hypervisor can inject these events on any CPU, allowing upcalls for different ports to be handled in parallel.
sys/amd64/amd64/mp_machdep.c: sys/i386/i386/mp_machdep.c: Map Xen per-vcpu area during AP startup.
sys/amd64/include/intr_machdep.h: sys/i386/include/intr_machdep.h: Increase the FreeBSD IRQ vector table to include space for event channel interrupt sources.
sys/amd64/include/pcpu.h: sys/i386/include/pcpu.h: Remove Xen HVM per-cpu variable data. These fields are now allocated via the dynamic per-cpu scheme. See xen_intr.c for details.
sys/amd64/include/xen/hypercall.h: sys/dev/xen/blkback/blkback.c: sys/i386/include/xen/xenvar.h: sys/i386/xen/clock.c: sys/i386/xen/xen_machdep.c: sys/xen/gnttab.c: Prefer FreeBSD primitives to Linux ones in Xen support code.
sys/amd64/include/xen/xen-os.h: sys/i386/include/xen/xen-os.h: sys/xen/xen-os.h: sys/dev/xen/balloon/balloon.c: sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/console/xencons_ring.c: sys/dev/xen/control/control.c: sys/dev/xen/netback/netback.c: sys/dev/xen/netfront/netfront.c: sys/dev/xen/xenpci/xenpci.c: sys/i386/i386/machdep.c: sys/i386/include/pmap.h: sys/i386/include/xen/xenfunc.h: sys/i386/isa/npx.c: sys/i386/xen/clock.c: sys/i386/xen/mp_machdep.c: sys/i386/xen/mptable.c: sys/i386/xen/xen_clock_util.c: sys/i386/xen/xen_machdep.c: sys/i386/xen/xen_rtc.c: sys/xen/evtchn/evtchn_dev.c: sys/xen/features.c: sys/xen/gnttab.c: sys/xen/gnttab.h: sys/xen/hvm.h: sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbus_if.m: sys/xen/xenbus/xenbusb_front.c: sys/xen/xenbus/xenbusvar.h: sys/xen/xenstore/xenstore.c: sys/xen/xenstore/xenstore_dev.c: sys/xen/xenstore/xenstorevar.h: Pull common Xen OS support functions/settings into xen/xen-os.h.
sys/amd64/include/xen/xen-os.h: sys/i386/include/xen/xen-os.h: sys/xen/xen-os.h: Remove constants, macros, and functions unused in FreeBSD's Xen support.
sys/xen/xen-os.h: sys/i386/xen/xen_machdep.c: sys/x86/xen/hvm.c: Introduce new functions xen_domain(), xen_pv_domain(), and xen_hvm_domain(). These are used in favor of #ifdefs so that FreeBSD can dynamically detect and adapt to the presence of a hypervisor. The goal is to have an HVM optimized GENERIC, but more is necessary before this is possible.
sys/amd64/amd64/machdep.c: sys/dev/xen/xenpci/xenpcivar.h: sys/dev/xen/xenpci/xenpci.c: sys/x86/xen/hvm.c: sys/sys/kernel.h: Refactor magic ioport, Hypercall table and Hypervisor shared information page setup, and move it to a dedicated HVM support module.
HVM mode initialization is now triggered during the SI_SUB_HYPERVISOR phase of system startup. This currently occurs just after the kernel VM is fully setup which is just enough infrastructure to allow the hypercall table and shared info page to be properly mapped.
sys/xen/hvm.h: sys/x86/xen/hvm.c: Add definitions and a method for configuring Hypervisor event delivery via a direct vector callback.
sys/amd64/include/xen/xen-os.h: sys/x86/xen/hvm.c:
sys/conf/files: sys/conf/files.amd64: sys/conf/files.i386: Adjust kernel build to reflect the refactoring of early Xen startup code and Xen interrupt services.
sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkfront/block.h: sys/dev/xen/control/control.c: sys/dev/xen/evtchn/evtchn_dev.c: sys/dev/xen/netback/netback.c: sys/dev/xen/netfront/netfront.c: sys/xen/xenstore/xenstore.c: sys/xen/evtchn/evtchn_dev.c: sys/dev/xen/console/console.c: sys/dev/xen/console/xencons_ring.c Adjust drivers to use new xen_intr_*() API.
sys/dev/xen/blkback/blkback.c: Since blkback defers all event handling to a taskqueue, convert this task queue to a "fast" taskqueue, and schedule it via an interrupt filter. This avoids an unnecessary ithread context switch.
sys/xen/xenstore/xenstore.c: The xenstore driver is MPSAFE. Indicate as much when registering its interrupt handler.
sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbusvar.h: Remove unused event channel APIs.
sys/xen/evtchn.h: Remove all kernel Xen interrupt service API definitions from this file. It is now only used for structure and ioctl definitions related to the event channel userland device driver.
Update the definitions in this file to match those from NetBSD. Implementing this interface will be necessary for Dom0 support.
sys/xen/evtchn/evtchnvar.h: Add a header file for implementation internal APIs related to managing event channel event delivery. This is used to allow, for example, the event channel userland device driver to access low-level routines that typical kernel consumers of event channel services should never access.
sys/xen/interface/event_channel.h: sys/xen/xen_intr.h: Standardize on the evtchn_port_t type for referring to an event channel port id. In order to prevent low-level event channel APIs from leaking to kernel consumers who should not have access to this data, the type is defined twice: Once in the Xen provided event_channel.h, and again in xen/xen_intr.h. The double declaration is protected by __XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared twice within a given compilation unit.
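The guarded double declaration can be sketched as below. Both copies are shown in one file purely for illustration; the underlying type uint32_t is an assumption made for the sketch, and port_id_sketch is a hypothetical consumer.

```c
#include <stdint.h>

/* First copy, as it might appear in the Xen-provided event_channel.h. */
#ifndef __XEN_EVTCHN_PORT_DEFINED__
typedef uint32_t evtchn_port_t;
#define __XEN_EVTCHN_PORT_DEFINED__ 1
#endif

/* Second copy, as it might appear in xen/xen_intr.h; skipped here. */
#ifndef __XEN_EVTCHN_PORT_DEFINED__
typedef uint32_t evtchn_port_t;
#define __XEN_EVTCHN_PORT_DEFINED__ 1
#endif

/* A consumer sees the type no matter which header came first. */
static evtchn_port_t
port_id_sketch(void)
{
	return (7);   /* arbitrary example value */
}
```

Whichever header is included first defines the type and the guard macro, so a file that needs only xen_intr.h does not have to pull in the low-level interface header.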
sys/xen/xen_intr.h: sys/xen/evtchn/evtchn.c: sys/x86/xen/xen_intr.c: sys/dev/xen/xenpci/evtchn.c: sys/dev/xen/xenpci/xenpcivar.h: New implementation of Xen interrupt services. This is similar in many respects to the i386 PV implementation, with the exception that events bound to event channel ports (i.e. not IPI, virtual IRQ, or physical IRQ) are further optimized to avoid mask/unmask operations that aren't necessary for these edge triggered events.
Stubs exist for supporting physical IRQ binding, but will need additional work before this implementation can be fully shared between PV and HVM.
sys/amd64/amd64/mp_machdep.c: sys/i386/i386/mp_machdep.c: sys/i386/xen/mp_machdep.c sys/x86/xen/hvm.c: Add support for placing vcpu_info into an arbitrary memory page instead of using HYPERVISOR_shared_info->vcpu_info. This allows the creation of domains with more than 32 vcpus.
sys/i386/i386/machdep.c: sys/i386/xen/clock.c: sys/i386/xen/xen_machdep.c: sys/i386/xen/exception.s: Add support for new event channel implementation.
|
#
253352 |
|
15-Jul-2013 |
kib |
MFi386: add ddb "show sysregs" command.
Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
251039 |
|
27-May-2013 |
kib |
Use slightly more idiomatic expression to get the address of array.
Tested by: dim, pgj Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
250840 |
|
21-May-2013 |
marcel |
Add basic support for FDT to i386 & amd64. This change includes: 1. Common headers for fdt.h and ofw_machdep.h under x86/include with indirections under i386/include and amd64/include. 2. New modinfo for loader provided FDT blob. 3. Common x86_init_fdt() called from hammer_time() on amd64 and init386() on i386. 4. Split-off FDT specific low-level console functions from FDT bus methods for the uart(4) driver. The low-level console logic has been moved to uart_cpu_fdt.c and is used for arm, mips & powerpc only. The FDT bus methods are shared across all architectures. 5. Add dev/fdt/fdt_x86.c to hold the fdt_fixup_table[] and the fdt_pic_table[] arrays. Both are empty right now.
FDT addresses are I/O ports on x86. Since the core FDT code does not handle different address spaces, adding support for both I/O ports and memory addresses requires some thought and discussion. It may be better to use a compile-time option that controls this.
Obtained from: Juniper Networks, Inc.
|
#
250423 |
|
09-May-2013 |
dchagin |
Retire write-only PCB_GS32BIT pcb flag on amd64.
|
#
248084 |
|
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical, but a few notes are in order:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
  - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing consumers to include sys/mutex.h directly to cater for its inline functions using VM_OBJECT_LOCK()) means that all vm/vm_pager.h consumers must now also include sys/rwlock.h.
* zfs requires a rather convoluted fix to include FreeBSD rwlocks in the compat layer, because the name clash between the FreeBSD and Solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
247454 |
|
28-Feb-2013 |
davide |
MFcalloutng: When a CPU becomes idle, cpu_idleclock() calculates the time to the next timer event in order to reprogram the hw timer. Return that time in sbintime_t to the caller and pass it to acpi_cpu_idle(), where it can be used as one more (quite precise) factor to estimate further sleep time and choose the optimal sleep state. This is a preparatory change for further callout improvements that will be committed in the next days.
The commit is not targeted for MFC.
|
#
241880 |
|
22-Oct-2012 |
eadler |
The 'testing memory' patch gets printed too many times
Approved by: cperciva (implicit)
|
#
241850 |
|
22-Oct-2012 |
eadler |
Explain the upcoming delay by printing a message when the kernel is about to begin testing memory.
Reviewed by: dteske, adri Approved by: cperciva MFC after: 1 week
|
#
241371 |
|
09-Oct-2012 |
attilio |
Reverts r234074,234105,234564,234723,234989,235231-235232 and part of r234247. Use, instead, the static initializer introduced in r239923 for x86 and sparc64 intr_cpus, unwinding the code to the initial version.
Reviewed by: marius
|
#
238623 |
|
19-Jul-2012 |
kib |
Introduce curpcb magic variable, similar to curthread, which is MD amd64. It is implemented as __pure2 inline with non-volatile asm read from pcpu, which allows a compiler to cache its results.
Convert most PCPU_GET(pcb) and curthread->td_pcb accesses into curpcb.
Note that __curthread() uses magic value 0 as an offsetof(struct pcpu, pc_curthread). It seems to be done this way because machine/pcpu.h needs to be processed before sys/pcpu.h, since machine/pcpu.h contributes machine-dependent fields to the struct pcpu definition. As a result, machine/pcpu.h cannot use struct pcpu yet.
The __curpcb() also uses a magic constant instead of offsetof(struct pcpu, pc_curpcb) for the same reason. The constants are now defined as symbols and CTASSERTs are added to ensure that future KBI changes do not break the code.
Requested and reviewed by: bde MFC after: 3 weeks
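The combination of magic constants and compile-time checks can be sketched as below. The struct layout and every name with a _SKETCH or _sketch suffix are hypothetical; C11 _Static_assert plays the role of CTASSERT.

```c
#include <stddef.h>

struct thread_sketch;
struct pcb_sketch;

/* Constants usable before the full struct definition is visible. */
#define OFFSETOF_CURTHREAD_SKETCH 0
#define OFFSETOF_CURPCB_SKETCH (4 * sizeof(void *))

/* Hypothetical layout; only the field order matters for the sketch. */
struct pcpu_sketch {
	struct thread_sketch *pc_curthread;
	struct thread_sketch *pc_idlethread;
	struct thread_sketch *pc_fpcurthread;
	struct thread_sketch *pc_deadthread;
	struct pcb_sketch *pc_curpcb;
};

/* If a field is added or moved, the build fails instead of the kernel. */
_Static_assert(offsetof(struct pcpu_sketch, pc_curthread) ==
    OFFSETOF_CURTHREAD_SKETCH, "pc_curthread offset changed");
_Static_assert(offsetof(struct pcpu_sketch, pc_curpcb) ==
    OFFSETOF_CURPCB_SKETCH, "pc_curpcb offset changed");
```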
|
#
238310 |
|
09-Jul-2012 |
jhb |
Partially revert r217515 so that the mem_range_softc variable is always present on x86 kernels. This fixes the build of kernels that include 'device acpi' but do not include 'device mem'.
MFC after: 1 month
|
#
234723 |
|
26-Apr-2012 |
attilio |
Clean up the intr* MD KPI from the SMP dependency, removing a cause of discrepancy between modules and kernel, but deal with SMP differences within the functions themselves.
As an added bonus this also helps in terms of code readability.
Requested by: gibbs Reviewed by: jhb, marius MFC after: 1 week
|
#
234105 |
|
10-Apr-2012 |
marius |
Fix !SMP build after r234074.
Reviewed by: attilio, jhb
|
#
234074 |
|
09-Apr-2012 |
attilio |
BSP is not added to the mask of valid target CPUs for interrupts in set_apic_interrupt_ids(). Besides, set_apic_interrupt_ids() is not called in the !SMP case either. Fix this by: - Adding the BSP as an interrupt target directly in cpu_startup(). - Removing an obsolete optimization where the BSP is skipped in set_apic_interrupt_ids().
Reported by: jh Reviewed by: jhb MFC after: 3 days X-MFC: r233961 Pointy hat to: me
|
#
231781 |
|
15-Feb-2012 |
jkim |
Some BIOSes are known for corrupting low 64KB between suspend and resume. Mask off the first 16 pages unless we appear to be running in a VM. This address may be overridden by 'hw.physmem.start' tunable from loader. Note Linux used to have a BIOS quirk table for this issue but it seems they made it default recently.
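A minimal sketch of the arithmetic, assuming 4KB pages. The function name is hypothetical, and returning 0 for the VM case is a placeholder: the message only says the mask is not applied there.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH UINT64_C(4096)

/* Pick the lowest physical address the kernel will use. */
static uint64_t
physmem_start_sketch(bool in_vm, const uint64_t *tunable)
{
	/* 16 pages * 4KB = 64KB masked off on bare metal. */
	uint64_t start = in_vm ? 0 : 16 * PAGE_SIZE_SKETCH;

	if (tunable != NULL)    /* loader override, if one was supplied */
		start = *tunable;
	return (start);
}
```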
|
#
230426 |
|
21-Jan-2012 |
kib |
Add support for the extended FPU states on amd64, both for native 64bit and 32bit ABIs. As a side-effect, it enables AVX on capable CPUs.
In particular:
- Query the CPU support for XSAVE, list of the supported extensions and the required size of FPU save area. The hw.use_xsave tunable is provided for disabling XSAVE, and hw.xsave_mask may be used to select the enabled extensions.
- Remove the FPU save area from PCB and dynamically allocate the (run-time sized) user save area on the top of the kernel stack, right above the PCB. Reorganize the thread0 PCB initialization to postpone it after BSP is queried for save area size.
- The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as well. FPU state is only useful for suspend, where it is saved in dynamically allocated suspfpusave area.
- Use XSAVE and XRSTOR to save/restore FPU state, if supported and enabled.
- Define new mcontext_t flag _MC_HASFPXSTATE, indicating that mcontext_t has a valid pointer to out-of-struct extended FPU state. Signal handlers are supplied with stack-allocated fpu state. The sigreturn(2) and setcontext(2) syscalls honour the flag, allowing the signal handlers to inspect and manipulate extended state in the interrupted context.
- The getcontext(2) never returns extended state, since there is no place in the fixed-sized mcontext_t to place variable-sized save area. And, since mcontext_t is embedded into ucontext_t, makes it impossible to fix in a reasonable way. Instead of extending getcontext(2) syscall, provide a sysarch(2) facility to query extended FPU state.
- Add ptrace(2) support for getting and setting extended state; while there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries.
- Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to consumers, making it opaque. Internally, struct fpu_kern_ctx now contains a space for the extended state. Convert in-kernel consumers of fpu_kern KPI both on i386 and amd64.
First version of the support for AVX was submitted by Tim Bird <tim.bird am sony com> on behalf of Sony. This version was written from scratch.
Tested by: pho (previous version), Yamagi Burmeister <lists yamagi org> MFC after: 1 month
|
#
229085 |
|
31-Dec-2011 |
gavin |
Default to not performing the early-boot memory tests when we detect we are booting inside a VM. There are three reasons to disable this:
o It causes the VM host to believe that all the tested pages of RAM are in use. This in turn may force the host to page out pages of RAM belonging to other VMs, or otherwise cause problems with fair resource sharing on the VM cluster.
o It adds significant time to the boot process (around 1 second/Gig in testing).
o It is unnecessary - the host should have already verified that the memory is functional etc.
Note that this simply changes the default when in a VM - it can still be overridden using the hw.memtest.tests tunable.
MFC after: 4 weeks
|
#
227442 |
|
11-Nov-2011 |
kib |
Weaken the part of assertions added in the r227394. Only check that the process state is stopped.
MFC after: 1 week
|
#
227394 |
|
09-Nov-2011 |
kib |
Stopped process may legitimately have some threads sleeping and not suspended, if the sleep is uninterruptible.
Reported and tested by: pho MFC after: 1 week
|
#
225936 |
|
03-Oct-2011 |
attilio |
Add some improvements in the idle table callbacks:
- Replace direct uses of the "hlt" assembly instruction with calls to the halt() function.
- In cpu_idle_mwait(), avoid races in the check of sched_runnable() by using the same pattern that cpu_idle_hlt() uses with the 'hlt' instruction.
- Add comments explaining the logic behind the pattern used in cpu_idle_hlt() and other idle callbacks.
In collaboration with: jhb, mav Reviewed by: adri, kib MFC after: 3 weeks
|
#
225617 |
|
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
#
225048 |
|
20-Aug-2011 |
bz |
In HEAD, when doing no further checks, there is no reason to use the temporary variable and check with if, as TUNABLE_*_FETCH do not alter values unless the tunable was successfully found.
Reported by: jhb, bde MFC after: 3 days X-MFC with: r224516 Approved by: re (kib)
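The property r225048 relies on can be sketched in userland C. This is only an illustration: `fake_tunable_ulong_fetch()`, `struct kenv_entry` and `memtest_setting()` are hypothetical stand-ins for the kernel environment and are not kernel API, the default of 1 is assumed, and the tunable name is the one quoted in the r229085 entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel environment: name/value pairs. */
struct kenv_entry {
	const char *name;
	const char *value;
};

/*
 * Sketch of the fetch contract described above: *val is written only when
 * the tunable is found and parses cleanly; otherwise it keeps whatever
 * default the caller stored. Returns 1 on success, 0 otherwise.
 */
static int
fake_tunable_ulong_fetch(const struct kenv_entry *env, size_t n,
    const char *name, unsigned long *val)
{
	char *end;
	unsigned long parsed;
	size_t i;

	assert(name != NULL && val != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(env[i].name, name) != 0)
			continue;
		parsed = strtoul(env[i].value, &end, 0);
		if (end == env[i].value || *end != '\0')
			return (0);	/* malformed: leave *val untouched */
		*val = parsed;
		return (1);
	}
	return (0);			/* not set: leave *val untouched */
}

/* Caller pattern: assign the default, then fetch straight into it. */
static unsigned long
memtest_setting(const struct kenv_entry *env, size_t n)
{
	unsigned long memtest = 1;	/* assumed default for illustration */

	(void)fake_tunable_ulong_fetch(env, n, "hw.memtest.tests", &memtest);
	return (memtest);
}
```

Because the fetch never touches the variable on failure, the caller needs neither a temporary nor an if around the call.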
|
#
224516 |
|
30-Jul-2011 |
bz |
Introduce a tunable to disable the time consuming parts of bootup memtesting, which can easily save seconds to minutes of boot time. The tunable name is kept general to allow reusing the code in alternate frameworks.
Requested by: many Discussed on: arch (a while ago) Obtained from: Sandvine Incorporated Reviewed by: sbruno Approved by: re (kib) MFC after: 2 weeks
|
#
222853 |
|
08-Jun-2011 |
avg |
remove code for dynamic offlining/onlining of CPUs on x86
The code has definitely been broken for SCHED_ULE, which is the default scheduler. It may have been broken for SCHED_4BSD in more subtle ways, e.g. with manually configured CPU affinities and for interrupt delivery purposes. We still provide a way to disable individual CPUs or all hyperthreading "twin" CPUs before SMP startup. See the UPDATING entry for details.
Interaction between building CPU topology and disabling CPUs still remains fuzzy: topology is first built using all available CPUs and then the disabled CPUs should be "subtracted" from it. That doesn't work well if the resulting topology becomes non-uniform.
This work is done in cooperation with Attilio Rao who in addition to reviewing also provided parts of code.
PR: kern/145385 Discussed with: gcooper, ambrisko, mdf, sbruno Reviewed by: attilio Tested by: pho, pluknet X-MFC after: never
|
#
221784 |
|
11-May-2011 |
dchagin |
Remove wrong comment.
MFC after: 1 week.
|
#
220584 |
|
12-Apr-2011 |
jkim |
Reduce errors in effective frequency calculation.
|
#
220583 |
|
12-Apr-2011 |
jkim |
Reinstate cpu_est_clockrate() support for P-state invariant TSC if APERF and MPERF MSRs are available. It was disabled in r216443. Remove the earlier hack to subtract 0.5% from the calibrated frequency as DELAY(9) is little bit more reliable now.
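As a rough illustration of why APERF and MPERF help with a P-state invariant TSC: MPERF advances at a fixed reference rate while APERF advances at the actual core rate, so the ratio of their deltas scales the nominal frequency. The function below is a sketch of that ratio only, not the committed code; `effective_hz()` and its parameters are hypothetical names, and the caller is assumed to have sampled both counters over the same interval.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale the nominal (TSC) frequency by the APERF/MPERF delta ratio.
 * The ratio is kept in parts per thousand to stay in integer math.
 */
static uint64_t
effective_hz(uint64_t tsc_hz, uint64_t aperf_delta, uint64_t mperf_delta)
{
	uint64_t permille;

	assert(mperf_delta != 0);
	permille = 1000 * aperf_delta / mperf_delta;
	return (tsc_hz * permille / 1000);
}
```

The sketch assumes short sampling intervals, so that neither multiplication overflows 64 bits.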
|
#
220433 |
|
07-Apr-2011 |
jkim |
Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. Worse, it can be easily changed by the user with the 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).
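The concern in r220433 is a 64-bit value being read while it is half-updated on a 32-bit CPU. A minimal userland sketch of the idea follows, using C11 atomics as a stand-in for the kernel's own atomic(9) primitives; `tsc_freq_sketch`, `tsc_freq_store()` and `tsc_freq_load()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userland stand-in for the kernel's TSC frequency variable. */
static _Atomic uint64_t tsc_freq_sketch;

/* Publish a new frequency as a single indivisible 64-bit store. */
static void
tsc_freq_store(uint64_t hz)
{
	assert(hz != 0);
	atomic_store_explicit(&tsc_freq_sketch, hz, memory_order_release);
}

/* Read the frequency without risk of seeing a torn value. */
static uint64_t
tsc_freq_load(void)
{
	return (atomic_load_explicit(&tsc_freq_sketch, memory_order_acquire));
}
```

On a 64-bit target a plain aligned load is already indivisible, which is why the message calls this possibly overkill for amd64.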
|
#
220090 |
|
28-Mar-2011 |
alc |
The new binutils has correctly redefined MAXPAGESIZE on amd64 as 0x200000 instead of 0x100000. As a side effect, an amd64 kernel now loads at physical address 0x200000 instead of 0x100000. This is probably for the best because it avoids the use of a 2MB page mapping for the first 1MB of the kernel that also spans the fixed MTRRs. However, getmemsize() still thinks that the kernel loads at 0x100000, and so the physical memory between 0x100000 and 0x200000 is lost. Fix this problem by replacing the hard-wired constant in getmemsize() by a symbol "kernphys" that is defined by the linker script.
In collaboration with: kib
|
#
219523 |
|
11-Mar-2011 |
mdf |
Mostly revert r219468, as I had misremembered the C standard regarding the size of an extern array.
Keep one change from strncpy to strlcpy.
|
#
219473 |
|
10-Mar-2011 |
jkim |
Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as a CPU ticker. Note tsc_present does not change by this tunable.
|
#
219468 |
|
10-Mar-2011 |
mdf |
Use MAXPATHLEN rather than the size of an extern array when copying the kernel name. Also consistently use strlcpy().
Suggested by: Warner Losh
|
#
218744 |
|
16-Feb-2011 |
dchagin |
To avoid excessive code duplication create wrapper for fill regs from stack frame. Change the trap() code to use newly created function instead of explicit regs assignment.
|
#
218327 |
|
05-Feb-2011 |
kib |
Clear the padding when returning context to the usermode, for MI ucontext_t and x86 MD parts. Kernel allocates the structures on the stack, and not clearing reserved fields and paddings causes leakage.
Noted and discussed with: bde MFC after: 2 weeks
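The leak fixed in r218327 comes from copying a stack-allocated structure out to usermode while its padding and reserved fields still hold stale stack bytes. A minimal sketch of the remedy, assuming a hypothetical `struct ctx_sketch` and `fill_ctx()` rather than the real ucontext_t code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context structure with padding and a reserved tail. */
struct ctx_sketch {
	uint8_t  flags;		/* the compiler pads after this on most ABIs */
	uint64_t rip;
	uint64_t spare[4];	/* reserved fields */
};

/*
 * Zero the whole object first, so padding and reserved fields carry no
 * stale stack contents, then fill in the meaningful members.
 */
static void
fill_ctx(struct ctx_sketch *out, uint8_t flags, uint64_t rip)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->flags = flags;
	out->rip = rip;
}
```

Zeroing the whole object up front is simpler than clearing each reserved field by hand, and it also covers padding that has no name to assign to.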
|
#
217886 |
|
26-Jan-2011 |
mdf |
Set td_kstack_pages for thread0. This was already being done for most architectures, but i386 and amd64 were missing it.
Submitted by: Mohd Fahadullah <mfahadullah AT isilon DOT com>
|
#
217688 |
|
21-Jan-2011 |
pluknet |
Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.
Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe
|
#
217515 |
|
17-Jan-2011 |
jkim |
Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set(). Compile sys/dev/mem/memutil.c for all supported platforms and remove now unnecessary dev_mem_md_init(). Consistently define mem_range_softc from mem.c for all platforms. Add missing #include guards for machine/memdev.h and sys/memrange.h. Clean up some nearby style(9) nits.
MFC after: 1 month
|
#
217151 |
|
08-Jan-2011 |
kib |
Create shared (readonly) page. Each ABI may specify the use of page by setting SV_SHP flag and providing pointer to the vm object and mapping address. Provide simple allocator to carve space in the page, tailored to put the code with alignment restrictions.
Enable shared page use for amd64, both native and 32bit FreeBSD binaries. Page is private mapped at the top of the user address space, moving a start of the stack one page down. Move signal trampoline code from the top of the stack to the shared page.
Reviewed by: alc
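The "simple allocator to carve space in the page" from r217151 can be pictured as a bump allocator that honours an alignment request. The following is a sketch under assumptions, not the committed code: `shpage_alloc()`, `shpage_used` and the 4096-byte page size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define	SHPAGE_SIZE	4096	/* assumed page size for illustration */

static size_t shpage_used;	/* bytes already carved out of the page */

/*
 * Round the current fill level up to 'align' (a power of two) and reserve
 * 'size' bytes. Returns the offset within the page, or -1 if the request
 * does not fit.
 */
static long
shpage_alloc(size_t size, size_t align)
{
	size_t start;

	assert(align != 0 && (align & (align - 1)) == 0);
	start = (shpage_used + align - 1) & ~(align - 1);
	if (size > SHPAGE_SIZE || start > SHPAGE_SIZE - size)
		return (-1);
	shpage_used = start + size;
	return ((long)start);
}
```

Nothing is ever freed in this sketch, which is adequate for content such as a signal trampoline that is placed once and stays for the lifetime of the system.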
|
#
216634 |
|
21-Dec-2010 |
jkim |
Improve PCB flags handling and make it more robust. Add two new functions for manipulating pcb_flags. These inline functions are very similar to atomic_set_char(9) and atomic_clear_char(9) but without unnecessary LOCK prefix for SMP. Add comments about the rationale[1]. Use these functions wherever possible. Although there are some places where it is not strictly necessary (e.g., a PCB is copied to create a new PCB), it is done across the board for sake of consistency. Turn pcb_full_iret into a PCB flag as it is safe now. Move rarely used fields before pcb_flags and reduce size of pcb_flags to one byte. Fix some style(9) nits in pcb.h while I am in the neighborhood.
Reviewed by: kib Submitted by: kib[1] MFC after: 2 months
|
#
216443 |
|
14-Dec-2010 |
jkim |
Stop lying about supporting cpu_est_clockrate() when TSC is invariant. This function always returned the nominal frequency instead of current frequency because we use RDTSC instruction to calculate difference in CPU ticks, which is supposedly constant for the case. Now we support cpu_get_nominal_mhz() for the case, instead. Note it should be just enough for most usage cases because cpu_est_clockrate() is often times abused to find maximum frequency of the processor.
|
#
216312 |
|
08-Dec-2010 |
jkim |
Do not subtract 0.5% from estimated frequency if DELAY(9) is driven by TSC. Remove a confusing comment about converting to MHz as we never did.
|
#
216255 |
|
07-Dec-2010 |
kib |
Update some comments related to use of amd64 full context switch. In exec_linux_setregs(), use locally cached pointer to pcb to set pcb_full_iret. In set_regs(), note that full return is needed when code that sets segment registers is enabled.
MFC after: 1 week
|
#
216253 |
|
07-Dec-2010 |
kib |
Retire write-only PCB_FULLCTX pcb flag on amd64.
Reminded by: Petr Salinger <Petr.Salinger seznam cz> Tested by: pho MFC after: 1 week
|
#
216231 |
|
06-Dec-2010 |
kib |
Do not leak %rdx value in the previous image to the new image after execve(2). Note that ia32 binaries already handle this properly, since ia32_setregs() resets td_retval[1], but not exec_setregs().
We still do not conform to the amd64 ABI specification, since %rsp on the image startup is not aligned to 16 bytes.
PR: amd64/124134 Discussed with: Petr Salinger <Petr.Salinger seznam cz> (who convinced me that there are indeed several bugs) MFC after: 1 week
|
#
216012 |
|
28-Nov-2010 |
kib |
Calling fill_fpregs() for curthread is legitimate, and ELF coredump does this.
Reported and tested by: pho MFC after: 5 days
|
#
215865 |
|
26-Nov-2010 |
kib |
Remove npxgetregs(), npxsetregs(), fpugetregs() and fpusetregs() functions, they are unused. Remove 'user' from npxgetuserregs() etc. names.
For {npx,fpu}{get,set}regs(), always use pcb->pcb_user_save for FPU context storage. This eliminates the need for ugly copying with overwrite of the newly added and reserved fields in ucontext on i386 to satisfy alignment requirements for fpusave() and fpurstor().
pc98 version was copied from i386.
Suggested and reviewed by: bde Tested by: pho (i386 and amd64) MFC after: 1 week
|
#
214835 |
|
05-Nov-2010 |
jhb |
Adjust the order of operations in spinlock_enter() and spinlock_exit() to work properly with single-stepping in a kernel debugger. Specifically, these routines have always disabled interrupts before increasing the nesting count and restored the prior state of interrupts after decreasing the nesting count to avoid problems with a nested interrupt not disabling interrupts when acquiring a spin lock. However, trap interrupts for single-stepping can still occur even when interrupts are disabled. Now the saved state of interrupts is not saved in the thread until after interrupts have been disabled and the nesting count has been increased. Similarly, the saved state from the thread cannot be read once the nesting count has been decreased to zero. To fix this, use temporary variables to store interrupt state and shuffle it between the thread's MD area and the appropriate registers.
In cooperation with: bde MFC after: 1 month
|
#
214630 |
|
01-Nov-2010 |
jhb |
Move the <machine/mca.h> header to <x86/mca.h>.
|
#
213748 |
|
12-Oct-2010 |
jkim |
Remove trailing ", " from `sysctl machdep.idle_available' output.
|
#
213382 |
|
03-Oct-2010 |
kib |
The makectx() function, used by kdb_trap() to reconstruct pcb from trap frame when trap initiated kdb entry, incorrectly calculated the value of %rsp for trapped thread.
According to Intel(R) 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide, Part 1, rev. 035, 6.14.2 64-Bit Mode Stack Frame, "64-bit mode ... pushes SS:RSP unconditionally, rather than only on a CPL change." Even assuming the conditional push of the %ss:%rsp, the calculation was still wrong because sizeof(tf_ss) + sizeof(tf_rsp) == 16 on amd64.
Always use the tf_rsp from trap frame. The change supposedly fixes stepping when using kgdb backend for kdb.
Submitted by: Zhouyi Zhou <zhouzhouyi gmail com> PR: amd64/151167 Reviewed by: avg MFC after: 1 week
|
#
212541 |
|
13-Sep-2010 |
mav |
Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When the CPU is busy, interrupts are generated at the full rate of hz + stathz to fulfill scheduler and timekeeping requirements. But when the CPU is idle, only the minimum set of interrupts (down to 8 interrupts per second per CPU now) needed to handle scheduled callouts is generated. This allows a significant increase in idle CPU sleep time, increasing the effect of static power-saving technologies. It should also reduce host CPU load on virtualized systems when the guest system is idle.
There is a set of tunables, also available as writable sysctls, that control the behavior of the event timer subsystem:
kern.eventtimer.timer - selects the event timer hardware to use. On x86 there are up to 4 different kinds of timers. Depending on whether the chosen timer is per-CPU, the behavior of the other options differs slightly.
kern.eventtimer.periodic - selects between periodic and one-shot operation mode. In periodic mode, the current timer hardware is taken as the only source of time for time events. This mode is quite similar to the previous kernel behavior. One-shot mode instead uses the currently selected time counter hardware to schedule all needed events one by one and programs the timer to generate an interrupt exactly at the specified time. The default value depends on the chosen timer's capabilities, but one-shot mode is preferred unless another mode is forced by the user or hardware.
kern.eventtimer.singlemul - in periodic mode, specifies how many times higher the timer frequency should be, so that hardclock() and statclock() events do not strictly alias. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU receive every timer interrupt regardless of whether it is busy or not. By default this option is disabled. If the chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generated.
Since this patch modifies cpu_idle() on some platforms, I have also refactored the one on x86. Now it makes use of MONITOR/MWAIT instructions (if supported) under high sleep/wakeup rate, as a fast alternative to other methods. It allows the SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerpc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.
|
#
211924 |
|
28-Aug-2010 |
rpaulo |
Register an interrupt vector for DTrace return probes. There is some code missing in lapic to make sure that we don't overwrite this entry, but this will be done in a subsequent commit.
Sponsored by: The FreeBSD Foundation
|
#
209613 |
|
30-Jun-2010 |
jhb |
Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to <sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
|
#
209463 |
|
23-Jun-2010 |
kib |
Fix bugs on pc98, use npxgetuserregs() instead of npxgetregs() for get_fpcontext(), and npxsetuserregs() for set_fpcontext(). Also, note that usercontext is not initialized anymore in fpstate_drop().
Systematically replace references to npxgetregs() and npxsetregs() by npxgetuserregs() and npxsetuserregs() in comments.
Noted by: bde
|
#
209208 |
|
15-Jun-2010 |
kib |
Remove two obsoleted comments, add a note about 32bit compatibility.
MFC after: 1 month
|
#
209198 |
|
15-Jun-2010 |
kib |
Use critical sections instead of disabling local interrupts to ensure the consistency between PCPU fpcurthread and the state of the FPU.
Explicitly assert that the calling conventions for fpudrop() are adhered to. In cpu_thread_exit(), add missed critical section entrance.
Reviewed by: bde Tested by: pho MFC after: 1 month
|
#
208833 |
|
05-Jun-2010 |
kib |
Introduce the x86 kernel interfaces to allow kernel code to use FPU/SSE hardware. Caller should provide a save area that is chained into the stack of the areas; pcb save_area for usermode FPU state is on top. The pcb now contains a pointer to the current FPU saved area, used during FPUDNA handling and context switches. There is also a facility to allow the kernel thread to use pcb save_area.
Change the dreaded warnings "npxdna in kernel mode!" into the panics when FPU usage is not registered.
KPI discussed with: fabient Tested by: pho, fabient Hardware provided by: Sentex Communications MFC after: 1 month
|
#
208621 |
|
28-May-2010 |
jhb |
Defer initializing machine checks for the boot CPU until the local APIC is fully configured.
MFC after: 1 month
|
#
206553 |
|
13-Apr-2010 |
kib |
Change printf() calls to uprintf() for sigreturn() and trap() complaints about inaccessible or wrong mcontext, and for dreaded "kernel trap with interrupts disabled" situation. The latter is changed when trap is generated from user mode (shall never be ?).
Normalize the messages to include both pid and thread name.
MFC after: 1 week
|
#
205642 |
|
25-Mar-2010 |
nwhitehorn |
Change the arguments of exec_setregs() so that it receives a pointer to the image_params struct instead of several members of that struct individually. This makes it easier to expand its arguments in the future without touching all platforms.
Reviewed by: jhb
|
#
204309 |
|
25-Feb-2010 |
attilio |
Introduce the new kernel sub-tree x86 which should contain all the code shared and generalized between our current amd64, i386 and pc98.
This is just an initial step that should lead to a more complete effort. For the moment, a very simple porting of cpufreq modules, BIOS calls and the whole MD specific ISA bus part is added to the sub-tree but ideally a lot of code might be added and more shared support should grow.
Sponsored by: Sandvine Incorporated Reviewed by: emaste, kib, jhb, imp Discussed on: arch MFC: 3 weeks
|
#
202897 |
|
23-Jan-2010 |
alc |
Simplify the mapping of the system message buffer. Use the direct map just like ia64 does.
|
#
199253 |
|
13-Nov-2009 |
kib |
Amd64 init_secondary() calls initializecpu() while curthread is still not properly set up. r199067 added the call to TUNABLE_INT_FETCH() to initializecpu() that results in hang because AP are started when kernel environment is already dynamic and thus needs to acquire mutex, that is too early in AP start sequence to work.
Extract the code that should be executed only once, because it sets up global variables, from initializecpu() to initializecpucache(), and call the latter only from hammer_time() executed on BSP. Now, TUNABLE_INT_FETCH() is done only once at BSP at the early boot stage.
In collaboration with: Mykola Dzham <freebsd levsha org ua> Reviewed by: jhb Tested by: ed, battlez
|
#
198507 |
|
27-Oct-2009 |
kib |
In r197963, a race with thread being selected for signal delivery while in kernel mode, and later changing signal mask to block the signal, was fixed for sigprocmask(2) and pthread_exit(3). The same race exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.
Use kern_sigprocmask() instead of direct manipulation of td_sigmask to reschedule newly blocked signals, closing the race.
Reviewed by: davidxu Tested by: pho MFC after: 1 month
|
#
197410 |
|
22-Sep-2009 |
jhb |
- Split the logic to parse an SMAP entry out into a separate function on amd64 similar to i386. This fixes a bug on amd64 where overlapping entries would not cause the SMAP parsing to stop.
- Change the SMAP parsing code to do a sorted insertion into physmap[] instead of an append to support systems with out-of-order SMAP entries.
PR: amd64/138220 Reported by: James R. Van Artsdalen james of jrv org MFC after: 3 days
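The two behaviours named in r197410, stopping on overlap and inserting in sorted order, can be sketched as follows. This is an illustration only: `struct range` and `physmap_insert()` are hypothetical, the real physmap[] layout is not shown in this log, and merging of adjacent ranges is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;		/* exclusive */
};

/*
 * Insert [base, base + len) into map[], which holds *n ranges sorted by
 * start address. Returns 1 if parsing may continue, 0 if the caller should
 * stop (the entry overlaps an existing range, or the table is full).
 */
static int
physmap_insert(struct range *map, int *n, int max, uint64_t base,
    uint64_t len)
{
	uint64_t end;
	int i, j;

	assert(map != NULL && n != NULL && *n >= 0 && *n <= max);
	if (len == 0)
		return (1);		/* ignore empty entries */
	end = base + len;

	/* Find the first existing range that starts at or after 'base'. */
	for (i = 0; i < *n; i++)
		if (map[i].start >= base)
			break;

	/* An overlap with either neighbour stops the parse. */
	if (i > 0 && map[i - 1].end > base)
		return (0);
	if (i < *n && end > map[i].start)
		return (0);
	if (*n == max)
		return (0);		/* table full */

	/* Shift the tail up by one slot and drop the new range in place. */
	for (j = *n; j > i; j--)
		map[j] = map[j - 1];
	map[i].start = base;
	map[i].end = end;
	(*n)++;
	return (1);
}
```

With an append-only table, an out-of-order firmware map would leave the array unsorted; the insertion keeps it ordered regardless of the order in which entries arrive.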
|
#
196412 |
|
20-Aug-2009 |
jkim |
Check whether the SMBIOS reports reasonable amount of memory. If it is less than "avail memory", fall back to Maxmem to avoid user confusion. We use SMBIOS information to display "real memory" since r190599 but some broken SMBIOS implementation reported only half of actual memory.
Tested by: bz Approved by: re (kib)
|
#
196390 |
|
19-Aug-2009 |
ed |
Make the MacBookPro3,1 hardware boot again.
Tested by: Patrick Lamaiziere <patfbsd davenulle org> Approved by: re (kib)
|
#
196033 |
|
02-Aug-2009 |
ed |
Make the MacBook3,1 boot again.
Approved by: re (kib)
|
#
195907 |
|
27-Jul-2009 |
rpaulo |
Refine the MacBook hack to only match early models that have Intel ICH.
Discussed with: kjim Approved by: re (kib)
|
#
195486 |
|
09-Jul-2009 |
kib |
Restore the segment registers and segment base MSRs for amd64 syscall return path only when neither thread was context switched while executing syscall code nor syscall explicitly modified LDT or MSRs.
Save segment registers in trap handlers before interrupts are enabled, to not allow context switches to happen before registers are saved. Use separated byte in pcb for indication of fast/full return, since pcb_flags are not synchronized with context switches.
The change puts back syscall microbenchmark numbers that were slowed down after commit of the support for LDT on amd64.
Reviewed by: jeff Tested (and tested, and tested ...) by: pho Approved by: re (kensmith)
|
#
195410 |
|
06-Jul-2009 |
jhb |
MFi386: Add a 'show idt' command to DDB to display the non-default function pointers in the interrupt descriptor table.
Approved by: re (kensmith)
|
#
194784 |
|
23-Jun-2009 |
jeff |
Implement a facility for dynamic per-cpu variables.
- Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_*. Requires only one extra instruction more than PCPU_* and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined.
- Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available.
- Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent.
Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas
|
#
193804 |
|
09-Jun-2009 |
ariff |
Move C1E workaround into its own idle function. Previous workaround works only during initial booting process, while there are laptops/BIOSes that tend to act 'smarter' by force enabling C1E if the main power adapter is pulled out, rendering previous workaround ineffective. Given the fact that we still rely on local APIC to drive timer interrupt, this workaround should keep all Turion (probably Phenom too) X\d+ alive whether it's on battery power or not.
URL: http://lists.freebsd.org/pipermail/freebsd-acpi/2008-April/004858.html http://lists.freebsd.org/pipermail/freebsd-acpi/2008-May/004888.html
Tested by: Peter Jeremy <peterjeremy at optushome d com d au>
|
#
192323 |
|
18-May-2009 |
marcel |
Add cpu_flush_dcache() for use after non-DMA based I/O so that a possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet.
Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.
|
#
192050 |
|
13-May-2009 |
jhb |
Implement simple machine check support for amd64 and i386.
- For CPUs that only support MCE (the machine check exception) but not MCA (i.e. Pentium), all this does is print out the value of the machine check registers and then panic when a machine check exception occurs.
- For CPUs that support MCA (the machine check architecture), the support is a bit more involved.
- First, there is limited support for decoding the CPU-independent MCA error codes in the kernel, and the kernel uses this to output a short description of any machine check events that occur.
- When a machine check exception occurs, all of the MCx banks on the current CPU are scanned and any events are reported to the console before panic'ing.
- To catch events for correctable errors, a periodic timer kicks off a task which scans the MCx banks on all CPUs. The frequency of these checks is controlled via the "hw.mca.interval" sysctl.
- Userland can request an immediate scan of the MCx banks by writing a non-zero value to "hw.mca.force_scan".
- If any correctable events are encountered, the appropriate details are stored in a 'struct mca_record' (defined in <machine/mca.h>). The "hw.mca.count" is a count of such records and each record may be queried via the "hw.mca.records" tree by specifying the record index (0 .. count - 1) as the next name in the MIB similar to using PIDs with the kern.proc.* sysctls. The idea is to export machine check events to userland for more detailed processing.
- The periodic timer and hw.mca sysctls are only present if the CPU supports MCA.
Discussed with: emaste (briefly) MFC after: 1 month
|
#
190919 |
|
11-Apr-2009 |
ed |
Simplify in/out functions (for i386 and AMD64).
Remove a hack to generate more efficient code for port numbers below 0x100, which has been obsolete for at least ten years, because GCC has an asm constraint to specify that.
Submitted by: Christoph Mallon <christoph mallon gmx de>
|
#
190620 |
|
01-Apr-2009 |
kib |
Save and restore segment registers when entering and leaving the kernel on amd64. Fill and read segment registers for mcontext and signals. Handle traps caused by restoration of the invalidated selectors.
Implement user-mode creation and manipulation of the process-specific LDT descriptors for amd64, see sysarch(2).
Implement support for TSS i/o port access permission bitmap for amd64.
Context-switch LDT and TSS. Do not save and restore segment registers on the context switch, that is handled by kernel enter/leave trampolines now. Remove segment restore code from the signal trampolines for freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.
Implement amd64-specific compat shims for sysarch.
Linuxolator (temporary ?) switched to use gsbase for thread_area pointer.
TODO: Currently, gdb is not adapted to show segment registers from struct reg. Also, no machine-dependent ptrace command is added to set segment registers for debugged process.
In collaboration with: pho Discussed with: peter Reviewed by: jhb Linuxolator tested by: dchagin
|
#
190619 |
|
01-Apr-2009 |
kib |
Add separate gdt descriptors for %fs and %gs on amd64. Reorder amd64 gdt descriptors so that user-accessible selectors are the same as on i386. At least Wine hard-codes this into the binary.
In collaboration with: pho Reviewed by: jhb
|
#
190600 |
|
31-Mar-2009 |
jkim |
Fix an uninitialized variable from the previous commit.
|
#
190599 |
|
31-Mar-2009 |
jkim |
Probe size of installed memory modules from loader and display it as 'real memory' instead of Maxmem if the value is available. Note amd64 displayed physmem as 'usable memory' since machdep.c r1.640 to unconfuse users. Now it is consistent across amd64 and i386 again. While I am here, clean up smbios.c a bit and update copyright date.
Reviewed by: jhb
|
#
190447 |
|
26-Mar-2009 |
kib |
Convert gdt_segs and ldt_segs initialization to C99 style.
Reviewed by: jhb
|
#
189699 |
|
11-Mar-2009 |
dfr |
Merge in support for Xen HVM on amd64 architecture.
|
#
189423 |
|
05-Mar-2009 |
jhb |
A better fix for handling different FPU initial control words for different ABIs:
- Store the FPU initial control word in the pcb for each thread.
- When first using the FPU, load the initial control word after restoring the clean state if it is not the standard control word.
- Provide a correct control word for Linux/i386 binaries under FreeBSD/amd64.
- Adjust the control word returned for fpugetregs()/npxgetregs() when a thread hasn't used the FPU yet to reflect the real initial control word for the current ABI.
- The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control word instead of trashing whatever the current state of the FPU is.
Reviewed by: bde
|
#
188065 |
|
03-Feb-2009 |
jkoshy |
Improve robustness of NMI handling, for NMIs recognized in kernel mode.
- Make the NMI handler run on its own stack (TSS_IST2).
- Store the GSBASE value for each CPU just before the start of each NMI stack, permitting efficient retrieval using %rsp-relative addressing.
- For NMIs taken from kernel mode, program MSR_GSBASE explicitly since one or both of MSR_GSBASE and MSR_KGSBASE can be potentially invalid. The current contents of MSR_GSBASE are saved and restored at exit.
- For NMIs handled from user mode, continue to use 'swapgs' to load the per-CPU GSBASE.
Reviewed by: jeff Debugging help: jeff Tested by: gnn, Artem Belevich <artemb at gmail dot com>
|
#
182868 |
|
08-Sep-2008 |
kib |
The pcb_gs32p should be per-cpu, not per-thread pointer. This is location in GDT where the segment descriptor from pcb_gs32sd is copied, and the location is in GDT local to CPU.
Noted and reviewed by: peter MFC after: 1 week
|
#
182865 |
|
08-Sep-2008 |
kib |
Fix inconsistencies in the comments.
MFC after: 1 week
|
#
182684 |
|
02-Sep-2008 |
kib |
- When executing FreeBSD/amd64 binaries from FreeBSD/i386 or Linux/i386 processes, clear PCB_32BIT and PCB_GS32BIT bits [1].
- Reread the fs and gs bases from the msr unconditionally, not believing the values in pcb_fsbase and pcb_gsbase, since usermode may reload segment registers, invalidating the cache. [2].
Both problems resulted in the wrong fs base, causing the wrong tls pointer to be dereferenced in usermode.
Reported and tested by: Vyacheslav Bocharov <adeepv at gmail com> [1] Reported by: Bernd Walter <ticsoat cicely7 cicely de>, Artem Belevich <fbsdlist at src cx>[2] Reviewed by: peter MFC after: 3 days
|
#
180393 |
|
09-Jul-2008 |
peter |
Band-aid a problem with 32 bit selector setup.
Initialize %ds, %es, and %fs during CPU startup. Otherwise a garbage value could leak to a 32-bit process if a process migrated to a different CPU after exec and the new CPU had never exec'd a 32-bit process.
A more complete fix is needed, but this mitigates the most frequent manifestations.
Obtained from: ups
|
#
178471 |
|
25-Apr-2008 |
jeff |
- Add an integer argument to idle to indicate how likely we are to wake from idle over the next tick.
- Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are suspended in cpu specific states. This function can fail and cause the scheduler to fall back to another mechanism (ipi).
- Implement support for mwait in cpu_idle() on i386/amd64 machines that support it. mwait is a higher performance way to synchronize cpus as compared to hlt & ipis.
- Allow selecting the idle routine by name via sysctl machdep.idle. This replaces machdep.cpu_idle_hlt. Only idle routines supported by the current machine are permitted.
Sponsored by: Nokia
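Selecting an idle routine by name, while refusing routines the machine does not support, can be sketched with a small lookup table. The table contents, the stub routines and `idle_select()` are hypothetical and only illustrate the rule stated in r178471; they are not the kernel's actual table or sysctl handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int last_idle;	/* records which stub ran, for illustration */

static void idle_spin(void)  { last_idle = 1; }
static void idle_hlt(void)   { last_idle = 2; }
static void idle_mwait(void) { last_idle = 3; }

struct idle_entry {
	const char *name;
	void (*fn)(void);
	int supported;		/* would be probed from CPU features */
};

static struct idle_entry idle_tbl[] = {
	{ "spin",  idle_spin,  1 },
	{ "hlt",   idle_hlt,   1 },
	{ "mwait", idle_mwait, 0 },	/* pretend this CPU lacks MWAIT */
};

static void (*cpu_idle_fn)(void) = idle_hlt;

/* Switch the idle routine by name; unknown or unsupported names fail. */
static int
idle_select(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(idle_tbl) / sizeof(idle_tbl[0]); i++) {
		if (!idle_tbl[i].supported)
			continue;
		if (strcmp(idle_tbl[i].name, name) == 0) {
			cpu_idle_fn = idle_tbl[i].fn;
			return (0);
		}
	}
	return (-1);
}
```

A failed selection leaves the previously chosen routine in place, so a mistyped name cannot leave the system without an idle routine.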
|
#
178429 |
|
22-Apr-2008 |
phk |
Now that all platforms use genclock, shuffle things around slightly for better structure.
Much of this is related to <sys/clock.h>, which should really have been called <sys/calendar.h>, but unless and until we need the name, the repocopy can wait.
In general the kernel does not know about minutes, hours, days, timezones, daylight savings time, leap-years and such. All that is theoretically a matter for userland only.
Parts of the kernel code do however care: badly designed filesystems store timestamps in local time and RTC chips almost universally track time in a YY-MM-DD HH:MM:SS format, and sometimes in local timezone instead of UTC. For this we have <sys/clock.h>.
<sys/time.h> on the other hand, deals with time_t, timeval, timespec and so on. These know only seconds and fractions thereof.
Move inittodr() and resettodr() prototypes to <sys/time.h>. Retain the names as it is one of the few surviving PDP/VAX references.
Move startrtclock() to <machine/clock.h> on relevant platforms; it is an MD call between machdep.c/clock.c. Remove references to it elsewhere.
Remove a lot of unnecessary <sys/clock.h> includes.
Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs. XXX: should be kern.disable_rtc_set really, it's not MD.
|
#
178314 |
|
19-Apr-2008 |
peter |
Put in a real isa_irq_pending() stub in order to remove two lines of dmesg noise from sio per unit. sio likes to probe if interrupts are configured correctly by looking at the pending bits of the atpic in order to put a non-fatal warning on the console. I think I'd rather read the pending bits from the apics, but I'm not sure it's worth the hassle.
|
#
177253 |
|
16-Mar-2008 |
rwatson |
In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
MFC after: 1 month
Discussed with: imp, rink
|
#
177145 |
|
13-Mar-2008 |
kib |
Since version 4.3, gcc has changed its behaviour concerning the i386/amd64 ABI and the direction flag: it now assumes that the direction flag is cleared at function entry and no longer clears it again itself. This new behaviour conforms to the i386/amd64 ABI.
Modify the signal handler frame setup code to clear the DF {e,r}flags bit on the amd64/i386 for the signal handlers.
jhb@ noted that it might break old apps if they assumed DF == 1 would be preserved in the signal handlers, but that such apps should be rare and that older versions of gcc would not generate such apps.
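The effect on the flags a signal handler starts with can be sketched as below. This is only an illustration under stated assumptions, not the committed code: `DEMO_PSL_D` and `demo_sigframe_flags()` are hypothetical names, and the only hard fact used is that the direction flag is bit 10 (0x400) of {e,r}flags.

```c
#include <stdint.h>

/* Direction flag: bit 10 of {e,r}flags on x86. The macro name is illustrative. */
#define DEMO_PSL_D	0x00000400u

/*
 * Hypothetical helper: given the flags the interrupted code was running
 * with, return the flags a signal handler should start with. Only DF is
 * cleared; every other bit is passed through unchanged.
 */
static inline uint64_t
demo_sigframe_flags(uint64_t saved_flags)
{
	return (saved_flags & ~(uint64_t)DEMO_PSL_D);
}
```

The interrupted context keeps its own DF value in the saved frame, so only the handler's entry state changes.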
Submitted by: Aurelien Jarno <aurelien aurel32 net>
PR: 121422
Reviewed by: jhb
MFC after: 2 weeks
|
#
177091 |
|
12-Mar-2008 |
jeff |
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
|
#
174898 |
|
25-Dec-2007 |
rwatson |
Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run.
Assign approximate why values to all current consumers of the kdb_enter() interface.
|
#
174557 |
|
12-Dec-2007 |
rpaulo |
Disallow the legacy USB circuit to generate an SMI# via an ICH register (MacBooks only). This allows MacBooks to boot in SMP mode without any trick and solves the timer problems with HZ=1000.
MFC after: 1 week
Reviewed by: njl (mentor), jhb
Approved by: njl (mentor), jhb
|
#
173659 |
|
15-Nov-2007 |
jhb |
Add support for cross double fault frames in stack traces:
- Populate the register values for the trapframe put on the stack by the double fault handler.
- Teach DDB's trace routine to treat a double fault like other trap frames.
MFC after: 3 days
|
#
173361 |
|
05-Nov-2007 |
kib |
Fix for the panic("vm_thread_new: kstack allocation failed") and the silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when kmem_alloc_nofault() fails to allocate address space. Both functions now return an error instead of panicking or dereferencing NULL.
As a consequence, vmspace_exec() and vmspace_unshare() return an errno int. A struct vmspace argument was added to vm_forkproc() to avoid dealing with a failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in thread_alloc(), which itself may return NULL. Also, allocation of the first process thread is performed in fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup(), called from fork1(), and proc_linkup0(), which is used to set up the kernel process (formerly known as swapper).
In collaboration with: Peter Holm
Reviewed by: jhb
|
#
173118 |
|
28-Oct-2007 |
jhb |
- Add constants for the different memory types in the SMAP table.
- Use the SMAP types and constants from <machine/pc/bios.h> in the boot code rather than duplicating it.
|
#
170368 |
|
06-Jun-2007 |
davidxu |
Backout experimental adaptive-spin umtx code.
|
#
170307 |
|
04-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
- Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
#
170253 |
|
03-Jun-2007 |
alc |
Add the machine-specific definitions for configuring the new physical memory allocator.
Set the size of phys_avail[] and dump_avail[] using one of these definitions.
Approved by: re
|
#
170170 |
|
31-May-2007 |
attilio |
Revert the introduction of the VMCNT_* operations. A general approach is probably not the best solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc
Approved by: jeff (mentor)
|
#
169667 |
|
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details, and it also switches all counters to atomics. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
168035 |
|
29-Mar-2007 |
jkim |
MFP4: Linux set_thread_area syscall (aka TLS) support for amd64.
Initial version was submitted by Divacky Roman and mostly rewritten by me.
Tested by: emulation
|
#
166283 |
|
27-Jan-2007 |
jkoshy |
Use a known good stack at the time of servicing an NMI --- reuse the space allocated for the double fault handler since this space is otherwise unused till the time a double fault occurs.
This change should have been committed alongside r1.127 of "exception.S", but I somehow missed doing so.
Problem reported by: jeff Pointy hat to: jkoshy
|
#
166186 |
|
23-Jan-2007 |
bde |
Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported.
Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers.
This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.
|
#
165369 |
|
20-Dec-2006 |
davidxu |
Add a lwpid field to the per-cpu structure; the lwpid represents the id of the thread currently running on each cpu. This allows us to add in-kernel adaptive spin for user-level mutexes. While spinning in user space is possible, without the correct thread running state exported from the kernel it can hardly be implemented efficiently without wasting cpu cycles; however, exporting the thread running state is unlikely to be implemented soon, as interfaces have to be designed and stabilized. This implementation is transparent to user space and can be disabled dynamically. With this change, the performance of a mutex ping-pong program is improved massively on SMP machines. Performance of the mysql super-smack select benchmark is increased by about 7% on an Intel dual dual-core2 Xeon machine, which indicates that on systems with a bunch of cpus and low system-call overhead (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance.
Added sysctls:
kern.threads.umtx_dflt_spins: if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used as the spin cycle count.
kern.threads.umtx_max_spins: the sysctl sets the upper limit of the spin cycle count.
Tested on: Athlon64 X2 3800+, Dual Xeon 5130
|
#
164951 |
|
06-Dec-2006 |
sobomax |
Allow machdep.cpu_idle_hlt to be set from the loader. This should make it possible to work around the problem with SMP kernels on Turion64 X2 processors described in kern/104678 and may be useful in other situations too.
MFC after: 3 days
|
#
164936 |
|
06-Dec-2006 |
julian |
Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent.. Specifically, remove:
- Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated.
- All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
|
#
164413 |
|
19-Nov-2006 |
alc |
The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)
|
#
164365 |
|
17-Nov-2006 |
jhb |
Add support for 8 byte hardware watches in long mode. Kernel hardware watches support 8 byte watches. For userland, we disallow 8 byte watches for 32-bit tasks.
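The length rule for userland requests described above can be sketched as a small predicate. This is a minimal illustration under stated assumptions, not the kernel's actual check: `demo_watch_len_ok()` and its parameters are hypothetical, and the accepted short lengths (1, 2 and 4 bytes) are the usual x86 debug-register sizes.

```c
#include <stdbool.h>

/*
 * Hypothetical validity check for a userland hardware watch length.
 * 1, 2 and 4 byte watches are accepted for every task; an 8 byte watch
 * is accepted only when the task is not a 32-bit one.
 */
static inline bool
demo_watch_len_ok(int len, bool is_32bit_task)
{
	switch (len) {
	case 1:
	case 2:
	case 4:
		return (true);
	case 8:
		return (!is_32bit_task);
	default:
		return (false);
	}
}
```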
|
#
164362 |
|
17-Nov-2006 |
jhb |
- Add macro constants for the various fields in %dr7 and use them in place of various scattered magic values.
- Pretty print the address of hardware watchpoints in 'show watch' rather than just displaying hex.
- Expand address field width on amd64 for 64-bit pointers.
|
#
164303 |
|
15-Nov-2006 |
jhb |
Various whitespace and style fixes.
|
#
164078 |
|
07-Nov-2006 |
ru |
Spelling.
|
#
164077 |
|
07-Nov-2006 |
ru |
Line up memory amount reporting that got broken when s/real/usable/.
|
#
164064 |
|
07-Nov-2006 |
jhb |
Remove duplicate IDTVEC macro definition, it's already defined in <machine/intr_machdep.h>.
|
#
163709 |
|
26-Oct-2006 |
jb |
Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).
Reviewed by: davidxu@
|
#
163267 |
|
12-Oct-2006 |
jhb |
Fix nodevice atpic compile.
Pointy hat to: jhb
|
#
163219 |
|
10-Oct-2006 |
jhb |
Change the x86 interrupt code to suspend/resume interrupt controllers (PICs) rather than interrupt sources. This allows interrupt controllers with no interrupt pins (such as the 8259As when APIC is in use) to participate in suspend/resume.
- Always register the 8259A PICs even if we don't use any of their pins.
- Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't included.
- Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC on resume. This gets suspend/resume working with APIC on UP systems. SMP still needs more work to bring the APs back to life.
The MFC after is tentative.
Tested by: anholt (i386)
Submitted by: Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3)
MFC after: 1 week
|
#
162958 |
|
02-Oct-2006 |
phk |
Second part of a little cleanup in the calendar/timezone/RTC handling.
Split subr_clock.c in two parts (by repo-copy):
subr_clock.c contains generic RTC and calendaric stuff, etc.
subr_rtc.c contains the newbus'ified RTC interface.
Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock} sysctls and associated variables into subr_clock.c. They are not machine dependent and we have generic code that relies on them being present, so they are not even optional.
|
#
162954 |
|
02-Oct-2006 |
phk |
First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.
Use libkern's much more time- & space-efficient BCD routines.
|
#
162112 |
|
07-Sep-2006 |
jhb |
Use a single constant to define the sizes of the physmap[], phys_avail[], and dump_avail[] arrays so they are in sync (previously it was possible to store more entries in the physmap[] than we could store in phys_avail[], which was pointless). While I'm here, bump up the length of these tables to hold 30 entries on amd64 and 16 on i386. This allows machines with fairly fragmented memory maps to boot ok (at least one machine would not boot FreeBSD/i386 but would boot FreeBSD/amd64 because amd64 allowed for more fragments).
MFC after: 3 days
|
#
160763 |
|
27-Jul-2006 |
jhb |
Don't allow MAXMEM or hw.physmem to extend the top of memory if our memory map was obtained from the SMAP. SMAP is trustworthy, and the memory extending feature is a band-aid for older systems where FreeBSD's methods of detecting memory were not always trustworthy. This fixes the issue where using hw.physmem could result in the ACPI tables getting trashed breaking ACPI.
MFC after: 3 days
Tested on: i386
|
#
159782 |
|
19-Jun-2006 |
davidxu |
MFi386: Use the method described in IA-32 Intel Architecture Software Developer's Manual chapter 11.6.6 to get valid mxcsr bits, use the mxcsr mask to clear invalid bits passed by user code.
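The masking step can be sketched as below. This is an illustrative sketch, not the committed code: the function and macro names are hypothetical. It assumes the rule from the Intel manual that the MXCSR_MASK field of the FXSAVE image gives the writable bits, and that a zero field means the default mask 0x0000FFBF applies.

```c
#include <stdint.h>

/* Default mask to use when the FXSAVE image reports MXCSR_MASK as zero. */
#define DEMO_MXCSR_DEFAULT_MASK	0x0000FFBFu

/* Pick the effective mask from the MXCSR_MASK field of an FXSAVE image. */
static inline uint32_t
demo_mxcsr_mask(uint32_t fxsave_mxcsr_mask)
{
	return (fxsave_mxcsr_mask != 0 ? fxsave_mxcsr_mask :
	    DEMO_MXCSR_DEFAULT_MASK);
}

/* Clear any bits in a user-supplied mxcsr that the CPU does not allow. */
static inline uint32_t
demo_sanitize_mxcsr(uint32_t user_mxcsr, uint32_t fxsave_mxcsr_mask)
{
	return (user_mxcsr & demo_mxcsr_mask(fxsave_mxcsr_mask));
}
```

Clearing the invalid bits matters because loading an mxcsr value with reserved bits set raises a general protection fault.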
|
#
158445 |
|
11-May-2006 |
phk |
Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
|
#
156706 |
|
14-Mar-2006 |
jhb |
Don't allow userland to set hardware watch points on kernel memory at all. Previously, we tried to allow this only for root. However, we were calling suser() on the *target* process rather than the current process. This means that if you can ptrace() a process running as root you can set a hardware watch point in the kernel. In practice I think you probably have to be root in order to pass the p_candebug() checks in ptrace() to attach to a process running as root anyway. Rather than fix the suser(), I just axed the entire idea, as I can't think of any good reason _at all_ for userland to set hardware watch points for KVM.
MFC after: 3 days
Also thinks hardware watch points on KVM from userland are bad: bde, rwatson
|
#
156694 |
|
13-Mar-2006 |
peter |
Cosmetic sync with i386
|
#
155239 |
|
03-Feb-2006 |
davidxu |
MFi386: Clear carry flag in get_mcontext so that setcontext does not return a bogus error.
|
#
152753 |
|
24-Nov-2005 |
ru |
Add missing "struct" in i386/i386/machdep.c,v 1.497 by deischen@.
|
#
152651 |
|
21-Nov-2005 |
jhb |
Expand the hack to mask the atpics if 'device atpic' is not in the kernel during boot up. Now we do a full reset of the 8259As and setup a simple interrupt handler (we actually borrow the apic one that just does an immediate iret) to handle any spurious interrupts triggered by either chip. This should fix some folks that were getting a Trap 30 during bootup of certain SMP AMD systems. This might get pushed into the 6.0 branch as an errata. For now a suitable workaround is to add 'device atpic' to your kernel config.
Tested by: scottl
Helpful info from: dillon
MFC after: 1 week
|
#
151719 |
|
26-Oct-2005 |
peter |
Change PHYSMAP_SIZE to allow for more memory segments. The old value was too low for certain Dell amd64 machines.
|
#
151429 |
|
17-Oct-2005 |
davidxu |
Micro optimization for context switch. Eliminate the code for saving gs.base and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase when the user sets them, so the context switch routine only needs to write them into the registers; it never has to read them back from the registers when a thread is switched away. Since rdmsr is a serializing instruction, a micro benchmark shows this is worth doing.
Reviewed by: peter, jhb
|
#
151316 |
|
14-Oct-2005 |
davidxu |
1. Change the prototypes of trapsignal and sendsig to use ksiginfo_t *. Most changes in MD code are trivial: before this change, trapsignal and sendsig used discrete parameters; now they use member fields of the ksiginfo_t structure. For sendsig, this change allows us to pass a POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo; it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to the proc structure to hold shared signals which were blocked by all threads in the proc.
4. Add td_sigqueue to the thread structure to hold all signals delivered to the thread.
5. i386 and amd64 now return POSIX standard si_code; other arches will be fixed.
6. In this sigqueue implementation, the pending signal set is kept as before, and an extra siginfo list holds additional siginfo_t data for signals. Kernel code using psignal() still behaves as before and will not fail even under memory pressure; the only exception is when deleting a signal, where sigqueue_delete should be called to remove the signal from the sigqueue instead of SIGDELSET. Currently no kernel code delivers a signal with additional data, so the kernel should be as stable as before. A ksiginfo can carry more information, for example to allow a signal to be delivered but throw away the siginfo data if there is not enough memory. SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot be caught or masked. The sigqueue() syscall allows user code to queue a signal to a target process; if resources are unavailable, EAGAIN is returned as the specification says. Just before a thread exits, its signal queue memory is freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
|
#
150638 |
|
27-Sep-2005 |
peter |
Don't report Maxmem as 'real memory'. It is really the highest address available and can give the wrong impression when there are memory holes. Report the total amount of usable memory that we detected instead of the highest address.
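The difference between the highest address and the usable total can be sketched by summing the segments of a start/end pair array. This is a minimal illustration, assuming a `phys_avail[]`-style layout in which entries come in (start, end) pairs and the list stops at the first pair whose end is zero; `demo_usable_bytes()` is a hypothetical helper, not the code that was committed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the sizes of all segments in an array of (start, end) physical
 * address pairs. The array is terminated by a pair whose end is zero.
 * Holes between segments are not counted, unlike a "highest address"
 * figure such as Maxmem.
 */
static inline uint64_t
demo_usable_bytes(const uint64_t *avail)
{
	uint64_t total = 0;

	for (size_t i = 0; avail[i + 1] != 0; i += 2)
		total += avail[i + 1] - avail[i];
	return (total);
}
```

With two segments separated by a hole, the sum is smaller than the end of the last segment, which is the impression the commit wants to correct.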
|
#
150635 |
|
27-Sep-2005 |
peter |
Don't let the upper bits of %dr6/%dr7 get set.
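One way to enforce this is to reject any request that has bits set in the upper half of either register. The sketch below assumes rejection with EINVAL; the log does not say whether the kernel rejects or masks such values, and `demo_check_dr67()` is a hypothetical name.

```c
#include <errno.h>
#include <stdint.h>

/* Upper 32 bits of a 64-bit debug register value. */
#define DEMO_DR_UPPER_MASK	0xffffffff00000000ULL

/*
 * Hypothetical check: return EINVAL if either %dr6 or %dr7 has any of
 * its upper 32 bits set, 0 otherwise.
 */
static inline int
demo_check_dr67(uint64_t dr6, uint64_t dr7)
{
	if (((dr6 | dr7) & DEMO_DR_UPPER_MASK) != 0)
		return (EINVAL);
	return (0);
}
```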
Submitted by: Nate Eldredge <neldredge@math.ucsd.edu>
|
#
147671 |
|
29-Jun-2005 |
peter |
Switch AMD64 and i386 platforms to using ELF as their kernel crash dump format. The key reason to do this is so that we can dump sparse address space. For example, we need to be able to skip the PCI hole just below the 4GB boundary. Trying to destructively dump MMIO device registers is Really Bad(TM). The frequent result of trying to do a crash dump on a machine with 4GB or more ram was ugly (lockup or reboot).
This code has been taken directly from the IA64 dump_machdep.c code, with just a few (mostly minor) mods.
Introduce a dump_avail[] array in the machdep.c code so that we have a source of truth for what memory is present in a machine that needs to be dumped. We can't use phys_avail[] because all sorts of things slice memory out of it that we really need to dump. eg: the vm page array and the dmesg buffer. dump_avail[] is pretty much an unmolested version of phys_avail[]. It does have Maxmem correction.
Bump the i386 and amd64 dump format to version 2, but nothing actually uses this. amd64 was actually using the i386 dump version number.
libkvm support to follow.
Approved by: re
|
#
147569 |
|
23-Jun-2005 |
peter |
Various trivial comment fixes
Approved by: re
|
#
145911 |
|
05-May-2005 |
peter |
Remove unused (besides being initialized) variable.
|
#
145889 |
|
04-May-2005 |
davidxu |
Turn on PCB_FULLCTX in set_regs to fully restore context set by debugger.
|
#
144696 |
|
05-Apr-2005 |
cperciva |
Fully initialize the required TSS fields so that the io permission bitmap is set correctly.
Patch from: peter
Security: FreeBSD-SA-05:03.amd64
|
#
144637 |
|
04-Apr-2005 |
jhb |
Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any effect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case.
Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch.
This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example).
Reviewed by: grehan, cognet, arch@, others
Tested on: i386, alpha, sparc64, powerpc, arm, possibly more
|
#
143162 |
|
05-Mar-2005 |
des |
MFi386: use TUNABLE_ULONG_FETCH to retrieve hw.physmem.
|
#
143159 |
|
05-Mar-2005 |
des |
Replace goto with continue.
|
#
142866 |
|
01-Mar-2005 |
obrien |
Catch up with the "physical memory" sysctl change. (MFi386: rev 1.608)
|
#
141378 |
|
05-Feb-2005 |
njl |
Finish the job of sorting all includes and fix the build by including malloc.h before proc.h on sparc64. Noticed by das@
Compiled on: alpha, amd64, i386, pc98, sparc64
|
#
141374 |
|
05-Feb-2005 |
njl |
Make cpu_est_clockrate() more accurate by disabling interrupts for the millisecond it is calibrating. Suggested by jhb@ and bde@. Don't clobber the tsc_freq with the new value since it isn't accurate enough for timecounters and the timecounter system as a whole needs support for changing rates before we do this. Subtract 0.5% from our measurement to account for overhead in DELAY. Note that this interface is for estimating the clockrate and needs to work well at runtime so doing a full calibration including disabling interrupts for a second is not feasible.
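The 0.5% correction can be sketched as integer arithmetic on the tick count measured over the one-millisecond window. This is an illustration under stated assumptions: `demo_est_clockrate()` is a hypothetical helper, and its input is assumed to be the number of timestamp-counter ticks observed during that millisecond.

```c
#include <stdint.h>

/*
 * Convert a tick count measured over one millisecond into an estimate
 * in Hz, then subtract 0.5% to account for overhead in the delay loop.
 * ticks * 1000 scales to one second; ticks * 5 is 0.5% of that value.
 */
static inline uint64_t
demo_est_clockrate(uint64_t ticks_per_ms)
{
	return (ticks_per_ms * 1000 - ticks_per_ms * 5);
}
```

For example, 2,000,000 ticks in the window gives 1,990,000,000 Hz instead of a nominal 2 GHz.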
|
#
141237 |
|
04-Feb-2005 |
njl |
Add an implementation of cpu_est_clockrate(9). This function estimates the current clock frequency for the given CPU id in units of Hz.
|
#
140555 |
|
21-Jan-2005 |
peter |
JumboMFi386: use bitmapped IPI handler. Update elcr and default mptable config handler. Tidy up various local apic initialization.
|
#
138208 |
|
29-Nov-2004 |
peter |
MFi386: join the %cr0 setup line now that i386 has lost the I386 ifdefs.
|
#
138129 |
|
27-Nov-2004 |
das |
Don't include sys/user.h merely for its side-effect of recursively including other headers.
|
#
137912 |
|
20-Nov-2004 |
das |
U areas are going away, so don't allocate one for process 0.
Reviewed by: arch@
|
#
137012 |
|
28-Oct-2004 |
simokawa |
MFi386: preserve dcons buffer passed by loader.
|
#
135691 |
|
23-Sep-2004 |
peter |
Like on i386, use the definition of struct bios_smap from machine/pc/bios.h again.
|
#
134791 |
|
05-Sep-2004 |
julian |
Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces.
The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time.
The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own.
Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens.
Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure.
The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring.
A scheduler call sched_set_concurrency(kg, N) has been added that notifies the scheduler that no more than N threads from that ksegrp should be allowed to be concurrently scheduled. This is also used to enforce 'fairness' at this time so that a ksegrp with 10000 threads can not swamp the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventually develop their own methods to do this now that they are effectively separated.
Rejig libthr's kernel interface to follow the same code paths as libkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can.
Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step.
Reviewed by: scottl, peter
MFC after: 1 week
|
#
134232 |
|
23-Aug-2004 |
peter |
Oops, I forgot to have the idle loop call mp_grab_cpu_hlt() on the amd64 SMP case.
|
#
133903 |
|
16-Aug-2004 |
peter |
Sync with i386 - add foot shooting protection for the DDB/KDB thing.
|
#
133431 |
|
10-Aug-2004 |
davidxu |
As AMD64 architecture volume 1 chapter 3.1.2 says, the high 32 bits of %rflags are reserved; they can be written with anything, but they always read as zero. We should simulate this in set_regs() as we are reading/writing the real hardware %rflags register.
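The simulated behaviour amounts to discarding the upper half of whatever value is supplied. A minimal sketch, with `demo_rflags_as_read()` as a hypothetical name rather than anything taken from set_regs():

```c
#include <stdint.h>

/*
 * Model how hardware treats %rflags: any value may be written, but the
 * high 32 bits always read back as zero.
 */
static inline uint64_t
demo_rflags_as_read(uint64_t written)
{
	return (written & 0x00000000ffffffffULL);
}
```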
|
#
133194 |
|
06-Aug-2004 |
markm |
MFi386: sort out the mem device. Grrrr.
|
#
132924 |
|
31-Jul-2004 |
davidxu |
Turn on PCB_FULLCTX for set_mcontext; functions like kse_switchin need to fully restore asynchronous context which did not come from a fast syscall.
|
#
132088 |
|
13-Jul-2004 |
davidxu |
Add ptrace_clear_single_step(); alpha has had it for years. The function will be used by ptrace to clear a thread's single step state.
|
#
131941 |
|
10-Jul-2004 |
marcel |
Update for the KDB framework:
o Make debugging support conditional upon KDB instead of DDB.
o Remove implementation of Debugger().
o Don't make setjump() and longjump() conditional upon DDB.
o s/ddb_on_nmi/kdb_on_nmi/g
o Call kdb_reenter() when kdb_active is non-zero. Call kdb_trap() otherwise.
|
#
131905 |
|
10-Jul-2004 |
marcel |
Implement makectx(). The makectx() function is used by KDB to create a PCB from a trapframe for purposes of unwinding the stack. The PCB is used as the thread context and all but the thread that entered the debugger has a valid PCB. This function can also be used to create a context for the threads running on the CPUs that have been stopped when the debugger got entered. This however is not done at the time of this commit.
|
#
131775 |
|
07-Jul-2004 |
peter |
MFi386: fix up CR0 settings
|
#
130344 |
|
11-Jun-2004 |
phk |
Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
|
#
130312 |
|
10-Jun-2004 |
jhb |
Remove atdevbase and replace its remaining uses with direct references to KERNBASE instead.
|
#
130224 |
|
07-Jun-2004 |
peter |
Initial PG_NX support (no-execute page bit)
- export the rest of the cpu features (and amd's features).
- turn on EFER_NXE, depending on the NX amd feature bit
- reorg the identcpu stuff a bit in order to stop treating the amd features as second class features (since it is now a primary feature bit set) and make it easier to export.
|
#
129412 |
|
18-May-2004 |
peter |
Unbreak builds without DDB. Bad Bruce! No cookie! :-)
|
#
129373 |
|
18-May-2004 |
bde |
Fixed DDB_NOKLDSYM on amd64's:
machdep.c: Initialize the symbol table pointers, not quite like for other arches.
db_elf.c: Don't claim to be an i486 in the fake ELF header.
|
#
126732 |
|
07-Mar-2004 |
peter |
MFi386: set initial curpcb pcpu variable at startup time rather than waiting for a context switch
|
#
126246 |
|
25-Feb-2004 |
peter |
Since we don't use PG_NX yet, don't turn on EFER_NXE quite yet. This needs to be done based on the cpuid bits. AMD says that we should test the cpuid features bits for certain things, such as this.
|
#
125183 |
|
28-Jan-2004 |
peter |
Re-add debug register support. Some other minor tweaks snuck in here, including supporting more discontiguous memory segments and some cosmetic tweaks.
|
#
124092 |
|
03-Jan-2004 |
davidxu |
Make sigaltstack per-thread, because per-process sigaltstack state is useless for threaded programs; multiple threads can not share the same stack. The alternative signal stack is private to the thread, so no lock is needed. The original P_ALTSTACK is now moved into td_pflags and renamed to TDP_ALTSTACK. For single-threaded or Linux clone() based threaded programs, there is no semantic change, because those programs only have one kernel thread in every process.
Reviewed by: deischen, dfr
|
#
123180 |
|
06-Dec-2003 |
peter |
Various whitespace and cosmetic sync-up's with i386.
Approved by: re (scottl)
|
#
122930 |
|
20-Nov-2003 |
peter |
Provide a streamlined '#define curthread __curthread()' for amd64 to avoid the compiler having to parse and optimize the PCPU_GET(curthread) so often. __curthread() is an inline optimized version of PCPU_GET(curthread) that knows that pc_curthread is at offset zero in the pcpu struct. Add a CTASSERT() to catch any possible changes to this. This accounts for just over a 1% wall clock speedup for total kernel compile/link time, and 20% compile time speedup on some specific files depending on which compile options are used.
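The offset-zero assumption and the compile-time guard can be sketched in portable C as below. This is an illustration only: the `demo_` structures are hypothetical stand-ins for the real pcpu and thread structures, `_Static_assert` stands in for CTASSERT(), and the real `__curthread()` reads the pointer through the per-CPU segment rather than through an explicit base pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's thread and pcpu structures. */
struct demo_thread {
	int	td_tid;
};

struct demo_pcpu {
	struct demo_thread *pc_curthread;	/* must stay the first member */
	int	pc_cpuid;
};

/* Compile-time guard, playing the role CTASSERT() plays in the commit. */
_Static_assert(offsetof(struct demo_pcpu, pc_curthread) == 0,
    "pc_curthread must be at offset zero");

/*
 * Because pc_curthread sits at offset zero, the address of the pcpu
 * structure is also the address of the curthread pointer, so no offset
 * computation is needed to fetch it.
 */
static inline struct demo_thread *
demo_curthread(const struct demo_pcpu *pc)
{
	return (*(struct demo_thread * const *)(const void *)pc);
}
```

If someone later moves the field, the build fails at the assertion instead of silently returning the wrong pointer.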
Approved by: re (jhb)
|
#
122849 |
|
17-Nov-2003 |
peter |
Initial landing of SMP support for FreeBSD/amd64.
- This is heavily derived from John Baldwin's apic/pci cleanup on i386.
- I have completely rewritten or drastically cleaned up some other parts. (in particular, bootstrap)
- This is still a WIP. It seems that there are some highly bogus bioses on nVidia nForce3-150 boards. I can't stress how broken these boards are. I have a workaround in mind, but right now the Asus SK8N is broken. The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed.
- Most of my testing has been with SCHED_ULE. SCHED_4BSD works.
- the apic and acpi components are 'standard'.
- If you have an nVidia nForce3-150 board, you are stuck with 'device atpic' in addition, because they somehow managed to forget to connect the 8254 timer to the apic, even though it's in the same silicon! ARGH! This directly violates the ACPI spec.
|
#
122763 |
|
15-Nov-2003 |
njl |
Add the pc_acpi_id PCPU member. The new acpi_cpu driver uses this to dereference the softc.
|
#
122364 |
|
09-Nov-2003 |
marcel |
Change the clear_ret argument of get_mcontext() to be a flags argument. Since all callers either passed 0 or 1 for clear_ret, define bit 0 in the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI code for possible (but unlikely) future use. The remaining bits are for use by MD code.
This change is triggered by a need on ia64 to have another knob for get_mcontext().
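The bit layout described above can be sketched as follows. The macro and function names are hypothetical and chosen for illustration; only the layout (bit 0 for clear_ret, bits 1 to 3 reserved for MI code, the rest for MD code) comes from the commit message.

```c
#include <stdbool.h>

#define DEMO_GET_MC_CLEAR_RET	0x0001u		/* bit 0: former clear_ret argument */
#define DEMO_GET_MC_MI_RESERVED	0x000eu		/* bits 1-3: reserved for MI code */
#define DEMO_GET_MC_MD_MASK	(~0x000fu)	/* remaining bits: MD code */

/* True when the caller asked for the return registers to be cleared. */
static inline bool
demo_wants_clear_ret(unsigned int flags)
{
	return ((flags & DEMO_GET_MC_CLEAR_RET) != 0);
}

/* Extract only the machine-dependent part of the flags word. */
static inline unsigned int
demo_md_flags(unsigned int flags)
{
	return (flags & DEMO_GET_MC_MD_MASK);
}
```

Existing callers that passed 0 or 1 keep their meaning unchanged, since bit 0 carries the old boolean.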
|
#
122295 |
|
08-Nov-2003 |
peter |
Switch from having a fpu "device" to something that is more like the integrated part of the cpu core that it is.
|
#
122292 |
|
08-Nov-2003 |
peter |
The great s/npx/fpu/gi
|
#
121228 |
|
18-Oct-2003 |
njl |
Add the cpu_idle_hook() function pointer so that other idlers can be hooked at runtime. Make C1 sleep (e.g., HLT) be the default. This prepares the way for further ACPI sleep states.
|
#
120367 |
|
22-Sep-2003 |
peter |
Fix patch transcription typo. s/IDT_BPT/IDT_BP/
|
#
120360 |
|
22-Sep-2003 |
peter |
Move basemem variable into global scope so that the MP startup code can refer to it for looking for tables.
|
#
120346 |
|
22-Sep-2003 |
peter |
MFi386 by jhb: use symbolic constants for the IDT entries.
|
#
120345 |
|
22-Sep-2003 |
peter |
MFi386: machdep.c:1.570 clock.c:1.204 by bde: Quick fix for calling DELAY for ddb input in some atkbd-based console drivers. ddb must not use any normal locks but DELAY() normally calls getit() which needs clock_lock. This also removes the need for recursion on clock_lock.
|
#
119924 |
|
09-Sep-2003 |
peter |
Clean up get/set_mcontext() and get/set_fpcontext(). These are operated on data structures on the kernel stack which are guaranteed to be 16 byte aligned by gcc, the amd64 ABI and __aligned(16).
Ensure the tss_rsp0 initial stack pointer is 16 byte aligned in case sizeof(pcb) becomes odd at some point. This is convenient for the interrupt handler case because the ring crossing pushes cause the required odd alignment before the call to the C code.
Have fast_syscall add an additional 8 bytes to ensure that the trapframe has the correct odd alignment for the call to C code. Note that there are no checks to make sure that the trapframe size is appropriate for this.
This makes get/setfpcontext work properly (finally). You get a GPF in kernel mode if any of this is botched without the alignment fixup code that is apparently needed on i386.
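A minimal sketch of the alignment arithmetic, assuming 8-byte stack slots and the five-quadword hardware frame (SS, RSP, RFLAGS, CS, RIP) pushed on a ring crossing in long mode. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round a stack pointer down to a 16-byte boundary. */
static uint64_t
align16_down(uint64_t sp)
{
	return (sp & ~(uint64_t)0xf);
}

/* Model n pushes of 8 bytes each on a downward-growing stack. */
static uint64_t
push_qwords(uint64_t sp, unsigned int n)
{
	return (sp - 8u * (uint64_t)n);
}

/*
 * "Odd" alignment in the sense of the message: the pointer is 8 bytes past
 * a 16-byte boundary, which is what the callee expects on entry after a
 * call instruction has pushed the return address.
 */
static int
is_odd_aligned(uint64_t sp)
{
	return ((sp & 0xf) == 8);
}
```

Starting from an aligned tss_rsp0, an odd number of quadword pushes leaves the pointer in the odd-aligned state; a path that pushes an even number needs an extra 8-byte adjustment, which is the role of the fixup in fast_syscall.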
|
#
118235 |
|
30-Jul-2003 |
peter |
Cosmetic: fix disorder of opt_kstack_pages.h include.
|
#
118031 |
|
25-Jul-2003 |
obrien |
Use __FBSDID().
Brought to you by: a boring talk at Ottawa Linux Symposium
|
#
117961 |
|
24-Jul-2003 |
davidxu |
Set fault address to si_addr.
Reviewed by: peter
|
#
117943 |
|
23-Jul-2003 |
peter |
Make the breakpoint instruction trap gate available to users. ptrace() needs this.
Submitted by: Mark Kettenis <kettenis@chello.nl>
|
#
117600 |
|
14-Jul-2003 |
davidxu |
Rename thread_siginfo to cpu_thread_siginfo.
Suggested by: jhb
|
#
116958 |
|
28-Jun-2003 |
davidxu |
Add a machine-dependent function, thread_siginfo. The SA signal code will use it to construct a siginfo structure and export the result to userland.
Reviewed by: julian
|
#
115432 |
|
31-May-2003 |
peter |
Add acpi to the build. Remove the hack from machdep.c that lies to the loader to shut it up.
|
#
115431 |
|
31-May-2003 |
peter |
Have hammer_time() return the proc0 stack location, and have locore switch to it before calling mi_startup(). The bootstack is WAY too small for running acpica during probe/attach. While here, pass modulep/physfree to the startup routine, rather than writing to the global variables in locore.S.
Approved by: re (amd64/*)
|
#
115251 |
|
23-May-2003 |
peter |
Major pmap rework to take advantage of the larger address space on amd64 systems. Of note:
- Implement a direct mapped region using 2MB pages. This eliminates the need for temporary mappings when getting ptes. This supports up to 512GB of physical memory for now. This should be enough for a while.
- Implement a 4-tier page table system. Most of the infrastructure is there for 128TB of userland virtual address space, but only 512GB is presently enabled due to a mystery bug somewhere. The design of this was heavily inspired by the alpha pmap.c.
- The kernel is moved into the negative address space(!).
- The kernel has 2GB of KVM available.
- Provide a uma memory allocator to use the direct map region to take advantage of the 2MB TLBs.
- Fixed some assumptions in the bus_space macros about the ability to fit virtual addresses in an 'int'.
Notable missing things:
- pmap_growkernel() should be able to grow to 512GB of KVM by expanding downwards below kernbase. The kernel must be at the top 2GB of the negative address space because of gcc code generation strategies.
- need to fix the >512GB user vm code.
Approved by: re (blanket)
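A minimal sketch of where the 2MB, 512GB and 128TB figures come from, assuming the standard amd64 layout of a 12-bit page offset and 9 index bits per page-table level. The macro and function names are hypothetical and are not the pmap's own.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* 4KB base pages */
#define SKETCH_LEVEL_BITS	9	/* 512 entries per table */

/* Index into the table at the given level (0 = PT, 1 = PD, 2 = PDP, 3 = PML4). */
static unsigned int
pt_index(uint64_t va, int level)
{
	return ((unsigned int)((va >> (SKETCH_PAGE_SHIFT +
	    SKETCH_LEVEL_BITS * level)) & 0x1ff));
}

/* Bytes of address space covered by a single entry at the given level. */
static uint64_t
entry_span(int level)
{
	return ((uint64_t)1 << (SKETCH_PAGE_SHIFT + SKETCH_LEVEL_BITS * level));
}
```

One level-1 entry spans 2MB (the large page size used for the direct map), one level-3 entry spans 512GB, and the 256 level-3 entries in the lower half of the address space give 128TB.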
|
#
115093 |
|
17-May-2003 |
peter |
Actually get all the bits for sd_hibase.. it was 16 bits short. oops.
Approved by: re (amd64/* blanket)
|
#
115006 |
|
14-May-2003 |
peter |
Collect the nastiness for preserving the kernel MSR_GSBASE around the load_gs() calls into a single place that is less likely to go wrong.
Eliminate the per-process context switching of MSR_GSBASE, because it should be constant for a single cpu. Instead, save/restore it during the loading of the new %gs selector for the new process.
Approved by: re (amd64/* blanket)
|
#
114987 |
|
14-May-2003 |
peter |
Add BASIC i386 binary support for the amd64 kernel. This is largely stolen from the ia64/ia32 code (indeed there was a repocopy), but I've redone the MD parts and added and fixed a few essential syscalls. It is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic) and p4. The ia64 code has not implemented signal delivery, so I had to do that.
Before you say it, yes, this does need to go in a common place. But we're in a freeze at the moment and I didn't want to risk breaking ia64. I will sort this out after the freeze so that the common code is in a common place.
On the AMD64 side, this required adding segment selector context switch support and some other support infrastructure. The %fs/%gs etc code is hairy because loading %gs will clobber the kernel's current MSR_GSBASE setting. The segment selectors are not used by the kernel, so they're only changed at context switch time or when changing modes. This still needs to be optimized.
Approved by: re (amd64/* blanket)
|
#
114983 |
|
13-May-2003 |
jhb |
- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe.
Reviewed by: arch@ Approved by: re (rwatson)
|
#
114953 |
|
12-May-2003 |
peter |
Really stop the loader from trying to load the acpi module by lying and pretending that it is already here.
Approved by: re (amd64/* stuff)
|
#
114952 |
|
12-May-2003 |
peter |
For the page fault handler, save %cr2 in the outer trap handler so that we do not have to run so long with interrupts disabled. This involved creating tf_addr in the trapframe. Reorganize the trap stubs so that they consistently reserve the stack space and initialize any missing bits.
Approved by: re (amd64 stuff)
|
#
114951 |
|
12-May-2003 |
peter |
Sync ucontext with reality. The struct trapframe changes need to be reflected here.
Approved by: re (blanket amd64/*)
|
#
114928 |
|
12-May-2003 |
peter |
Give a %fs and %gs to userland. Use swapgs to obtain the kernel %GS.base value on entry and exit. This isn't as easy as it sounds because when we recursively trap or interrupt, we have to avoid duplicating the swapgs instruction or we end up back with the userland %gs. I implemented this by testing TF_CS to see if we're coming from supervisor mode already, and check for returning to supervisor. To avoid a race with interrupts in the brief period after beginning executing the handler and before the swapgs, convert all trap gates to interrupt gates, and reenable interrupts immediately after the swapgs. I am not happy with this. There are other possible ways to do this that should be investigated. (eg: storing the GS.base MSR value in the trapframe)
Add some sysarch functions to let the userland code get to this.
Approved by: re (blanket amd64/*)
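A minimal model of the "only swapgs when crossing from or to user mode" rule, assuming the privilege level is read from the low two bits of the saved %cs selector. The struct, function names and selector values used with it are hypothetical; the real code lives in the assembly trap stubs.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SEL_RPL_MASK	0x3u	/* low two selector bits: privilege level */

/* Toy CPU state: which GS base is currently active. */
struct cpu_model {
	int gs_is_kernel;
};

/* Nonzero if the saved code selector belongs to user mode (RPL != 0). */
static int
from_user(uint16_t saved_cs)
{
	return ((saved_cs & SKETCH_SEL_RPL_MASK) != 0);
}

/* Trap entry: swap only when the trap interrupted user mode. */
static void
trap_enter(struct cpu_model *c, uint16_t saved_cs)
{
	if (from_user(saved_cs))
		c->gs_is_kernel = !c->gs_is_kernel;	/* models swapgs */
}

/* Trap exit: swap back only when returning to user mode. */
static void
trap_exit(struct cpu_model *c, uint16_t saved_cs)
{
	if (from_user(saved_cs))
		c->gs_is_kernel = !c->gs_is_kernel;	/* models swapgs */
}
```

A nested trap taken while already in the kernel sees a supervisor %cs and leaves the GS base alone, so the kernel value is never swapped away by accident.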
|
#
114921 |
|
11-May-2003 |
peter |
Make atdevbase long for the KERNBASE > 4GB case
Approved by: re (amd64/* blanket)
|
#
114919 |
|
11-May-2003 |
peter |
Fix printf format errors that were undetected due to using the standard FSF compiler during early development.
|
#
114867 |
|
09-May-2003 |
peter |
Finish translating i386/support.s into amd64 asm - replace bcopy etc with asm versions. This yields about a 5% kernel compile time speedup.
|
#
114837 |
|
08-May-2003 |
peter |
Oops. Turn T_PAGEFLT back into an interrupt gate. It is *critical* that interrupts be disabled and remain disabled until %cr2 is read. Otherwise we can preempt and another process can fault, and by the time we read %cr2, we see a different process's fault address. This Greatly Confuses vm_fault() (to say the least). The i386 port has got this marked as a bug workaround for a Cyrix CPU, which is what led me astray. It's actually necessary for preemption, regardless of whether Cyrix cpus had a bug or not.
|
#
114821 |
|
07-May-2003 |
peter |
Leave space for the 128 byte red-zone on the stack.
|
#
114820 |
|
07-May-2003 |
peter |
#include <machine/metadata.h> was missing; add it
|
#
114381 |
|
01-May-2003 |
peter |
I changed the numbering of the MODINFOMD_SMAP during the commit, so recognize the old number for my development boxes so I can use old loader/pxeboot for a while if I need to.
|
#
114349 |
|
30-Apr-2003 |
peter |
Commit MD parts of a loosely functional AMD64 port. This is based on a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to attempt to get a stable base to start from. There is a lot missing still. Worth noting: - The kernel runs at 1GB in order to cheat with the pmap code. pmap uses a variation of the PAE code in order to avoid having to worry about 4 levels of page tables yet. - It boots in 64 bit "long mode" with a tiny trampoline embedded in the i386 loader. This simplifies locore.s greatly. - There are still quite a few fragments of i386-specific code that have not been translated yet, and some that I cheated and wrote dumb C versions of (bcopy etc). - It has both int 0x80 for syscalls (but using registers for argument passing, as is native on the amd64 ABI), and the 'syscall' instruction for syscalls. int 0x80 preserves all registers, 'syscall' does not. - I have tried to minimize looking at the NetBSD code, except in a couple of places (eg: to find which register they use to replace the trashed %rcx register in the syscall instruction). As a result, there is not a lot of similarity. I did look at NetBSD a few times while debugging to get some ideas about what I might have done wrong in my first attempt.
|
#
113998 |
|
24-Apr-2003 |
deischen |
Add an argument to get_mcontext() which specified whether the syscall return values should be cleared. The system calls getcontext() and swapcontext() want to return 0 on success but these contexts can be switched to at a later time so the return values need to be cleared in the saved register sets. Other callers of get_mcontext() would normally want the context without clearing the return values.
Remove the i386-specific context saving from the KSE code. get_mcontext() is not i386-specific any more.
Fix a bad pointer in the alpha get_mcontext() code. The context was being bcopy()'d from &td->tf_frame, but tf_frame is itself a pointer, so the thread was being copied instead. Spotted by jake.
Glanced at by: jake Reviewed by: bde (months ago)
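A minimal sketch of the pointer mistake described for the alpha code, using memcpy() in place of bcopy() and hypothetical, simplified struct layouts. The member name tf_frame follows the wording of the message.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins; the real structures are much larger. */
struct trapframe {
	long tf_regs[4];
};

struct thread {
	struct trapframe *tf_frame;	/* a pointer, not an embedded frame */
};

/*
 * Correct form: copy the frame that the pointer refers to.
 * The buggy form passed &td->tf_frame as the source, which is the address
 * of the pointer member inside the thread, so bytes of the thread structure
 * were copied instead of the saved registers.
 */
static void
save_frame(const struct thread *td, struct trapframe *out)
{
	memcpy(out, td->tf_frame, sizeof(*out));
}
```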
|
#
113682 |
|
18-Apr-2003 |
jhb |
Hold the proc lock for curproc around sigonstack().
|
#
112993 |
|
02-Apr-2003 |
peter |
Commit a partial lazy thread switch mechanism for i386. It isn't as lazy as it could be and can do with some more cleanup. Currently it's under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process. ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be, the interrupt overhead is still high and too much blocks on Giant still. There are some debug sysctls, for stats and for an on/off switch.
The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap.
It's not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default.
This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more.
The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it.
One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually separate from the lazy switch stuff.
Glanced at by: jake, jhb
|
#
112888 |
|
31-Mar-2003 |
jeff |
- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c - signotify() now operates on a thread since unmasked pending signals are stored in the thread. - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
|
#
112883 |
|
31-Mar-2003 |
jeff |
- Change trapsignal() to accept a thread and not a proc. - Change all consumers to pass in a thread.
Right now this does not cause any functional changes but it will be important later when signals can be delivered to specific threads.
|
#
112841 |
|
30-Mar-2003 |
jake |
- Add support for PAE and more than 4 gigs of ram on x86, dependent on the kernel option 'options PAE'. This will only work with device drivers which either use busdma, or are able to handle 64 bit physical addresses.
Thanks to Lanny Baron from FreeBSD Systems for the loan of a test machine with 6 gigs of ram.
Sponsored by: DARPA, Network Associates Laboratories, FreeBSD Systems
|
#
112687 |
|
26-Mar-2003 |
ps |
Nuke options HTT in favor of the machdep.hlt_logical_cpus tunable/sysctl. This keeps the logical cpus halted in the idle loop. By default the logical cpus are halted at startup. It is also possible to halt any cpu in the idle loop now using machdep.hlt_cpus.
Examples of how to use this:
machdep.hlt_cpus=1 halt cpu0
machdep.hlt_cpus=2 halt cpu1
machdep.hlt_cpus=4 halt cpu2
machdep.hlt_cpus=3 halt cpu0,cpu1
Reviewed by: jhb, peter
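A minimal sketch of how the machdep.hlt_cpus examples read as a bitmask, assuming bit N of the value selects cpuN as the examples imply. The function name is hypothetical.

```c
#include <assert.h>

/* Nonzero if the given cpu id is selected for halting by the mask. */
static int
cpu_is_halted(unsigned int hlt_cpus, unsigned int cpuid)
{
	if (cpuid >= 32)	/* outside the 32 bits this sketch models */
		return (0);
	return ((hlt_cpus >> cpuid) & 1u);
}
```

With this reading, a value of 3 selects cpu0 and cpu1, and a value of 4 selects only cpu2, matching the examples in the message.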
|
#
112686 |
|
26-Mar-2003 |
peter |
Halt the cpus in the idle loop for SMP as well for several reasons: 1) It's critical for HTT. There's less foot-shooting opportunity. 2) I've seen significant improvements in interactive response to commands over ssh sessions. I assume this is less lock contention. 3) As incentive to finish the idle cpu IPI wakeup stuff. 4) The machine on my desk was blowing hot air in my general direction because somebody forgot to turn the hlt on, and it saves 50 watts per cpu..
The machdep.cpu_idle_hlt sysctl is still available, but now the default is the same as on UP kernels.
|
#
112569 |
|
24-Mar-2003 |
jake |
- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long.
Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms.
Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)
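A minimal sketch of why a separate physical address type and matching printf formats matter, assuming a PAE-style configuration in which physical addresses are 64 bits wide and virtual addresses are 32 bits. The typedef and function names are hypothetical; as the note above says, at this commit vm_paddr_t was still the same width as vm_offset_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical widths for a PAE-style system. */
typedef uint64_t sketch_paddr_t;	/* physical addresses can exceed 4G */
typedef uint32_t sketch_vaddr_t;	/* virtual addresses stay 32-bit */

/*
 * Format a physical address portably: cast to uintmax_t and use %jx so the
 * format stays correct whatever width the typedef has.
 */
static int
format_paddr(char *buf, size_t len, sketch_paddr_t pa)
{
	return (snprintf(buf, len, "0x%jx", (uintmax_t)pa));
}

/* Storing the address in the narrower virtual type silently drops high bits. */
static sketch_vaddr_t
truncate_to_vaddr(sketch_paddr_t pa)
{
	return ((sketch_vaddr_t)pa);
}
```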
|
#
111167 |
|
20-Feb-2003 |
peter |
Fix fumble in rev 1.525. pmap_kenter()'s second argument is a physical address, not a page index.
Laughed at by: jake
|
#
109994 |
|
28-Jan-2003 |
jake |
Remove BDE_DEBUGGER.
Discussed with: bde
|
#
109027 |
|
09-Jan-2003 |
jhb |
Remove earlysetcpuclass() as it has been OBE.
Suggested by: bde
|
#
107521 |
|
02-Dec-2002 |
deischen |
Align the FPU state in the ucontext and sigcontext to 16 bytes to accommodate the new SSE/XMM floating point save/restore instructions.
This commit is mostly from bde and includes some style nits.
Approved by: re (jhb)
|
#
106977 |
|
16-Nov-2002 |
deischen |
Add getcontext, setcontext, and swapcontext as system calls. Previously these were libc functions but were requested to be made into system calls for atomicity and to coalesce what might be two entrances into the kernel (signal mask setting and floating point trap) into one.
A few style nits and comments from bde are also included.
Tested on alpha by: gallatin
|
#
106707 |
|
09-Nov-2002 |
iwasaki |
Add a new loader tunable, hw.hasbrokenint12, to indicate that the BIOS has a broken INT 12H. If hw.hasbrokenint12="1" is set in the loader environment, the kernel never uses the BIOS INT 12H call to determine base memory size. Otherwise, the kernel uses INT 12H as before. This should fix the kernel panic caused by the 1.544 changes.
MFC after: 1 day
|
#
106697 |
|
09-Nov-2002 |
des |
Print real / avail memory in megabytes rather than kilobytes.
|
#
106605 |
|
07-Nov-2002 |
tmm |
Move the definitions of the hw.physmem, hw.usermem and hw.availpages sysctls to MI code; this reduces code duplication and makes all of them available on sparc64, and the latter two on powerpc. The semantics of hw.availpages on i386 and pc98 are slightly changed: previously, holes between ranges of available pages would be included, while they are excluded now. The new behaviour should be more correct and brings i386 in line with the other architectures.
Move physmem to vm/vm_init.c, where this variable is used in MI code.
|
#
106503 |
|
06-Nov-2002 |
jmallett |
Remove what was a temporary bogus assignment of bits of siginfo_t, as it does not look like the prerequisites to fill it in properly will be in the tree for the upcoming release, but it's mostly done, so there is no need for these to stay around to remind us.
|
#
105950 |
|
25-Oct-2002 |
peter |
Split 4.x and 5.x signal handling so that we can keep 4.x signal handling clean and functional as 5.x evolves. This allows some of the nasty bandaids in the 5.x codepaths to be unwound.
Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an anti-foot-shooting measure in place, 5.x folks need this for a while) and finish encapsulating the older stuff under COMPAT_43. Since the ancient stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *' to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn is supposed to take), add a compile time check to prevent foot shooting there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.
Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago). Approved by: re
|
#
105949 |
|
25-Oct-2002 |
iwasaki |
Change method to determine base memory size. Try INT 15H/E820H first, then fall back to the old compatibility method (INT 12H). This is a workaround for newer machines which have broken INT 12H BIOS service implementation.
Reviewed by: -current ML MFC after: 3 days
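A minimal sketch of the "try INT 15H/E820H first, then fall back to INT 12H" ordering. The probe functions here are stand-ins that return canned values; the names, the KB unit and the -1 failure convention are assumptions made for illustration, and no real BIOS call is made.

```c
#include <assert.h>

/* A probe returns base memory in KB, or -1 if the method is unavailable. */
typedef int (*basemem_probe_t)(void);

/* Stand-in probes with canned results, used only to exercise the logic. */
static int
probe_unavailable(void)
{
	return (-1);
}

static int
probe_reports_639(void)
{
	return (639);
}

static int
probe_reports_640(void)
{
	return (640);
}

/* Prefer the E820-style probe; use the INT 12H-style probe only on failure. */
static int
detect_basemem(basemem_probe_t e820, basemem_probe_t int12)
{
	int kb;

	kb = e820();
	if (kb >= 0)
		return (kb);
	return (int12());
}
```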
|
#
105554 |
|
20-Oct-2002 |
phk |
Change the definition of the debugging registers to be an array, so that we can index into it, rather than do pointer gymnastics on a structure containing 8 elements.
Verified by: MD5 hash on the produced .o files.
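A minimal sketch of the array form of the debug register set, assuming eight 32-bit registers. The struct and accessor names are hypothetical; the point is that an index replaces pointer arithmetic over separately named members.

```c
#include <assert.h>

#define SKETCH_NDBREGS	8

/* Array form: register N is simply dr[N]. */
struct sketch_dbreg {
	unsigned int dr[SKETCH_NDBREGS];
};

static void
set_dr(struct sketch_dbreg *d, int i, unsigned int val)
{
	assert(i >= 0 && i < SKETCH_NDBREGS);
	d->dr[i] = val;
}

static unsigned int
get_dr(const struct sketch_dbreg *d, int i)
{
	assert(i >= 0 && i < SKETCH_NDBREGS);
	return (d->dr[i]);
}
```

Because an array of eight integers and a structure of eight same-typed members would normally have the same size and layout, a change of this kind can leave the generated object code identical, which is consistent with the MD5 check mentioned above.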
|
#
104964 |
|
12-Oct-2002 |
jeff |
- Create a new scheduler api that is defined in sys/sched.h - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c
Reviewed by: -arch
|
#
104513 |
|
05-Oct-2002 |
deischen |
Fix building of minimal kernels without npx by rearranging ifdefs. Also fix some style bugs in surrounding code, and add a comment about FP state restoral that seems questionable.
Submitted by: bde
|
#
104460 |
|
04-Oct-2002 |
deischen |
Add another temporary hack to allow running older i386 binaries. This will be removed when new versions of syscalls sigreturn() and sigaction() are added (mini is working on this but is in the middle of a move).
This should fix the problem of cvsupd dying.
|
#
104174 |
|
30-Sep-2002 |
obrien |
Save the FP state in the PCB as that is compatible with releng4 binaries.
This is a band-aid until the KSE pthread committers get back on the ground and have their machines setup.
Submitted by: eischen
|
#
103772 |
|
21-Sep-2002 |
mdodd |
- Move the init of %gs and pcb_gs before user_ldt_free(). - Always call load_gs() - Trim comments.
This addresses some of the issues raised by BDE.
|
#
103703 |
|
20-Sep-2002 |
phk |
For reasons now lost in historical fog, the bounds_check_with_label() function was put in i386/i386/machdep.c from where it has been cut and pasted to other architectures with only minor corruption.
Disklabel is really a MI format in many ways, at least it certainly is when you operate on struct disklabel.
Put bounds_check_with_label() back in subr_disklabel.c where it belongs.
Sponsored by: DARPA & NAI Labs.
|
#
103645 |
|
19-Sep-2002 |
mdodd |
From Christian Zander:
This patch addresses a bug that can cause a GPF in the kernel - if a process makes use of i386_set_ldt to install a LDT entry, then loads a corresponding segment descriptor into %gs, forks, and if the child execs.
In this scenario, setregs executes user_ldt_free and then determines how to reset the %gs register:
/* reset %gs as well */
if (pcb == curpcb)
	load_gs(_udatasel);
else
	pcb->pcb_gs = _udatasel;
This is insufficient in the fork/exec case, since pcb will be equal to curpcb when the child execs; load_gs will reset %gs to _udatasel but it doesn't reset pcb->pcb_gs; upon return from the system call, cpu_switch_load_gs will thus attempt to restore %gs from pcb->pcb_gs and trigger a GPF since all LDT entries have already been cleared.
The fix is to always reset pcb->pcb_gs to _udatasel.
Submitted by: Christian Zander <zander@minion.de> Reviewed by: jake
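A minimal userland model of the fix described above: the saved selector in the pcb is always reset, and the live register is reloaded when the pcb is the current one. The global model_gs, the struct layout and the selector values used with it are hypothetical stand-ins for the real %gs register and kernel structures.

```c
#include <assert.h>

/* Toy pcb holding only the saved %gs selector. */
struct sketch_pcb {
	unsigned int pcb_gs;
};

/* Models the live %gs register of the current CPU. */
static unsigned int model_gs;

static void
model_load_gs(unsigned int sel)
{
	model_gs = sel;
}

/*
 * Fixed behaviour: the saved copy is reset unconditionally, so a later
 * restore from pcb_gs can no longer pick up a selector that pointed into
 * an LDT which has already been freed.
 */
static void
reset_gs(struct sketch_pcb *pcb, const struct sketch_pcb *curpcb,
    unsigned int udatasel)
{
	pcb->pcb_gs = udatasel;
	if (pcb == curpcb)
		model_load_gs(udatasel);
}
```

In the original code the two assignments were mutually exclusive, so in the fork/exec case the register was reset while the saved copy kept the stale LDT selector.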
|
#
103477 |
|
17-Sep-2002 |
sobomax |
Don't reference cpu_fxsr unless CPU_ENABLE_SSE is defined. This fixes kernel in !CPU_ENABLE_SSE case.
|
#
103407 |
|
16-Sep-2002 |
mini |
Add kernel support needed for the KSE-aware libpthread: - Maintain fpu state across signals. - Use ucontext_t's to store KSE thread state. - Synthesize state for the UTS upon each upcall, rather than saving and copying a trapframe. - Save and restore FPU state properly in ucontext_t's.
Reviewed by: deischen, julian Approved by: -arch
|
#
103367 |
|
15-Sep-2002 |
julian |
Allocate KSEs and KSEGRPs separately and remove them from the proc structure. The next step is to allow > 1 to be allocated per process. This would give multi-processor threads. (when the rest of the infrastructure is in place)
While doing this I noticed libkvm and sys/kern/kern_proc.c:fill_kinfo_proc are diverging more than they should.. corrective action needed soon.
|
#
103081 |
|
07-Sep-2002 |
jmallett |
Fill out two fields (si_pid, si_uid) in the siginfo structure handed back to userland in the signal handler that were not being filled out before, but should and can be.
This part of sendsig could be slightly refactored to use an MI interface, or ideally, *sendsig*() would have an API change to accept a siginfo_t, which would be filled out by an MI function in the level above sendsig, and said MI function would make a small call into MD code to fill out the MD parts (some of which may be bogus, such as the si_addr stuff in some places). This would eventually make it possible for parts of the kernel sending signals to set up a siginfo with meaningful information.
Reviewed by: mux MFC after: 2 weeks
|
#
103076 |
|
07-Sep-2002 |
jmallett |
Match the more modern ports and comment the filling of POSIX parts of siginfo with 'Fill in POSIX parts'. (Diff reduction.)
|
#
103064 |
|
07-Sep-2002 |
peter |
Automatically enable CPU_ENABLE_SSE (detect and enable SSE instructions) if compiling with I686_CPU as a target. CPU_DISABLE_SSE will prevent this from happening and will guarantee the code is not compiled in.
I am still not happy with this, but gcc is now generating code that uses these instructions if you set CPUTYPE to p3/p4 or athlon-4/mp/xp or higher.
|
#
102666 |
|
31-Aug-2002 |
peter |
Take a shot at fixing up a whole stack of style and other embarrassing unforced errors that Bruce identified. I have not yet addressed all of his concerns.
|
#
102603 |
|
30-Aug-2002 |
ache |
Unbreak kernel build by printing Maxmem using %ld instead of old (now changed) %u
|
#
102600 |
|
30-Aug-2002 |
peter |
Change hw.physmem and hw.usermem to unsigned long like they used to be in the original hardwired sysctl implementation.
The buf size calculator still overflows an integer on machines with large KVA (eg: ia64) where the number of pages does not fit into an int. Use 'long' there.
Change Maxmem and physmem and related variables to 'long', mostly for completeness. Machines are not likely to overflow 'int' pages in the near term, but then again, 640K ought to be enough for anybody. This comes for free on 32 bit machines, so why not?
|
#
102561 |
|
29-Aug-2002 |
jake |
Renamed poorly named setregs to exec_setregs. Moved its prototype to imgact.h with the other exec support functions.
|
#
100275 |
|
17-Jul-2002 |
peter |
Use pmap_kenter() rather than vtopte() and bashing the page tables directly.
|
#
100220 |
|
17-Jul-2002 |
dillon |
Qualify comment on machdep.cpu_idle_hlt. Turning this on on a SMP machine will result in approximately a 4.2% loss of performance (buildworld) and approximately a 5% reduction in power consumption (when idle). Add XXX note on how to really make hlt work (send an IPI to wakeup HLTed cpus on a thread-schedule event? Generate an interrupt somehow?).
|
#
99567 |
|
07-Jul-2002 |
peter |
s/procrunnable/kserunnable/ in a comment
|
#
99537 |
|
07-Jul-2002 |
mux |
One #include <sys/lock.h> is enough.
Submitted by: Olivier Houchard <cognet@ci0.org>
|
#
99072 |
|
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. To come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
98778 |
|
24-Jun-2002 |
peter |
Compile in the cpu halt code even on SMP, instead just default the sysctl (machdep.cpu_idle_hlt) to off in the SMP case. This allows you to turn it on if you wish and do not particularly care about the small window where a cpu will remain halted even when a job is placed on the run queue (until the next clock tick).
|
#
96517 |
|
13-May-2002 |
bde |
Fixed a syntax error (a label not followed by a statement).
|
#
94936 |
|
17-Apr-2002 |
mux |
Rework the kernel environment subsystem. We now convert the static environment needed at boot time to a dynamic subsystem when VM is up. The dynamic kernel environment is protected by an sx lock.
This adds some new functions to manipulate the kernel environment: freeenv(), setenv(), unsetenv() and testenv(). freeenv() has to be called after every getenv() when you have finished using the string. testenv() only tests if an environment variable is present, and doesn't require a freeenv() call. setenv() and unsetenv() are self explanatory.
The kenv(2) syscall exports these new functionalities to userland, mainly for kenv(1).
Reviewed by: peter
|
#
94383 |
|
10-Apr-2002 |
alc |
o In osigreturn(), restore all of the registers in one place. o Recent changes to osigreturn() and sigreturn() have made them MPSAFE. Add a comment to this effect.
Submitted by: bde (bullet #1) Reviewed by: jhb (bullet #2)
|
#
94275 |
|
09-Apr-2002 |
phk |
GC various bits and pieces of USERCONFIG from all over the place.
|
#
94197 |
|
08-Apr-2002 |
bde |
Removed ispc98 sysctl completely. Applications should understand that ispc98 isn't set if its sysctl doesn't exist. At least make(1) already understands this.
Approved by: nyan
|
#
94151 |
|
07-Apr-2002 |
phk |
GC the "dumplo" variable, which is no longer used.
A lot of sys/*/*/machdep.c seems not to be.
|
#
93944 |
|
06-Apr-2002 |
nyan |
Remove pc98 code.
|
#
93818 |
|
04-Apr-2002 |
jhb |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
|
#
93793 |
|
04-Apr-2002 |
bde |
Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask.
Avoid locking in userret() in most of the remaining cases.
Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon
|
#
93702 |
|
02-Apr-2002 |
jhb |
- Move the MI mutexes sched_lock and Giant from being declared in the various machdep.c's to being declared in kern_mutex.c. - Add a new function mutex_init() used to perform early initialization needed for mutexes such as setting up thread0's contested lock list and initializing MI mutexes. Change the various MD startup routines to call this function instead of duplicating all the code themselves.
Tested on: alpha, i386
|
#
93593 |
|
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
#
93461 |
|
30-Mar-2002 |
alc |
Implement i386's (o)sigreturn() like the alpha's: Use copyin() to read the osigcontext or ucontext_t rather than useracc() followed by direct user-space memory accesses. This reduces (o)sigreturn()'s execution time by 5-50%.
Submitted by: bde
|
#
93273 |
|
27-Mar-2002 |
jeff |
Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares.
Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone.
Approved by: jhb
|
#
93264 |
|
27-Mar-2002 |
dillon |
Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt disablement assumptions in kern_fork.c by adding another API call, cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h to <arch>/<arch>/critical.c (stage-2 will clean this up).
Implement interrupt deferral for i386 that allows interrupts to remain enabled inside critical sections. This also fixes an IPI interlock bug, and requires uses of icu_lock to be enclosed in a true interrupt disablement.
This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized, and will move cpu_critical*() into its own header file(s) + other things. This commit may break non-i386 architectures in trivial ways. This should be temporary.
Reviewed by: core Approved by: core
|
#
92770 |
|
20-Mar-2002 |
alfred |
Remove __P.
|
#
92548 |
|
18-Mar-2002 |
alc |
Eliminate grow_stack() from (o)sendsig(). If the stack needs to grow, copyout() will page fault and perform grow_stack() from trap_pfault(). These calls to grow_stack() accomplish nothing.
Reviewed by: bde
|
#
92470 |
|
17-Mar-2002 |
alc |
o Stop calling useracc() in (o)sendsig() now that we use copyout() to copy the sigframe to the user's stack. Useracc() takes a non-trivial amount of time. Eliminating it speeds up signal delivery by 15% or more.
o Update some comments.
Submitted by: bde
|
#
92018 |
|
10-Mar-2002 |
luigi |
Export a (machine dependent) kernel variable bootdev as machdep.guessed_bootdev, and add code to sysctl to parse its value and give a (not necessarily correct) name to the device we booted from (the main motivation for this code is to use the info in the PicoBSD boot scripts, and the impact on the kernel is minimal).
NOTE: the information available in bootdev is not always reliable, so you should not trust it too much. The parsing code is the same as in boot2.c, and cannot cover all cases -- as it is, it seems to work fine with floppies and IDE disks recognised by the BIOS. It _should_ work as well with SCSI disks recognised by the BIOS. Booting from a CDROM in floppy emulation will return /dev/fd0 (because this is what the BIOS tells us). Booting off the network (e.g. with etherboot) leaves bootdev unset so the value will be printed as "invalid (0xffffffff)".
Finally, this feature might go away at some point, hopefully when we have a more reliable way to get the same information.
MFC-after: 5 days
|
#
91893 |
|
08-Mar-2002 |
phk |
#include <machine/smp.h> in the SMP case. don't include <sys/smp.h> at all.
Fallout from: probably something jake did. Hint by: jhb
|
#
91328 |
|
26-Feb-2002 |
dillon |
revert last commit temporarily due to whining on the lists.
|
#
91315 |
|
26-Feb-2002 |
dillon |
STAGE-1 of 3 commit - allow (but do not require) interrupts to remain enabled in critical sections and streamline critical_enter() and critical_exit().
This commit allows an architecture to leave interrupts enabled inside critical sections if it so wishes. Architectures that do not wish to do this are not affected by this change.
This commit implements the feature for the I386 architecture and provides a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For now you can turn the sysctl on and off at any time in order to test the architectural changes or track down bugs.
This commit is just the first stage. Some areas of the code, specifically the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will be cleaned up in the STAGE-2 commit when the critical_*() functions are moved entirely into MD files.
The following changes have been made:
* critical_enter() and critical_exit() for I386 now simply increment and decrement curthread->td_critnest. They no longer disable hard interrupts. When critical_exit() decrements the counter to 0 it effectively calls a routine to deal with whatever interrupts were deferred during the time the code was operating in a critical section.
Other architectures are unaffected.
* fork_exit() has been conditionalized to remove MD assumptions for the new code. Old code will still use the old MD assumptions in regards to hard interrupt disablement. In STAGE-2 this will be turned into a subroutine call into MD code rather than hardcoded in MI code.
The new code places the burden of entering the critical section in the trampoline code where it belongs.
* I386: interrupts are now enabled while we are in a critical section. The interrupt vector code has been adjusted to deal with the fact. If it detects that we are in a critical section it currently defers the interrupt by adding the appropriate bit to an interrupt mask.
* In order to accomplish the deferral, icu_lock is required. This is i386-specific. Thus icu_lock can only be obtained by mainline i386 code while interrupts are hard disabled. This change has been made.
* Because interrupts may or may not be hard disabled during a context switch, cpu_switch() can no longer simply assume that PSL_I will be in a consistent state. Therefore, it now saves and restores eflags.
* FAST INTERRUPT PROVISION. Fast interrupts are currently deferred. The intention is to eventually allow them to operate either while we are in a critical section or, if we are able to restrict the use of sched_lock, while we are not holding the sched_lock.
* ICU and APIC vector assembly for I386 cleaned up. The ICU code has been cleaned up to match the APIC code in regards to format and macro availability. Additionally, the code has been adjusted to deal with deferred interrupts.
* Deferred interrupts use a per-cpu boolean int_pending, and masks ipending, spending, and fpending. Because these are per-cpu variables, it is not currently necessary to lock bus cycles when modifying them.
Note that the same mechanism will enable preemption to be incorporated as a true software interrupt without having to further hack up the critical nesting code.
* Note: the old critical_enter() code in kern/kern_switch.c is currently #ifdef to be compatible with both the old and new methodology. In STAGE-2 it will be moved entirely to MD code.
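Illustration (not part of the commit): a userland sketch of the deferral scheme listed above. The field names td_critnest, int_pending and ipending come from the message; the structure layouts, the `delivered` mask used to record that a handler ran, and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state: only the nesting counter is modelled. */
struct thread_sketch {
    int td_critnest;       /* critical section nesting depth */
};

/* Hypothetical per-CPU state. */
struct pcpu_sketch {
    bool     int_pending;  /* is any interrupt currently deferred? */
    unsigned ipending;     /* bitmask of deferred interrupt lines */
    unsigned delivered;    /* bitmask of lines whose handler has run */
};

/* Entering a critical section only bumps the counter. */
void critical_enter_sketch(struct thread_sketch *td)
{
    td->td_critnest++;
}

/* Interrupt vector: defer inside a critical section, otherwise run now. */
void interrupt_arrives_sketch(struct thread_sketch *td,
                              struct pcpu_sketch *pc, unsigned irq)
{
    assert(irq < 32);
    if (td->td_critnest > 0) {
        pc->ipending |= 1u << irq;
        pc->int_pending = true;
    } else {
        pc->delivered |= 1u << irq;
    }
}

/* Leaving the outermost critical section replays whatever was deferred. */
void critical_exit_sketch(struct thread_sketch *td, struct pcpu_sketch *pc)
{
    assert(td->td_critnest > 0);
    if (--td->td_critnest == 0 && pc->int_pending) {
        pc->delivered |= pc->ipending;
        pc->ipending = 0;
        pc->int_pending = false;
    }
}
```

The point of the sketch is that the common path touches only a counter; the more expensive work happens only when an interrupt actually arrived while the section was held.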
Performance issues:
One of the purposes of this commit is to enhance critical section performance, specifically to greatly reduce bus overhead to allow the critical section code to be used to protect per-cpu caches. These caches, such as Jeff's slab allocator work, can potentially operate very quickly making the effective savings of the new critical section code's performance very significant.
The second purpose of this commit is to allow architectures to enable certain interrupts while in a critical section. Specifically, the intention is to eventually allow certain FAST interrupts to operate rather than defer.
The third purpose of this commit is to begin to clean up the critical_enter()/critical_exit()/cpu_critical_enter()/ cpu_critical_exit() API which currently has serious cross pollution in MI code (in fork_exit() and ast() for example).
The fourth purpose of this commit is to provide a framework that allows kernel-preempting software interrupts to be implemented cleanly. This is currently used for two forward interrupts in I386. Other architectures will have the choice of using this infrastructure or building the functionality directly into critical_enter()/ critical_exit().
Finally, this commit is designed to greatly improve the flexibility of various architectures to manage critical section handling, software interrupts, preemption, and other highly integrated architecture-specific details.
|
#
91262 |
|
25-Feb-2002 |
peter |
Fix a warning. useracc() should take a const pointer argument.
|
#
90776 |
|
17-Feb-2002 |
deischen |
Use struct __ucontext in prototypes and associated functions instead of ucontext_t. Forward declare struct __ucontext in <sys/signal.h> and remove reliance on <sys/ucontext.h> being included.
While I'm here, also hide osigcontext types from userland; suggested by bde.
Namespace pollution noticed by: Kevin Day <toasty@shell.dragondata.com>
|
#
90718 |
|
16-Feb-2002 |
bde |
Don't leave garbage in parts of fpregs in the fxsr case. All callers (procfs and ptrace) supply kernel stack garbage, so kernel context was leaked to userland.
Reviewed by: des
|
#
90633 |
|
13-Feb-2002 |
bde |
Don't confuse a struct with its first member. This fixes: ./@/i386/i386/machdep.c: In function `init386': ./@/i386/i386/machdep.c:1700: warning: assignment from incompatible pointer type
|
#
90372 |
|
07-Feb-2002 |
peter |
Attempt to patch up some style bugs introduced in the previous commit
|
#
90361 |
|
07-Feb-2002 |
julian |
Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there but remove all the code that assumes that in preparation for the next commit which will actually move it out.
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
|
#
90132 |
|
03-Feb-2002 |
bde |
Use osigreturn(2) instead of sigreturn(2) plus broken magic for returning from old signal handlers. This is simpler and faster, and fixes (new) sigreturn(2) when %eip in the new signal context happens to match the magic value (0x1d516). 0x1d516 is below the default ELF text section, so this probably never broke anything in practice.
locore.s: In addition, don't build the signal trampoline for old signal handlers when it is not used.
alpha: Not fixed, but seems to be even less broken in practice due to more advanced magic. A false match occurs for register #32 in mc_regs[]. Since there is no hardware register #32, a false match is only possible for direct calls to sigreturn(2) that happen to have the magic number in the spare mc_regs[32] field.
|
#
90128 |
|
03-Feb-2002 |
bde |
Improve the change in the previous commit: use a stub for osigreturn() when it is not really used instead of unconditionalizing all of it.
|
#
90065 |
|
01-Feb-2002 |
bde |
Compile osigreturn() unconditionally since it will always be needed on some arches and the syscall table is machine-independent. It was (bogusly) conditional on COMPAT_43, so this usually makes no difference.
ia64: in addition:
- replace the bogus cloned comment before osigreturn() by a correct one. osigreturn() is just a stub for ia64's.
- fix the formatting of cloned comment before sigreturn().
- fix the return code. use nosys() instead of returning ENOSYS to get the same semantics as if the syscall is not in the syscall table. Generating SIGSYS is actually correct here.
- fix style bugs.
powerpc: copy the cleaned up ia64 stub. This mainly fixes a bogus comment.
sparc64: copy the cleaned up ia64 stub, since there was no stub before.
|
#
89988 |
|
30-Jan-2002 |
bde |
Cleaned up the 0ldSiG magic check before removing it. Just use fuword() to fetch the magic word instead of useracc() plus a direct access. This is more efficient as well as simpler and less incorrect:
- it was inefficient because useracc() takes much longer than just accessing the data using a correct access method, at least on i386's.
- it was incorrect because direct access is incorrect unless the address has been mapped. This and nearby direct accesses are mostly handled better for other arches because they have to be (direct accesses don't work).
- using magic in sigreturn is still fundamentally broken because false matches are possible. On i386's, a false match occurs when %eip in a new signal context happens to equal the magic value. This is not handled better for other arches.
|
#
89195 |
|
10-Jan-2002 |
bde |
Clear the single-step flag for signal handlers. This fixes bogus trace traps on the first instruction of signal handlers.
In trap.c:syscall(), fake a trace trap if the single-step flag was set on entry to the kernel, not if it will be set on exit from the kernel. This fixes bogus trace traps after the last instruction of signal handlers.
gdb-4.18 (the version in FreeBSD) still has problems with the program in the PR. These seem to be due to bugs in gdb and not in FreeBSD, and are fixed in gdb-5.1 (the distribution version).
PR: 33262 Tested by: k Macy <kip_macy@yahoo.com> MFC after: 1 day
|
#
89175 |
|
10-Jan-2002 |
deischen |
Use a spare slot in the machine context for a flags word to indicate whether the machine context is valid and whether the FPU state is valid (saved).
Mark the machine context valid before copying it out when sending a signal.
Approved by: -arch
|
#
88322 |
|
20-Dec-2001 |
jhb |
Introduce a standard name for the lock protecting an interrupt controller and its associated state variables: icu_lock with the name "icu". This renames the imen_mtx for x86 SMP, but also uses the lock to protect access to the 8259 PIC on x86 UP. This also adds an appropriate lock to the various Alpha chipsets which fixes problems with Alpha SMP machines dropping interrupts with an SMP kernel.
|
#
87702 |
|
11-Dec-2001 |
jhb |
Overhaul the per-CPU support a bit:
- The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD fields.
- The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list.
Tested on: alpha, i386 Reviewed by: peter, jake
|
#
87546 |
|
08-Dec-2001 |
dillon |
Allow maxusers to be specified as 0 in the kernel config, which will cause the system to auto-size to between 32 and 512 depending on the amount of memory.
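Illustration (not part of the commit): a sketch of how such auto-sizing could work. The message only states the 32 to 512 range and that the result depends on memory; the scaling rule of one unit per megabyte of RAM, the 4 KB page size, and the function name are assumptions.

```c
#include <assert.h>

/* Assumed page size for the example. */
#define SKETCH_PAGE_SIZE 4096ul

/*
 * Return the configured value unchanged unless it is 0, in which case
 * derive a value from the number of physical pages and clamp it to the
 * range named in the log message.
 */
int autosize_maxusers_sketch(int configured, unsigned long physpages)
{
    assert(configured >= 0);
    if (configured != 0)
        return configured;

    /* Assumed rule: one unit per megabyte of physical memory. */
    unsigned long pages_per_mb = (1024ul * 1024ul) / SKETCH_PAGE_SIZE;
    unsigned long n = physpages / pages_per_mb;

    if (n < 32)
        n = 32;
    if (n > 512)
        n = 512;
    return (int)n;
}
```

Under these assumptions a 16 MB machine gets the floor of 32, a 128 MB machine gets 128, and anything from 512 MB upward gets the ceiling of 512.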
MFC after: 1 week
|
#
86485 |
|
16-Nov-2001 |
peter |
Start bringing i386/pmap.c into line with cleanups that were done to alpha pmap. In particular:
- pd_entry_t and pt_entry_t are now u_int32_t instead of a pointer. This is to enable cleaner PAE and x86-64 support down the track so that we can change the pd_entry_t/pt_entry_t types to 64 bit entities.
- Terminate "unsigned *ptep, pte" with extreme prejudice and use the correct pt_entry_t/pd_entry_t types.
- Various other cosmetic changes to match cleanups elsewhere.
- This eliminates a boatload of casts.
- use VM_MAXUSER_ADDRESS in place of UPT_MIN_ADDRESS in a couple of places where we're testing user address space limits. Assuming the page tables start directly after the end of user space is not a safe assumption.
There is still more to go.
|
#
85695 |
|
29-Oct-2001 |
bde |
Don't set CR0_NE in cpu_setregs() for the SMP case, since setting it is npx.c's job and setting it here breaks the edit-time option of not setting it in npx.c. (It is not set in the right places for the SMP case, but always setting it here is harmless because there isn't even an edit-time option to not set it.)
|
#
85449 |
|
24-Oct-2001 |
jhb |
Split the per-process Local Descriptor Table out of the PCB and into struct mdproc.
Submitted by: Andrew R. Reiter <arr@watson.org> Silence on: -current
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
83163 |
|
06-Sep-2001 |
jhb |
Call sendsig() with the proc lock held and return with it held.
|
#
82939 |
|
04-Sep-2001 |
peter |
Zap #if 0'ed map init code that got moved to the MI area. Convert the powerpc tree to use the common code.
|
#
82393 |
|
27-Aug-2001 |
peter |
Enable hardwiring of things like tunables from embedded environments that do not start from loader(8).
|
#
82309 |
|
25-Aug-2001 |
peter |
Optionize UPAGES for the i386. As part of this I split some of the low level implementation stuff out of machine/globaldata.h to avoid exposing UPAGES to lots more places. The end result is that we can double the kernel stack size with 'options UPAGES=4' etc.
This is mainly being done for the benefit of a MFC to RELENG_4 at some point. -current doesn't really need this so much since each interrupt runs on its own kstack.
|
#
82165 |
|
22-Aug-2001 |
peter |
Fix a comment error that was fixed in the pc98 version. hw.maxmem is really hw.physmem.
|
#
82157 |
|
22-Aug-2001 |
peter |
Don't add UPAGES to the %cs segment limit. There is nothing there except page tables.
|
#
82127 |
|
22-Aug-2001 |
dillon |
Move most of the kernel submap initialization code, including the timeout callwheel and buffer cache, out of the platform specific areas and into the machine independent area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal.
Reviewed by: freebsd-smp, jake
|
#
82031 |
|
21-Aug-2001 |
dillon |
Fix bug in physmem_est calculation - the kernel_map size was not being converted into pages.
Fix bug in maxbcache calculation, nbuf must be tested against maxbcache rather than physmem_est.
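Illustration (not part of the commit): a sketch of the unit mismatch behind the physmem_est fix. The message only says the kernel_map size was not converted into pages; the idea that the estimate is the smaller of physical memory and kernel_map size, the 4 KB page size, and the function name are assumptions.

```c
/* Assumed page size for the example. */
#define SKETCH_PAGE_SIZE 4096ul

/*
 * Estimate usable memory in pages. physmem_pages is already in pages,
 * while kernel_map_bytes is in bytes, so it must be converted before the
 * two values are compared.
 */
unsigned long physmem_est_pages_sketch(unsigned long physmem_pages,
                                       unsigned long kernel_map_bytes)
{
    unsigned long kmap_pages = kernel_map_bytes / SKETCH_PAGE_SIZE;

    return physmem_pages < kmap_pages ? physmem_pages : kmap_pages;
}
```

Without the division, a byte count is compared against a page count, so the kernel_map term is about 4096 times too large and effectively never limits the estimate.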
Obtained from: bde
|
#
82025 |
|
21-Aug-2001 |
peter |
Make COMPAT_43 optional again. XXX we need COMPAT_FBSD3 etc for this stuff.
|
#
81933 |
|
19-Aug-2001 |
dillon |
Limit the amount of KVM reserved for the buffer cache and for swap-meta information. The default limits only affect machines with > 1GB of ram and can be overridden with two new kernel conf variables VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX, or with loader variables kern.maxswzone and kern.maxbcache. This has the effect of leaving more KVM available for sizing NMBCLUSTERS and 'maxusers' and should avoid tripups where a sysad adds memory to a machine and then sees the kernel panic on boot due to running out of KVM.
Also change the default swap-meta auto-sizing calculation to allocate half of what it was previously allocating. The prior defaults were way too high. Note that we cannot afford to run out of swap-meta structures so we still stay somewhat conservative here.
|
#
81704 |
|
15-Aug-2001 |
jhb |
Whitespace fixes to make this mostly fit in 80 columns.
|
#
81584 |
|
13-Aug-2001 |
bde |
Use interrupt gates instead of trap gates for breakpoint and trace traps, so that ddb can keep control (almost) no matter how it is entered. This breaks time-critical interrupts while the system is stopped in ddb, but I haven't noticed any significant problems except that applications become confused about the time. Lost time will be adjusted for later. Anyway, the half-baked disabling of interrupts in Debugger() gives the same problems for the usual way of entering ddb.
|
#
81544 |
|
12-Aug-2001 |
iwasaki |
Fix some trivial bugs.
- fix segment limit mis-calculation for GCODE_SEL, GDATA_SEL, GPRIV_SEL, LUCODE_SEL and LUDATA_SEL.
- move `loader(8) metadata' related printf() after cninit().
- use atop macro (address to pages) for segment limit calculation instead of i386_btop macro (bytes to pages).
- fix style bugs for the declarations of ints.
Reviewed by: bde, msmith (and arch & audit ML)
|
#
81265 |
|
08-Aug-2001 |
peter |
Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they are a really nasty interface that should have been killed long ago when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they operate on (struct user) will not be around much longer since it is part-per-process and part-per-thread in a post-KSE world.
gdb does not actually use this except for the obscure 'info udot' command which does a hexdump of as much of the child's 'struct user' as it can get. It carries its own #defines so it doesn't break compiles.
|
#
80421 |
|
26-Jul-2001 |
peter |
Call the early tunable setup functions as soon as kern_envp is available. Some things depend on hz being set not long after this.
|
#
79893 |
|
19-Jul-2001 |
bsd |
swtch.s: During context save, use the correct bit mask for clearing the non-reserved bits of dr7.
During context restore, load dr7 in such a way as to not disturb reserved bits.
machdep.c: Don't explicitly disallow the setting of the reserved bits in dr7 since we now keep from setting them when we load dr7 from the PCB.
This allows one to write back the dr7 value obtained from the system without triggering an EINVAL (one of the reserved bits always seems to be set after taking a trace trap).
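Illustration (not part of the commit): a C sketch of the save and restore masking described for swtch.s. The mask value 0x0000fc00 (bits 10 to 15 treated as reserved) and the function names are assumptions made for the example; the message itself does not give the mask.

```c
#include <stdint.h>

/* Assumed mask of dr7 bits that software must leave as the CPU set them. */
#define DR7_RESERVED_SKETCH 0x0000fc00u

/* Context save: clear every non-reserved bit, disabling all watchpoints. */
uint32_t dr7_on_save_sketch(uint32_t hw_dr7)
{
    return hw_dr7 & DR7_RESERVED_SKETCH;
}

/*
 * Context restore: take the reserved bits from the live register and all
 * other bits from the value stored in the PCB, so that reserved bits set
 * in the PCB copy can never reach the hardware.
 */
uint32_t dr7_on_restore_sketch(uint32_t hw_dr7, uint32_t pcb_dr7)
{
    return (hw_dr7 & DR7_RESERVED_SKETCH) |
           (pcb_dr7 & ~DR7_RESERVED_SKETCH);
}
```

Because the load path filters the reserved bits, the setter no longer has to reject values that contain them, which is why a dr7 value read back from the system can be written again without EINVAL.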
MFC after: 7 days
|
#
79609 |
|
12-Jul-2001 |
peter |
Activate SSE/SIMD. This is the extra context switching support that we are required to do if we let user processes use the extra 128 bit registers etc.
This is the base part of the diff I got from: http://www.issei.org/issei/FreeBSD/sse.html I believe this is by: Mr. SUZUKI Issei <issei@issei.org> SMP support apparently by: Takekazu KATO <kato@chino.it.okayama-u.ac.jp> Test code by: NAKAMURA Kazushi <kaz@kobe1995.net>, see http://kobe1995.net/~kaz/FreeBSD/SSE.en.html
I have fixed a couple of style(9) deviations. I have some followup commits to fix a couple of non-style things.
|
#
79573 |
|
11-Jul-2001 |
bsd |
Add 'hwatch' and 'dhwatch' ddb commands analogous to 'watch' and 'dwatch'. The new commands install hardware watchpoints if supported by the architecture and if there are enough registers to cover the desired memory area.
No objection by: audit@, hackers@
MFC after: 2 weeks
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
79153 |
|
03-Jul-2001 |
tmm |
Make the code to read the kernel message buffer via sysctl machine-independent and rename the corresponding sysctls from machdep.msgbuf and machdep.msgbuf_clear (i386 only) to kern.msgbuf and kern.msgbuf_clear.
|
#
78981 |
|
29-Jun-2001 |
imp |
Remove cruft from old bus.
|
#
78962 |
|
29-Jun-2001 |
jhb |
Add a new MI pointer to the process' trapframe p_frame instead of using various differently named pointers buried under p_md.
Reviewed by: jake (in principle)
|
#
78631 |
|
22-Jun-2001 |
peter |
Make the hw.physmem and hw.usermem variables unsigned so that they don't come up as negative on machines with >2GB ram.
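Illustration (not part of the commit): why a byte count above 2 GB shows up negative when read as a signed 32-bit integer. The function names are hypothetical, and the 32-bit width is an assumption about how the values were exported at the time.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of a byte count as a signed value. */
int32_t physmem_as_signed_sketch(uint32_t bytes)
{
    int32_t s;

    /* memcpy keeps the bit pattern and avoids an out-of-range cast. */
    memcpy(&s, &bytes, sizeof(s));
    return s;
}

/* The unsigned view keeps the full 0 to 4 GB range. */
uint32_t physmem_as_unsigned_sketch(uint32_t bytes)
{
    return bytes;
}
```

For 3 GB (3221225472 bytes) the signed view yields -1073741824, while the unsigned view reports the value unchanged.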
|
#
78427 |
|
18-Jun-2001 |
jhb |
Initialize mutexes needed early on all in the same place so that the startup routine more closely matches that of alpha and ia64. At some point the common mutexes shared across all platforms probably should move into sys/kern_mutex.c.
|
#
78135 |
|
12-Jun-2001 |
peter |
Hints overhaul:
- Replace some very poorly thought out API hacks that should have been fixed a long while ago.
- Provide some much more flexible search functions (resource_find_*())
- Use strings for storage instead of an outgrowth of the rather inconvenient temporary ioconf table from config(). We already had a fallback to using strings before malloc/vm was running anyway.
|
#
77582 |
|
01-Jun-2001 |
tmm |
Clean up the code exporting interrupt statistics via sysctl a bit:
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs
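Illustration (not part of the commit): deriving an array length from a start and an end symbol instead of a compile-time count. The array contents, the `_sketch` names, and the element type are made up for the example; in the kernel these symbols are defined elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter table and its one-past-the-end marker. */
unsigned long intrcnt_sketch[] = { 120, 0, 7, 4519 };
unsigned long *eintrcnt_sketch =
    intrcnt_sketch + sizeof(intrcnt_sketch) / sizeof(intrcnt_sketch[0]);

/* Number of counters between the two markers. */
size_t intrcnt_entries_sketch(const unsigned long *start,
                              const unsigned long *end)
{
    assert(end >= start);
    return (size_t)(end - start);
}

/* Size in bytes of the same range, as an export routine would need. */
size_t intrcnt_bytes_sketch(const unsigned long *start,
                            const unsigned long *end)
{
    return intrcnt_entries_sketch(start, end) * sizeof(*start);
}
```

The length then follows the table automatically, so a separate count constant cannot drift out of step with it.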
Requested by: bde Reviewed by: bde (some time ago)
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76770 |
|
17-May-2001 |
jhb |
- Move the setting of bootverbose to a MI SI_SUB_TUNABLES SYSINIT.
- Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be toggled after boot.
- Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just after the display of the copyright message instead of doing it by hand in three MD places.
|
#
76650 |
|
15-May-2001 |
jhb |
Remove unneeded includes of sys/ipl.h and machine/ipl.h.
|
#
76525 |
|
12-May-2001 |
deischen |
Revert part of last commit. Instead of using %fs for KSD/TSD, we'll follow Linux' convention and use %gs. This adds back the setting of %fs to a sane value in sendsig(). The value of %gs remains preserved to whatever it was in user context.
|
#
76440 |
|
10-May-2001 |
jhb |
- Split out the support for per-CPU data from the SMP code. UP kernels have per-CPU data and gdb on the i386 at least needs access to it.
- Clean up includes in kern_idle.c and subr_smp.c.
Reviewed by: jake
|
#
76298 |
|
06-May-2001 |
deischen |
When setting up the frame to invoke a signal handler, preserve the %fs and %gs registers instead of setting them to known sane values. %fs is going to be used for thread/KSE specific data by the new threads library; we'll want it to be valid inside of signal handlers.
According to bde, Linux preserves the state of %fs and %gs when setting up signal handlers, so there is precedent for doing this.
The same changes should be made in the Linux emulator, but when made, they seem to break (at least one version of) the IBM JDK for Linux (reported by drew).
Approved by: bde
|
#
76078 |
|
27-Apr-2001 |
jhb |
Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP.
- It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the *_process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the *_process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delivered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the *_process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwarded_hardclock/statclock which then call the *_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in.
Reviewed by: jake, peter Looked over by: eivind
|
#
74912 |
|
28-Mar-2001 |
jhb |
Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatibility, sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
  - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example.
  - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we no longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly more accurate file and line numbers.
|
#
74670 |
|
23-Mar-2001 |
tmm |
Export intrnames and intrcnt as sysctls (hw.nintr, hw.intrnames and hw.intrcnt).
Approved by: rwatson
|
#
73929 |
|
07-Mar-2001 |
jhb |
Grab the process lock while calling psignal and before calling psignal.
|
#
73001 |
|
25-Feb-2001 |
jake |
- Rename the lcall system call handler from Xsyscall to Xlcall_syscall to be more like Xint0x80_syscall and less like c function syscall().
- Reduce code duplication between the int0x80 and lcall handlers by shuffling the eflags into the right place, saving the sizeof the instruction in tf_err and jumping into the common int0x80 code.
Reviewed by: peter
|
#
72930 |
|
22-Feb-2001 |
peter |
Activate USER_LDT by default. The new thread libraries are going to depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago.
Reviewed by: jhb, imp
|
#
72746 |
|
20-Feb-2001 |
jhb |
- Don't call clear_resched() in userret(), instead, clear the resched flag in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.
|
#
72226 |
|
09-Feb-2001 |
jhb |
Move the initialization of the proc lock for proc0 very early into the MD startup code.
|
#
72200 |
|
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarly, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
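The interface split described above can be sketched in userland C. This is a toy model and not the kernel implementation: struct toy_mtx, the toy_* functions and the TOY_MTX_* constants are hypothetical stand-ins, and only the shape of the old mtx_enter(lock, type) call and of the new per-type calls is taken from the message.

```c
#include <assert.h>

/* Hypothetical userland stand-ins; the real types live in sys/mutex.h. */
enum toy_mtx_type { TOY_MTX_DEF, TOY_MTX_SPIN };

struct toy_mtx {
    enum toy_mtx_type type; /* fixed when the lock is initialized */
    int held;               /* 1 while the lock is owned */
};

void
toy_mtx_init(struct toy_mtx *m, enum toy_mtx_type type)
{
    m->type = type;
    m->held = 0;
}

/* Old shape: one entry point, the caller repeats the type on every call. */
void
toy_mtx_enter(struct toy_mtx *m, enum toy_mtx_type type)
{
    assert(m->type == type);
    assert(!m->held);
    m->held = 1;
}

/* New shape: the function name carries the type, so the argument goes away. */
void
toy_mtx_lock(struct toy_mtx *m)
{
    assert(m->type == TOY_MTX_DEF);
    assert(!m->held);
    m->held = 1;
}

void
toy_mtx_unlock(struct toy_mtx *m)
{
    assert(m->type == TOY_MTX_DEF);
    assert(m->held);
    m->held = 0;
}

void
toy_mtx_lock_spin(struct toy_mtx *m)
{
    assert(m->type == TOY_MTX_SPIN);
    assert(!m->held);
    m->held = 1;
}

void
toy_mtx_unlock_spin(struct toy_mtx *m)
{
    assert(m->type == TOY_MTX_SPIN);
    assert(m->held);
    m->held = 0;
}
```

In this model a caller that mixes up the two kinds of lock trips an assertion at the call site instead of passing a wrong `type' argument silently.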
|
#
71983 |
|
04-Feb-2001 |
dillon |
This commit represents work mainly submitted by Tor and slightly modified by myself. It solves a serious vm_map corruption problem that can occur with the buffer cache when block sizes > 64K are used. This code has been heavily tested in -stable but only tested somewhat on -current. An MFC will occur in a few days. My additions include the vm_map_simplify_entry() and minor buffer cache boundary case fix.
Make the buffer cache use a system map for buffer cache KVM rather than a normal map.
Ensure that VM objects are not allocated for system maps. There were cases where a buffer map could wind up with a backing VM object -- normally harmless, but this could also result in the buffer cache blocking in places where it assumes no blocking will occur, possibly resulting in corrupted maps.
Fix a minor boundary case when the buffer cache size limit is reached that could result in non-optimal code.
Add vm_map_simplify_entry() calls to prevent 'creeping proliferation' of vm_map_entry's in the buffer cache's vm_map. Previously only a simple linear optimization was made. (The buffer vm_map typically has only a handful of vm_map_entry's. This stabilizes it at that level permanently).
PR: 20609 Submitted by: (Tor Egge) tegge
|
#
71785 |
|
29-Jan-2001 |
peter |
Send "#if NISA > 0" to the bit-bucket and replace it with an option. These were compile-time "is the isa code present?" tests and not 'how many isa busses' tests.
|
#
71647 |
|
25-Jan-2001 |
jhb |
Whitespace fix: convert code indented 6 spaces to use tabs instead.
|
#
71576 |
|
24-Jan-2001 |
jasone |
Convert all simplelocks to mutexes and remove the simplelock implementations.
|
#
71524 |
|
24-Jan-2001 |
jhb |
- Proc locking.
- Set up proc0.p_heldmtx, proc0.contested, and curproc earlier so that we can use mutexes.
- Initialize sched_lock and Giant earlier and enter Giant during init386.
- Use suser(9) instead of checking cr_uid directly.
|
#
71320 |
|
21-Jan-2001 |
jasone |
Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex initialization until after malloc() is safe to call, then iterate through all mutexes and complete their initialization.
This change is necessary in order to avoid some circular bootstrapping dependencies.
|
#
71261 |
|
19-Jan-2001 |
peter |
Zap unused #include "apm.h"
|
#
71257 |
|
19-Jan-2001 |
peter |
Use #ifdef DEV_NPX from opt_npx.h instead of #if NNPX > 0 from npx.h
|
#
71228 |
|
18-Jan-2001 |
bmilekic |
Implement MTX_RECURSE flag for mtx_init(). All calls to mtx_init() for mutexes that recurse must now include the MTX_RECURSE bit in the flag argument variable. This change is in preparation for an upcoming (further) mutex API cleanup. The witness code will call panic() if a lock is found to recurse but the MTX_RECURSE bit was not set during the lock's initialization.
The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to MTX_RECURSED, which is more appropriate given its meaning.
The following locks have been made "recursive," thus far: eventhandler, Giant, callout, sched_lock, possibly some others declared in the architecture-specific code, all of the network card driver locks in pci/, as well as some other locks in dev/ stuff that I've found to be recursive.
Reviewed by: jhb
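A single-threaded sketch of the distinction drawn above between the init-time MTX_RECURSE option and the run-time MTX_RECURSED state. Everything here is a hypothetical model: the rec_* names, the REC_* bit values and the -1 return code (standing in for the panic() issued by witness) are assumptions, not the kernel's definitions.

```c
#include <assert.h>

#define REC_MTX_RECURSE  0x1 /* init-time option: this lock may recurse */
#define REC_MTX_RECURSED 0x2 /* run-time state: this lock is recursed now */

struct rec_mtx {
    int opts;  /* options given at init time */
    int state; /* run-time state bits */
    int owned; /* 1 while held by the (single) caller */
    int depth; /* number of recursive acquisitions */
};

void
rec_mtx_init(struct rec_mtx *m, int opts)
{
    m->opts = opts;
    m->state = 0;
    m->owned = 0;
    m->depth = 0;
}

/* Returns 0 on success, -1 where the witness code would panic. */
int
rec_mtx_lock(struct rec_mtx *m)
{
    if (!m->owned) {
        m->owned = 1;
        return 0;
    }
    if (!(m->opts & REC_MTX_RECURSE))
        return -1; /* recursion on a lock not initialized for it */
    m->state |= REC_MTX_RECURSED;
    m->depth++;
    return 0;
}

void
rec_mtx_unlock(struct rec_mtx *m)
{
    assert(m->owned);
    if (m->depth > 0) {
        if (--m->depth == 0)
            m->state &= ~REC_MTX_RECURSED;
    } else {
        m->owned = 0;
    }
}
```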
|
#
71098 |
|
16-Jan-2001 |
peter |
Stop doing runtime checking on i386 cpus for cpu class. The cpu is slow enough as it is, without having to constantly check that it really is an i386 still. It was possible to compile out the conditionals for faster cpus by leaving out 'I386_CPU', but it was not possible to unconditionally compile for the i386. You got the runtime checking whether you wanted it or not. This makes I386_CPU mutually exclusive with the other cpu types, and tidies things up a little in the process.
Reviewed by: alfred, markm, phk, benno, jlemon, jhb, jake, grog, msmith, jasone, dcs, des (and a bunch more people who encouraged it)
|
#
70950 |
|
12-Jan-2001 |
bmilekic |
Remove useless include of sys/mbuf.h (no longer useful since the mbuf subsystem init was moved to a better place).
|
#
70861 |
|
10-Jan-2001 |
jake |
Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other than curproc.
|
#
70714 |
|
06-Jan-2001 |
jake |
Use %fs to access per-cpu variables in uni-processor kernels the same as multi-processor kernels. The old way made it difficult for kernel modules to be portable between uni-processor and multi-processor kernels. It is no longer necessary to jump through hoops.
- always load %fs with the private segment on entry to the kernel
- change the type of the self referential pointer from struct privatespace to struct globaldata
- make the globaldata symbol have value 0 in all cases, so the symbols in globals.s are always offsets, not aliases for fields in globaldata
- define the globaldata space used for uniprocessor kernels in C, rather than assembler
- change the assembly language accessors to use %fs, add a macro PCPU_ADDR(member, reg), which loads the register reg with the address of the per-cpu variable member
|
#
69972 |
|
13-Dec-2000 |
tanimura |
- If swap metadata does not fit into the KVM, reduce the number of struct swblock entries by dividing the number of the entries by 2 until the swap metadata fits.
- Reject swapon(2) upon failure of swap_zone allocation.
This is just a temporary fix. Better solutions include: (suggested by: dillon)
o reserving swap in SWAP_META_PAGES chunks, and
o swapping the swblock structures themselves.
Reviewed by: alfred, dillon
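The halving strategy described above, sketched as a standalone C function. The name fit_entries and its parameters are hypothetical; the message does not give the actual variable names or sizes. A result of 0 models the case where nothing fits and the allocation fails.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Halve the requested entry count until n entries of entry_size bytes
 * fit into budget bytes. Returns 0 when not even one entry fits.
 */
size_t
fit_entries(size_t n, size_t entry_size, size_t budget)
{
    assert(entry_size > 0);
    while (n > 0 && n > budget / entry_size)
        n /= 2;
    return n;
}
```

Comparing n against budget / entry_size rather than multiplying n by entry_size avoids overflow for large counts.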
|
#
69586 |
|
04-Dec-2000 |
jake |
Remove the last of the MD netisr code. It is now all MI. Remove spending, which was unused now that all software interrupts have their own thread. Make the legacy schednetisr use an atomic op for setting bits in the netisr mask.
Reviewed by: jhb
|
#
69379 |
|
30-Nov-2000 |
marcel |
Don't use p->p_sigstk.ss_flags to keep state of whether the process is on the alternate stack or not. For compatibility with sigstack(2) state is being updated if such is needed.
We now determine whether the process is on the alternate stack by looking at its stack pointer. This allows a process to siglongjmp from a signal handler on the alternate stack to the place of the sigsetjmp on the normal stack. When maintaining state, this would have invalidated the state information and caused a subsequent signal to be delivered on the normal stack instead of the alternate stack.
PR: 22286
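A minimal sketch of deciding "on the alternate stack" from the stack pointer alone, as the message describes. The function name and the half-open range [ss_sp, ss_sp + ss_size) are assumptions made for illustration; the message does not show the actual test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if sp lies inside the alternate stack region, else 0.
 * The unsigned subtraction wraps to a huge value when sp < ss_sp,
 * so one comparison covers both bounds.
 */
int
on_altstack(uintptr_t sp, uintptr_t ss_sp, size_t ss_size)
{
    assert(ss_sp + ss_size >= ss_sp); /* region must not wrap around */
    return (sp - ss_sp) < ss_size;
}
```

Because nothing is stored, a siglongjmp off the alternate stack cannot leave stale state behind: the next check simply sees a stack pointer outside the region.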
|
#
69147 |
|
25-Nov-2000 |
jlemon |
Revert the last commit to the callout interface, and add a flag to callout_init() indicating whether the callout is safe or not. Update the callers of callout_init() to reflect the new interface.
Okayed by: Jake
|
#
68889 |
|
19-Nov-2000 |
jake |
- Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and softclock.
- Make softclock be INTR_MPSAFE and provide a flag, CALLOUT_MPSAFE, which specifies that a callout handler does not need Giant. There is still no way to set this flag when registering a callout.
Reviewed by: -smp@, jlemon
|
#
67708 |
|
27-Oct-2000 |
phk |
Convert all users of fldoff() to offsetof(). fldoff() is bad because it only takes a struct tag which makes it impossible to use unions, typedefs etc.
Define __offsetof() in <machine/ansi.h>
Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
Remove myriad of local offsetof() definitions.
Remove includes of <stddef.h> in kernel code.
NB: Kernel code should *never* include from /usr/include!
Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
Deprecate <struct.h> with a warning. The warning turns into an error on 01-12-2000 and the file gets removed entirely on 01-01-2001.
Partial reviews by: various. Significant brucifications by: bde
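A small illustration of why offsetof() is the more general tool: it accepts any type name, including a typedef for a struct that has no tag, which a macro that prepends `struct' to its argument cannot express. The type record_t and the helper function are hypothetical examples; only offsetof() from <stddef.h> is standard.

```c
#include <assert.h>
#include <stddef.h>

/* A typedef with no struct tag, containing a union member. */
typedef struct {
    char tag;
    union {
        int i;
        double d;
    } u;
} record_t;

/* Byte offset of the union member inside record_t. */
size_t
record_union_offset(void)
{
    size_t off = offsetof(record_t, u);

    assert(off < sizeof(record_t));
    return off;
}
```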
|
#
67694 |
|
27-Oct-2000 |
bde |
Declare or #define per-cpu globals in <machine/globals.h> in all cases. The i386 UP case was messily different.
|
#
67357 |
|
20-Oct-2000 |
jhb |
- machine/mutex.h -> sys/mutex.h
- Use MUTEX_DECLARE() and MTX_COLD for Giant and sched_lock.
|
#
67308 |
|
19-Oct-2000 |
jhb |
Axe the idle_event eventhandler, and add a MD cpu_idle function used for things such as halting CPU's, idling CPU's, etc.
Discussed with: msmith
|
#
67265 |
|
17-Oct-2000 |
jhb |
- Catch up to moving headers, machine/ipl.h -> sys/ipl.h
- Fix some whitespace bogons.
Submitted by: bde (2)
|
#
66716 |
|
06-Oct-2000 |
jhb |
- Change fast interrupts on x86 to push a full interrupt frame and to return through doreti to handle ast's. This is necessary for the clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded. This is needed because both hardclock() and statclock() need to run in the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block during the clock interrupt. Instead, use two p_flag's in the proc struct to mark the current process as having a pending SIGVTALRM or a SIGPROF and let them be delivered during ast() when hardclock() has finished running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyway. It was broken on the x86 if it was turned on since cpl is gone. Its only use was to bogusly run softclock() directly during hardclock() rather than scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin mutex. Since the spin mutex already handles disabling/restoring interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending signals in hardclock() that are to be delivered in ast().
Submitted by: jakeb (making statclock safe in a fast interrupt)
Submitted by: cp (concept of delaying signals until ast())
|
#
66559 |
|
02-Oct-2000 |
peter |
Fix a cosmetic sign problem on machines with 4G of RAM.
0x00312000 - 0xe5fe7fff, 3855441920 bytes (4294859990 pages)
.. becomes
0x00314000 - 0xe5fe7fff, 3855433728 bytes (941268 pages)
|
#
66489 |
|
30-Sep-2000 |
msmith |
More updates to the ACPI code:
- Move all register I/O into acpi_io.c
- Move event handling into acpi_event.c
- Reorganise headers into acpivar/acpireg/acpiio
- Move find-RSDT and find-ACPI-owned-memory into acpi_machdep
- Allocate all resources (except those detailed only by AML) as real resources. Add infrastructure that will make adding resource support to AML code easy.
- Remove all ACPI #ifdefs in non-ACPI code
- Removed unnecessary includes
- Minor style and commenting fixes
Reviewed by: iwasaki
|
#
66475 |
|
30-Sep-2000 |
bmilekic |
Big mbuf subsystem diff #1: incorporate mutexes and fix things up somewhat to accommodate the changes.
Here's a list of things that have changed (I may have left out a few); for a relatively complete list, see http://people.freebsd.org/~bmilekic/mtx_journal
* Remove old (once useful) mcluster code for MCLBYTES > PAGE_SIZE which nobody uses anymore. It was great while it lasted, but now we're moving onto bigger and better things (Approved by: wollman).
* Practically re-wrote the allocation macros in sys/sys/mbuf.h to accommodate new allocations which grab the necessary lock.
* Make sure that necessary mbstat variables are manipulated with corresponding atomic() routines.
* Changed the "wait" routines, cleaned it up, made one routine that does the job.
* Generalized MWAKEUP() macro. Got rid of m_retry and m_retryhdr, as they are now included in the generalized "wait" routines.
* Sleep routines now use msleep().
* Free lists have locks.
* etc... probably other stuff I'm missing...
Things to look out for and work on later:
* find a better way to (dynamically) adjust EXT_COUNTERS
* move necessity to recurse on a lock from drain routines by providing lock-free lower-level version of MFREE() (and possibly m_free()?).
* checkout include of mutex.h in sys/sys/mbuf.h - probably violating general philosophy here.
The code has been reviewed quite a bit, but problems may arise... please, don't panic! Send me Emails: bmilekic@freebsd.org
Reviewed by: jlemon, cp, alfred, others?
|
#
66277 |
|
22-Sep-2000 |
ps |
Remove the NCPU, NAPIC, NBUS, NINTR config options. Make NAPIC, NBUS, NINTR dynamic and set NCPU to a maximum of 16 under SMP.
Reviewed by: peter
|
#
66206 |
|
22-Sep-2000 |
msmith |
Implement halt-on-idle in the !SMP case, which should significantly reduce power consumption on most systems.
|
#
65856 |
|
14-Sep-2000 |
jhb |
Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just use struct mtx, struct witness, and struct witness_blessed.
Requested by: bde
|
#
65811 |
|
13-Sep-2000 |
bde |
Fixed hang on booting with -d. mtx_enter() was called on an uninitialized lock. The quick fix in trap.c was not quite the version tested and had no effect; back it out.
|
#
65587 |
|
07-Sep-2000 |
jake |
Don't use currentldt as an L-value. This should fix options USER_LDT.
Reported-by: John Hay <jhay@zibbi.mikom.csir.co.za> Nickolay Dudorov <nnd@mail.nsk.ru>
|
#
65557 |
|
06-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
#
65389 |
|
03-Sep-2000 |
peter |
Complain if we cannot find loader(8) metadata.
|
#
65292 |
|
31-Aug-2000 |
takawata |
Merge the remaining pieces of the ACPI driver. To activate the acpi driver, add a
device acpi
line. The merge is finished, but the driver is still in an experimental phase and needs more hacking!
Obtained from: ACPI for FreeBSD project
|
#
64837 |
|
19-Aug-2000 |
dwmalone |
Replace the mbuf external reference counting code with something that should be better.
The old code counted references to mbuf clusters by using the offset of the cluster from the start of memory allocated for mbufs and clusters as an index into an array of chars, which did the reference counting. If the external storage was not a cluster then reference counting had to be done by the code using that external storage.
NetBSD's system of linked lists of mbufs was considered, but Alfred felt it would have locking issues when the kernel was made more SMP friendly.
The system implemented uses a pool of unions to track external storage. The union contains an int for counting the references and a pointer for forming a free list. The reference counts are incremented and decremented atomically and so should be SMP friendly. This system can track reference counts for any sort of external storage.
Access to the reference counting stuff is now through macros defined in mbuf.h, so it should be easier to make changes to the system in the future.
The possibility of storing the reference count in one of the referencing mbufs was considered, but was rejected 'cos it would often leave extra mbufs allocated. Storing the reference count in the cluster was also considered, but because the external storage may not be a cluster this isn't an option.
The size of the pool of reference counters is available in the stats provided by "netstat -m".
PR: 19866 Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: alfred (glanced at by others on -net)
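The "pool of unions" idea above can be sketched as follows. This is a simplified, single-threaded model: the names ext_ref, pool_init, ref_alloc, ref_hold and ref_release and the pool size are hypothetical, and plain int arithmetic is used where the message says the real counters are updated atomically.

```c
#include <assert.h>
#include <stddef.h>

/* One slot is either a live reference count or a link on the free list. */
union ext_ref {
    int refcnt;
    union ext_ref *next_free;
};

#define POOL_SIZE 4 /* arbitrary size for the sketch */

static union ext_ref pool[POOL_SIZE];
static union ext_ref *free_head;

/* Thread every slot onto the free list. */
void
pool_init(void)
{
    size_t i;

    free_head = NULL;
    for (i = 0; i < POOL_SIZE; i++) {
        pool[i].next_free = free_head;
        free_head = &pool[i];
    }
}

/* Take a counter off the free list with one reference; NULL if exhausted. */
union ext_ref *
ref_alloc(void)
{
    union ext_ref *r = free_head;

    if (r == NULL)
        return NULL;
    free_head = r->next_free;
    r->refcnt = 1;
    return r;
}

void
ref_hold(union ext_ref *r)
{
    r->refcnt++;
}

/* Drop a reference; returns 1 when the counter went back to the free list. */
int
ref_release(union ext_ref *r)
{
    assert(r->refcnt > 0);
    if (--r->refcnt > 0)
        return 0;
    r->next_free = free_head;
    free_head = r;
    return 1;
}
```

Sharing storage between the count and the free-list link is what keeps each slot as small as a single pointer or int, whichever is larger.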
|
#
64781 |
|
17-Aug-2000 |
bsd |
Don't let an illegal value for dr7 get set, which can lead to an unexpected TRCTRAP.
Reported by: John W. De Boskey <jwd@FreeBSD.org>
|
#
64592 |
|
13-Aug-2000 |
jhb |
Include machine/cputypes.h so we get the cpu_class variable. This is needed if I386_CPU is defined in the kernel config file.
|
#
64529 |
|
11-Aug-2000 |
peter |
Clean up some low level bootstrap code:
- stop using the evil 'struct trapframe' argument for mi_startup() (formerly main()). There are much better ways of doing it.
- do not use prepare_usermode() - setregs() in execve() will do it all for us as long as the p_md.md_regs pointer is set. (which is now done in machdep.c rather than init_main.c. The Alpha port did it this way all along and is much cleaner).
- collect all the magic %cr0 etc register settings into one place and have the AP's call that instead of using magic numbers (!!) that keep changing over and over again.
- Make it safe to call kthread_create() earlier, including during the device probe sequence. It doesn't need the callback mechanism that NetBSD's version uses.
- kthreads created this way are root-less as they exist before the root filesystem is mounted. init(1) is set up so that it acquires the root pointers prior to running. If other kthreads want filesystem access we can make this code more generic.
- set all threads start times once we have decided what time it is.
- init uses a trampoline rather than the evil prepare_usermode() hack.
- kern_descrip.c has a couple of tweaks to deal with forking when there is no rootdir or cwd etc.
- adjust the early SYSINIT() sequence so that a few prerequisites are in place. e.g.: make sure the run queue is initialized before doing forks.
With this, the USB code can easily create a kthread to do the device tree discovery. (I have tested it, it works nicely).
There are still some open issues before this is truly useful.
- tsleep() does not like working before the clock is running. It sort-of tries to spin wait, but it can do more useful things now.
- stopping a kthread in kld code at unload time is "interesting" but we have a solution for that.
The Alpha code needs no changes for this. It already uses pretty much the same strategies, but a little cleaner.
|
#
62870 |
|
10-Jul-2000 |
kris |
Don't call printf with no format string.
Reviewed by: msmith
|
#
62573 |
|
04-Jul-2000 |
phk |
Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by: bde
|
#
62454 |
|
03-Jul-2000 |
phk |
Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our sources:
-sysctl_vm_zone SYSCTL_HANDLER_ARGS
+sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
|
#
62057 |
|
25-Jun-2000 |
markm |
Strip out the machine-independent parts of the memory device. /dev/(u)random, /dev/null, /dev/zero are all moving to machine-independent drivers.
Reviewed by: dfr
|
#
61474 |
|
10-Jun-2000 |
peter |
Unused include: #include "ether.h"
|
#
61220 |
|
03-Jun-2000 |
bde |
Fixed some style bugs in the signal handling functions. This doesn't change the object file.
|
#
60041 |
|
05-May-2000 |
phk |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes.
Disk drivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
|
#
59839 |
|
01-May-2000 |
peter |
Move the MSG* and SEM* options to opt_sysvipc.h. Remove evil allocation macros from machdep.c (why was that there???) and use malloc() instead. Move parameters out of param.h and into the code itself. Move a bunch of internal definitions from public sys/*.h headers (without #ifdef _KERNEL even) into the code itself.
I had hoped to make some of this more dynamic, but the cost of doing wakeups on all sleeping processes on old arrays was too frightening. The other possibility is to initialize on the first use, and allow dynamic sysctl changes to parameters right until that point. That would allow /etc/rc.sysctl to change SEM* and MSG* defaults as we presently do with SHM*, but without the nightmare of changing a running system.
|
#
59604 |
|
24-Apr-2000 |
obrien |
* Use sys/sys/random.h rather than an i386 specific one.
* There was nothing that should be machine dependent about i386/isa/random_machdep.c, so it is now sys/kern/kern_random.c.
|
#
59249 |
|
15-Apr-2000 |
phk |
Complete the bio/buf divorce for all code below devfs::strategy
Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case.
CCD not converted yet, casts to struct buf (still safe)
atapi-cd casts to struct buf to examine B_PHYS
|
#
58934 |
|
02-Apr-2000 |
phk |
Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)
Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.
Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack.
Add bio_queue field for struct bio aware disksort.
Address a lot of stylistic issues brought up by bde.
|
#
58820 |
|
30-Mar-2000 |
peter |
Make sysv-style shared memory tuneable params fully runtime adjustable via sysctl. It's done pretty simply but it should be quite adequate. Also move SHMMAXPGS from $machine/include/vmparam.h as the comments that went with it were wrong... we don't allocate KVM space for the pages so that comment is bogus.. The only practical limit is how much physical ram you want to lock up as this stuff isn't paged out or swap backed.
|
#
58706 |
|
27-Mar-2000 |
dillon |
Commit the buffer cache cleanup patch to 4.x and 5.x. This patch fixes a fragmentation problem due to geteblk() reserving too much space for the buffer and imposes a larger granularity (16K) on KVA reservations for the buffer cache to avoid fragmentation issues. The buffer cache size calculations have been redone to simplify them (fewer defines, better comments, less chance of running out of KVA).
The geteblk() fix solves a performance problem that DG was able to reproduce.
This patch does not completely fix the KVA fragmentation problems, but it goes a long way.
Mostly Reviewed by: bde and others Approved by: jkh
|
#
58345 |
|
20-Mar-2000 |
phk |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch was machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
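The "exactly one bit set" rule for b_iocmd can be checked with a standard power-of-two test. The sketch below is illustrative only: the TOY_BIO_* values and function names are hypothetical (the message does not give the actual constants), and assert() stands in for the panic the message warns about.

```c
#include <assert.h>

/* Hypothetical command bits; none of them is zero, unlike the old B_WRITE. */
#define TOY_BIO_READ  0x1u
#define TOY_BIO_WRITE 0x2u
#define TOY_BIO_FREE  0x4u

/* Return 1 if exactly one bit is set in cmd, else 0. */
int
iocmd_valid(unsigned int cmd)
{
    return cmd != 0 && (cmd & (cmd - 1)) == 0;
}

/* Refuse to continue with a malformed command. */
void
iocmd_check(unsigned int cmd)
{
    assert(iocmd_valid(cmd));
}
```

With a zero-valued write flag, "no command given" and "write" are indistinguishable; giving every command its own bit makes a forgotten command detectable.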
|
#
57571 |
|
28-Feb-2000 |
bsd |
Reset the hardware debug registers when exec'ing a new image.
Reviewed by: bde,jlemon Approved by: jkh
|
#
57362 |
|
20-Feb-2000 |
bsd |
Don't forget to reset the hardware debug registers when a process that was using them exits.
Don't allow a user process to cause the kernel to take a TRCTRAP on a user space address.
Reviewed by: jlemon, sef Approved by: jkh
|
#
57178 |
|
13-Feb-2000 |
peter |
Clean up some loose ends in the network code, including the X.25 and ISO #ifdefs. Clean out unused netisr's and leftover netisr linker set gunk. Tested on x86 and alpha, including world.
Approved by: jkh
|
#
54188 |
|
06-Dec-1999 |
luoqi |
User ldt sharing.
|
#
54121 |
|
04-Dec-1999 |
marcel |
oszsigcode -> szosigcode
Pointed out by: bde
|
#
54120 |
|
04-Dec-1999 |
marcel |
Fix type of sf_addr.
Pointed out by: bde
|
#
53648 |
|
23-Nov-1999 |
archie |
Change the prototype of the strto* routines to make the second parameter a char ** instead of a const char **. This makes these kernel routines consistent with the corresponding libc userland routines.
Which is actually 'correct' is debatable, but consistency and following the spec was deemed more important in this case.
Reviewed by (in concept): phk, bde
|
#
53624 |
|
23-Nov-1999 |
green |
Fix a confusion between osigcontext and ucontext_t in the previous commit. Since an osigcontext is smaller, if you check for a valid (much larger sized) ucontext_t and it fails, we bogusly would reject the osigcontext as per rev 1.378. Instead, check for osigcontext range validity first, and ucontext_t later. This unbreaks Netscape.
Pointed to the right commit by: peter
|
#
53504 |
|
21-Nov-1999 |
pho |
Moved useracc() to the top of sigreturn so as to avoid a panic caused by invalid arguments to the routine.
Reviewed by: marcel, phk
|
#
53503 |
|
21-Nov-1999 |
phk |
s/p_cred->pc_ucred/p_ucred/g
|
#
53106 |
|
12-Nov-1999 |
marcel |
Change the type of sf_addr in struct {o}sigframe from char* to register_t.
Fix some style bugs and bitrotted comments.
Submitted by: bde
|
#
52720 |
|
31-Oct-1999 |
alc |
The useracc() calls in osigreturn() and sigreturn() should specify VM_PROT_READ rather than VM_PROT_WRITE. (This mistake predates the B_READ/B_WRITE -> VM_PROT_READ/VM_PROT_WRITE change.)
Submitted by: bde
|
#
52644 |
|
30-Oct-1999 |
phk |
Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the "rw" argument, rather than hijacking B_{READ|WRITE}.
Fix two bugs (physio & cam) resulting from the confusion caused by this.
Submitted by: Tor.Egge@fast.no Reviewed by: alc, ken (partly)
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial, bordering on silly, comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
52452 |
|
24-Oct-1999 |
dillon |
Adjust the buffer cache to better handle small-memory machines. A slightly older version of this code was tested by BDE and me.
Also fixes a lockup situation when kva gets too fragmented.
Remove the maxvmiobufspace variable and sysctl, they are no longer used. Also clean up (remove) #if 0 sections from prior commits.
This code is more of a hack, but presumably the whole buffer cache implementation is going to be rewritten in the next year so it's no big deal.
|
#
52199 |
|
13-Oct-1999 |
marcel |
Fix a security bug. eflags was copied verbatim from userland.
Submitted by: bde
|
#
52150 |
|
12-Oct-1999 |
marcel |
Now that userland, including modules, doesn't use the osig* syscalls and the kernel itself doesn't use any SYS_osig* constants, change the syscalls to be of type COMPAT.
|
#
52140 |
|
11-Oct-1999 |
luoqi |
Add a per-signal flag to mark handlers registered with osigaction, so we can provide the correct context to each signal handler.
Fix broken sigsuspend(): don't use p_oldsigmask as a flag, use SAS_OLDMASK as we did before the linuxthreads support merge (submitted by bde).
Move ps_sigstk from p_sigacts to the main proc structure since signal stack should not be shared among threads.
Move SAS_OLDMASK and SAS_ALTSTACK flags from sigacts::ps_flags to proc::p_flag. Move PS_NOCLDSTOP and PS_NOCLDWAIT flags from proc::p_flag to procsig::ps_flag.
Reviewed by: marcel, jdp, bde
|
#
51984 |
|
07-Oct-1999 |
marcel |
Simplification of the signal trampoline and other cleanups.
o Remove unused defines from genassym.c that were needed by the trampoline.
o Add load_gs_param function to support.s that catches a fault when %gs is loaded with an invalid descriptor. The function returns EFAULT in that case.
o Remove struct trapframe from mcontext_t and replace it with the list of registers.
o Modify sendsig and sigreturn accordingly.
This commit contains a patch by bde.
Reviewed by: luoqi, jdp
|
#
51942 |
|
04-Oct-1999 |
marcel |
Re-introduction of sigcontext.
struct sigcontext and ucontext_t/mcontext_t are defined in such a way that both (ie struct sigcontext and ucontext_t) can be passed on to sigreturn. The signal handler is still given a ucontext_t for maximum flexibility.
For backward compatibility sigreturn restores the state for the alternate signal stack from sigcontext.sc_onstack and not from ucontext_t.uc_stack. A good way to determine which value the application has set and thus which value to use, is still open for discussion.
NOTE: This change should only affect those binaries that use sigcontext and/or ucontext_t. In the source tree itself this is only doscmd. Recompilation is required for those applications.
This commit also fixes a lot of style bugs without hopefully adding new ones.
NOTE: struct sigaltstack.ss_size now has type size_t again. For some reason I changed that into unsigned int.
Parts submitted by: bde sigaltstack bug found by: bde
|
#
51908 |
|
03-Oct-1999 |
marcel |
Reinstate the 4th argument to old signal handlers. Don't set it when the handler uses siginfo_t.
|
#
51834 |
|
01-Oct-1999 |
marcel |
Implement the use of si_addr in siginfo_t.
Suggested by: jdp
|
#
51833 |
|
01-Oct-1999 |
marcel |
Don't check %cs *after* it has been set in sigreturn. If the check fails, applications could end up running in kernel mode (oops).
Submitted by: bde
|
#
51792 |
|
29-Sep-1999 |
marcel |
sigset_t change (part 3 of 5)
-----------------------------
By introducing a new sigframe so that the signal handler operates on the new siginfo_t and on ucontext_t instead of sigcontext, we now need two versions of sendsig and sigreturn.
A flag in struct proc determines whether the process expects an old sigframe or a new sigframe. The signal trampoline handles which sigreturn to call. It does this by testing for a magic cookie in the frame.
The alpha uses osigreturn to implement longjmp. This means that osigreturn is not only used for compatibility with existing binaries. To handle the new sigset_t, setjmp saves it in sc_reserved (see NOTE).
The struct sigframe has been moved from frame.h to sigframe.h to handle the complex header dependencies that were caused by the new sigframe.
NOTE: For the i386, the size of jmp_buf has been increased to hold the new sigset_t. On the alpha this has been prevented by using sc_reserved in sigcontext.
|
#
51065 |
|
07-Sep-1999 |
luoqi |
Save %gs in sigcontext when delivering a signal and restore it upon return (in signal trampoline code). I plan to do the same on -stable, so that we have a consistent interface to userland applications.
Reviewed by: bde
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
49952 |
|
17-Aug-1999 |
msmith |
Mindbogglingly, many BIOS vendors expect to be able to load %ds with 0x40 and then access data stored in real-mode segment 0x40, even when called in protected mode. Microsoft unfortunately coddle these individuals, and so must we if we want to run their code.
This change works around GPFs in some APM and PnP BIOS implementations.
Obtained from: Linux
|
#
49558 |
|
09-Aug-1999 |
phk |
Merge the cons.c and cons.h to the best of my ability. alpha may or may not compile, I can't test it.
|
#
49197 |
|
28-Jul-1999 |
msmith |
Major update to the kernel's BIOS-calling ability.
- Add support for calling 32-bit code in other segments
- Add support for calling 16-bit protected mode code
Update APM to use this facility.
Submitted by: jlemon
|
#
48918 |
|
19-Jul-1999 |
peter |
Fix a page size vs. KB mixup. The extra buffers allocated at a reduced rate are meant to kick in at 64MB, not 256MB.
Reviewed by: Matthew Dillon <dillon@backplane.com>
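The factor of four between 64MB and 256MB is what one would expect from a threshold written in kilobytes being compared against a count of 4KB pages. The sketch below works through that arithmetic. The constant 65536 and all function names are assumptions chosen for illustration; the log does not quote the actual expression that was fixed.

```c
#define SKETCH_PAGE_SIZE 4096u  /* i386 page size in bytes */

/* Convert a page count to kilobytes. */
unsigned long sketch_pages_to_kb(unsigned long pages)
{
    return pages * (SKETCH_PAGE_SIZE / 1024u);
}

/* Memory size in MB at which a threshold fires when it is read as KB. */
unsigned long sketch_threshold_mb_as_kb(unsigned long threshold)
{
    return threshold / 1024u;
}

/* Memory size in MB at which the same number fires when read as pages. */
unsigned long sketch_threshold_mb_as_pages(unsigned long threshold)
{
    return sketch_pages_to_kb(threshold) / 1024u;
}
```

Under these assumptions a threshold of 65536 corresponds to 64MB when interpreted as KB and to 256MB when interpreted as pages.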
|
#
48691 |
|
09-Jul-1999 |
jlemon |
Implement support for hardware debug registers on the i386.
Submitted by: Brian Dean <brdean@unx.sas.com>
|
#
48677 |
|
08-Jul-1999 |
mckusick |
These changes appear to give us benefits with both small (32MB) and large (1G) memory machine configurations. I was able to run 'dbench 32' on a 32MB system without bringing the machine to a grinding halt.
* buffer cache hash table now dynamically allocated. This will have no effect on memory consumption for smaller systems and will help scale the buffer cache for larger systems.
* minor enhancement to pmap_clearbit(). I noticed that all the calls to it used constant arguments. Making it an inline allows the constants to propagate to deeper inlines and should produce better code.
* removal of inherent vfs_ioopt support through the emplacement of appropriate #ifdef's, with John's permission. If we do not find a use for it by the end of the year we will remove it entirely.
* removal of getnewbufloops* counters & sysctl's - no longer necessary for debugging, getnewbuf() is now optimal.
* buffer hash table functions removed from sys/buf.h and localized to vfs_bio.c
* VFS_BIO_NEED_DIRTYFLUSH flag and support code added ( bwillwrite() ), allowing processes to block when too many dirty buffers are present in the system.
* removal of a softdep test in bdwrite() that is no longer necessary now that bdwrite() no longer attempts to flush dirty buffers.
* slight optimization added to bqrelse() - there is no reason to test for available buffer space on B_DELWRI buffers.
* addition of reverse-scanning code to vfs_bio_awrite(). vfs_bio_awrite() will attempt to locate clusterable areas in both the forward and reverse direction relative to the offset of the buffer passed to it. This will probably not make much of a difference now, but I believe we will start to rely on it heavily in the future if we decide to shift some of the burden of the clustering closer to the actual I/O initiation.
* Removal of the newbufcnt and lastnewbuf counters that Kirk added. They do not fix any race conditions that haven't already been fixed by the gbincore() test done after the only call to getnewbuf(). getnewbuf() is a static, so there is no chance of it being misused by other modules. ( Unless Kirk can think of a specific thing that this code fixes. I went through it very carefully and didn't see anything ).
* removal of VOP_ISLOCKED() check in flushbufqueues(). I do not think this check is necessary, the buffer should flush properly whether the vnode is locked or not. ( yes? ).
* removal of extra arguments passed to getnewbuf() that are not necessary.
* missed cluster_wbuild() that had to be a cluster_wbuild_wb() in vfs_cluster.c
* vn_write() now calls bwillwrite() *PRIOR* to locking the vnode, which should greatly aid flushing operations in heavy load situations - both the pageout and update daemons will be able to operate more efficiently.
* removal of b_usecount. We may add it back in later but for now it is useless. Prior implementations of the buffer cache never had enough buffers for it to be useful, and current implementations which make more buffers available might not benefit relative to the amount of sophistication required to implement a b_usecount. Straight LRU should work just as well, especially when most things are VMIO backed. I expect that (even though John will not like this assumption) directories will become VMIO backed at some point soon.
Submitted by: Matthew Dillon <dillon@backplane.com> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
|
#
48621 |
|
06-Jul-1999 |
cracauer |
Implement SA_SIGINFO for i386. Thanks to Bruce Evans for much more than a review, this was a nice puzzle.
This is supposed to be binary and source compatible with older applications that access the old FreeBSD-style three arguments to a signal handler.
Except those applications that access hidden signal handler arguments beyond the documented third one. If you have applications that do, please let me know so that we take the opportunity to provide the functionality they need in a documented manner.
Also except applications that use 'struct sigframe' directly. You need to recompile gdb and doscmd. `make world` is recommended.
Example program that demonstrates how SA_SIGINFO and old-style FreeBSD handlers (with their three args) may be used in the same process is at http://www3.cons.org/tmp/fbsd-siginfo.c
Programs that use the old FreeBSD-style three arguments are easy to change to SA_SIGINFO (although they don't need to, since the old style will still work):
Old args to signal handler:
void handler_sn(int sig, int code, struct sigcontext *scp)
New args:
void handler_si(int sig, siginfo_t *si, void *third)
where:
old:code == new:si->si_code
old:scp == &(new:si->si_scp) /* Passed by value! */
The latter is also pointed to by new:third, but accessing via si->si_scp is preferred because it is type-safe.
FreeBSD implementation notes:
- This is just the framework to make the interface POSIX compatible. For now, no additional functionality is provided. This is supposed to happen now, starting with floating point values.
- We don't use 'siginfo_t.si_value' for now (POSIX meant it for realtime-related values).
- Documentation will be updated when new functionality is added and the exact arguments passed are determined. The comments in sys/signal.h are meant to be useful.
Reviewed by: BDE
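For readers unfamiliar with the new-style handler, here is a minimal sketch of installing one with SA_SIGINFO through the standard POSIX sigaction(2) interface. It is not the example program referenced in the commit message; the sketch_ names are made up for illustration, and only portable siginfo_t fields are touched (the si_scp member mentioned above is specific to this FreeBSD revision and is deliberately not used).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Written by the handler, read back by the caller. */
static volatile sig_atomic_t sketch_seen_signo;

/* New-style handler: second argument is a siginfo_t pointer. */
static void sketch_handler_si(int sig, siginfo_t *si, void *third)
{
    (void)sig;
    (void)third;
    sketch_seen_signo = si->si_signo;
}

/* Install the handler for SIGUSR1, raise it, and report what was seen. */
int sketch_install_and_raise(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sketch_handler_si;  /* not sa_handler */
    sa.sa_flags = SA_SIGINFO;             /* request the three-argument form */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (raise(SIGUSR1) != 0)
        return -1;
    return (int)sketch_seen_signo;
}
```

In a single-threaded program the handler runs before raise() returns, so the function returns SIGUSR1 when delivery worked.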
|
#
48579 |
|
05-Jul-1999 |
msmith |
Move the initialisation/tuning of nmbclusters from param.c/machdep.c into uipc_mbuf.c. This reduces three sets of identical tunable code to one set, and puts the initialisation with the mbuf code proper.
Make NMBUFs tunable as well.
Move the nmbclusters sysctl here as well.
Move the initialisation of maxsockets from param.c to uipc_socket2.c, next to its corresponding sysctl.
Use the new tunable macros for the kern.vm.kmem.size tunable (this should have been in a separate commit, whoops).
|
#
48546 |
|
04-Jul-1999 |
jlemon |
Some cleanup and rearrangement. hw.physmem is now an absolute quantity; we will never use more memory than this value (if specified), but will always check memory for validity up to this amount.
Get rid of the speculative_mprobe option; the memory amount can now be specified by hw.physmem.
|
#
48544 |
|
03-Jul-1999 |
mckusick |
The buffer queue mechanism has been reformulated. Instead of having QUEUE_AGE, QUEUE_LRU, and QUEUE_EMPTY we instead have QUEUE_CLEAN, QUEUE_DIRTY, QUEUE_EMPTY, and QUEUE_EMPTYKVA. With this patch clean and dirty buffers have been separated. Empty buffers with KVM assignments have been separated from truly empty buffers. getnewbuf() has been rewritten and now operates in a 100% optimal fashion. That is, it is able to find precisely the right kind of buffer it needs to allocate a new buffer, defragment KVM, or to free-up an existing buffer when the buffer cache is full (which is a steady-state situation for the buffer cache).
Buffer flushing has been reorganized. Previously buffers were flushed in the context of whatever process hit the conditions forcing buffer flushing to occur. This resulted in processes blocking on conditions unrelated to what they were doing. This also resulted in inappropriate VFS stacking chains due to multiple processes getting stuck trying to flush dirty buffers or due to a single process getting into a situation where it might attempt to flush buffers recursively - a situation that was only partially fixed in prior commits. We have added a new daemon called the buf_daemon which is responsible for flushing dirty buffers when the number of dirty buffers exceeds the vfs.hidirtybuffers limit. This daemon attempts to dynamically adjust the rate at which dirty buffers are flushed such that getnewbuf() calls (almost) never block.
The number of nbufs and amount of buffer space is now scaled past the 8MB limit that was previously imposed for systems with over 64MB of memory, and the vfs.{lo,hi}dirtybuffers limits have been relaxed somewhat. The number of physical buffers has been increased with the intention that we will manage physical I/O differently in the future.
reassignbuf previously attempted to keep the dirtyblkhd list sorted which could result in non-deterministic operation under certain conditions, such as when a large number of dirty buffers are being managed. This algorithm has been changed. reassignbuf now keeps buffers locally sorted if it can do so cheaply, and otherwise gives up and adds buffers to the head of the dirtyblkhd list. The new algorithm is deterministic but not perfect. The new algorithm greatly reduces problems that previously occurred when write_behind was turned off in the system.
The P_FLSINPROG proc->p_flag bit has been replaced by the more descriptive P_BUFEXHAUST bit. This bit allows processes working with filesystem buffers to use available emergency reserves. Normal processes do not set this bit and are not allowed to dig into emergency reserves. The purpose of this bit is to avoid low-memory deadlocks.
A small race condition was fixed in getpbuf() in vm/vm_pager.c.
Submitted by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
|
#
48476 |
|
02-Jul-1999 |
msmith |
Lightly overhaul the memory sizing code again.
- The kernel environment variable 'hw.physmem' can be used to set the amount of physical memory space, based at 0, that FreeBSD will use. Any memory detected over this limit is ignored. Documentation for this is available under 'help set tunables' in the loader.
- In the case where system memory size can't be accurately determined, hw.physmem is used as a best-guess memory size, but speculative probing will be used to determine actual memory size if any of the guesses or hints are 16M or more.
- If RB_VERBOSE, we list the memory regions as we test them.
- The compile-time option MAXMEM supplies a default value for 'hw.physmem'.
|
#
48445 |
|
02-Jul-1999 |
peter |
Zap totally the npx0 memory size override. It only worked if statically specified in the kernel config file - but setting options MAXMEM works exactly the same. Userconfig overrides of this have not worked for ages.
Also, change the getenv for the loader override to hw.physmem based on a prior suggestion from Mike Smith. I think he still wants to change this some, but this shouldn't get in his way. This is a forced setting of the memory size, not a "cap". We probably should have a plain 'maxmem' variable as well which does do a cap, without losing the bios memory configuration data.
|
#
48405 |
|
01-Jul-1999 |
peter |
Look up the kernel environment for MAXMEM as a final override for the memory size. If somebody wants to change the name, fine - I used this since it's consistent with the config variable it replaces. This is intended to replace the npx0 msize hack (which no longer works).
|
#
48404 |
|
01-Jul-1999 |
peter |
Move kern_envp and preload initialization a little earlier so that we can do a getenv_int() inside the memory sizing routines to override the memory limit.
|
#
48327 |
|
28-Jun-1999 |
luoqi |
Save common_tssd before it's loaded and the busy bit set.
Submitted by: bde
|
#
48203 |
|
24-Jun-1999 |
jlemon |
Fix warning message; that was 4GB, not 2GB. I apparently can't do arithmetic today.
|
#
48202 |
|
24-Jun-1999 |
jlemon |
Explicitly ignore any memory > 2GB, we don't support it yet.
|
#
48005 |
|
18-Jun-1999 |
bde |
Changed the global `idt' from an array to a pointer so that npx.c automatically hacks on the active copy of the IDT if f00f_hack() has changed it. This also allows simplifications in setidt(). This fixes breakage of FP exception handling by rev.1.55 of sys/kernel.h. FP exceptions were sent to npx.c's probe handlers because npx.c "restored" the old handlers to the wrong copy of the IDT. The SYSINIT for f00f_hack() was purposely run quite late to avoid problems like this, but it is bogusly associated with the SYSINIT for proc0 so it was moved with the latter.
Problem reported and fix tested by: Martin Cracauer <cracauer@cons.org>
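Why a pointer fixes this can be shown with a toy model. In the sketch below, every name is hypothetical and the "gate" is reduced to a single field; it only illustrates that code which writes through a global pointer automatically follows the table when it is relocated, whereas code that names the original array keeps modifying a stale copy.

```c
#include <string.h>

#define SKETCH_NIDT 4

struct sketch_gate {
    unsigned handler;
};

static struct sketch_gate sketch_idt0[SKETCH_NIDT];      /* original table */
static struct sketch_gate sketch_idt_copy[SKETCH_NIDT];  /* relocated table */

/* The global is a pointer to the active table, not the table itself. */
struct sketch_gate *sketch_idt = sketch_idt0;

/* Stand-in for the relocation step: copy the table and switch to the copy. */
void sketch_relocate_idt(void)
{
    memcpy(sketch_idt_copy, sketch_idt0, sizeof(sketch_idt0));
    sketch_idt = sketch_idt_copy;
}

/* Always writes to whichever table is currently active. */
void sketch_setidt(int vec, unsigned handler)
{
    sketch_idt[vec].handler = handler;
}

unsigned sketch_active_handler(int vec)
{
    return sketch_idt[vec].handler;
}
```

A caller that saves and later restores a handler through sketch_setidt() ends up modifying the relocated table, which is the behaviour the commit wants from npx.c.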
|
#
47892 |
|
13-Jun-1999 |
alc |
Use pmap_kenter instead of pmap_enter to map the message buffer.
|
#
47862 |
|
10-Jun-1999 |
jlemon |
Change variable used for calculating ending address of physical memory from 'int' to 'vm_offset_t'.
Spotted by: Richard Cownie <tich@ma.ikos.com>
|
#
47688 |
|
01-Jun-1999 |
jlemon |
Unbreak memory sizing for SMP.
|
#
47679 |
|
01-Jun-1999 |
jlemon |
Null commit; note that there is a new memory sizing routine that uses the BIOS calls to determine the memory configuration. This should fix problems with >64M for good.
Reviewed by: Mike Smith
|
#
47678 |
|
01-Jun-1999 |
jlemon |
Unifdef VM86.
Reviewed by: silence on -current
|
#
47642 |
|
31-May-1999 |
dfr |
Remove fd driver from its old home and change files which include rtc.h to account for its new location.
|
#
47081 |
|
12-May-1999 |
luoqi |
Unbreak VESA on SMP.
|
#
46539 |
|
05-May-1999 |
luoqi |
Initialize dblfault_tss.tss_fs to the per-cpu private data segment selector.
|
#
46537 |
|
05-May-1999 |
luoqi |
Do not set curproc until proc0 is fully initialized (in proc0_init()).
|
#
46129 |
|
27-Apr-1999 |
luoqi |
Enable vmspace sharing on SMP. Major changes are:
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages, per-cpu global variables are now accessed through this new selector (%fs). The selectors in gdt table are rearranged for cache line optimization.
- fast_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.
Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>
|
#
46089 |
|
26-Apr-1999 |
peter |
Register the netisr's via SYSINIT rather than linker sets.
|
#
45821 |
|
19-Apr-1999 |
peter |
unifdef -DVM_STACK - it's been on for a while for x86 and was checked and appeared to be working for the Alpha some time ago.
|
#
45720 |
|
16-Apr-1999 |
peter |
Bring the 'new-bus' to the i386. This extensively changes the way the i386 platform boots, it is no longer ISA-centric, and is fully dynamic. Most old drivers compile and run without modification via 'compatibility shims' to enable a smoother transition. eisa, isapnp and pccard* are not yet using the new resource manager. Once fully converted, all drivers will be loadable, including PCI and ISA.
(Some other changes appear to have snuck in, including a port of Soren's ATA driver to the Alpha. Soren, back this out if you need to.)
This is a checkpoint of work-in-progress, but is quite functional.
The bulk of the work was done over the last few years by Doug Rabson and Garrett Wollman.
Approved by: core
|
#
45270 |
|
03-Apr-1999 |
jdp |
Restore support for executing BSD/OS binaries on the i386 by passing the address of the ps_strings structure to the process via %ebx. For other kinds of binaries, %ebx is still zeroed as before.
Submitted by: Thomas Stephens <tas@stephens.org> Reviewed by: jdp
|
#
44510 |
|
06-Mar-1999 |
wollman |
Expose a slightly-lower-level interface to timeouts which allows callers to manage their own memory. Tested on my machine (make buildworld). I've made analogous changes on the alpha, but don't have a machine to test.
Not-objected-to by: dg, gibbs
|
#
43970 |
|
13-Feb-1999 |
bde |
Don't pass PSL_NT to vm86 signal handlers. Some vm86/real mode programs, including msdos, set PSL_NT in probes for old cpu types, although PSL_NT doesn't do anything useful in vm86 or real mode. PSL_NT is even less useful in the signal handlers. It just causes T_TSSFLT faults on return from syscalls made by the handlers. These faults are fixed up lazily so that Xsyscall() doesn't have to be slowed down to prevent them. The fault handler recently started complaining about these faults occurring "with interrupts disabled". It should not have, but the complaints pointed to this bug.
PR: 9211
|
#
43887 |
|
11-Feb-1999 |
msmith |
Zero p->retval[1] when starting a process. This value ends up in %edx when the process starts, and having it nonzero causes statically-linked Linux binaries to fail.
PR: i386/10015 Submitted by: Marcel Moolenaar <marcel@scc.nl>
|
#
43564 |
|
03-Feb-1999 |
dg |
Fixed the type of target_page to vm_offset_t (unsigned). This fixes a panic during boot on machines with >=2GB of RAM. Also changed some incorrect printf conversion specifiers from %d to %u (signed to unsigned). This fixes bugs when printing the amount of memory on machines with >=2GB of RAM.
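The 2GB boundary matters because 2^31 bytes is the first value that no longer fits in a signed 32-bit integer. The following sketch, with hypothetical function names, checks for that boundary and formats a byte count with an unsigned conversion, which is the userland analogue of switching from %d to %u.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Nonzero if the value cannot be represented in a signed 32-bit int. */
int sketch_overflows_int32(uint32_t bytes)
{
    return bytes > (uint32_t)INT32_MAX;
}

/* Format a byte count using an unsigned conversion specifier. */
int sketch_format_bytes(char *buf, size_t len, uint32_t bytes)
{
    return snprintf(buf, len, "%" PRIu32, bytes);
}
```

Exactly 2GB (0x80000000) is the smallest value for which sketch_overflows_int32() reports a problem, matching the ">=2GB" wording in the commit message.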
|
#
43340 |
|
28-Jan-1999 |
newton |
Sun Bug ID 1251858 (on http://sunsolve1.sun.com) discusses the way that Sun implemented iBCS2 compatibility on Solaris >= 2.6: The emulator runs in user-mode, patching the LDT so that client programs making syscalls through the old iBCS2 call gate get handled by the emulator process. Unemulated syscalls therefore need their own call-gate that bypasses the emulator. Sun chose LDT entry 4 to implement this, which is what we've been using as LUDATA_SEL, so we need to change LUDATA_SEL if we want to run Solaris executables.
Discussed with: Mike Smith
|
#
42705 |
|
15-Jan-1999 |
msmith |
Fetch an override for NMBCLUSTERS from the kernel environment. Never allow the value to be reduced below that defined when the kernel was built.
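The rule stated here amounts to taking the larger of the compiled-in value and the environment override. A minimal sketch follows, with a made-up compile-time default of 1024 and a hypothetical function name; how the override is actually read from the kernel environment is not shown.

```c
/* Illustrative compile-time default; the real value depends on the kernel config. */
#define SKETCH_NMBCLUSTERS 1024

/*
 * Apply an optional override. The override may only raise the value;
 * anything at or below the compiled-in default is ignored.
 */
int sketch_nmbclusters(int have_override, int override)
{
    int n = SKETCH_NMBCLUSTERS;

    if (have_override && override > n)
        n = override;
    return n;
}
```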
|
#
42437 |
|
09-Jan-1999 |
bde |
Fixed switching between consoles (sc0, vt0 or sioN) in userconfig.
Broken in: rev.1.315
|
#
42360 |
|
06-Jan-1999 |
julian |
Add (but don't activate) code for a special VM option to make downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port.
The code is conditional on VM_STACK. Not using this will produce the old heavily tested system.
Submitted by: Richard Seaman <dick@tar.com>
|
#
41871 |
|
16-Dec-1998 |
bde |
Removed the cast to a pointer in the definition of PS_STRINGS and adjusted related casts to match (only in the kernel in this commit). The pointer was only wanted in one place in kern_exec.c. Applications should use the kern.ps_strings sysctl instead of PS_STRINGS, so they shouldn't notice this change.
|
#
41629 |
|
09-Dec-1998 |
steve |
Clean up the wording for the F00F bug workaround message.
PR: 8041 Submitted by: Dan Nelson <dnelson@emsphone.com>
|
#
41454 |
|
02-Dec-1998 |
kato |
- For some old Cyrix CPUs, %cr2 is clobbered by interrupts. This problem is worked around by using an interrupt gate for the page fault handler. This code was originally made for NetBSD/pc98 by Naofumi Honda <honda@kururu.math.sci.hokudai.ac.jp> and has already been in PC98 tree. Because of this bug, trap_fatal cannot show correct page fault address if %cr2 is obtained in this function. Therefore, trap_fatal uses the value from trap() function.
- The trap handler always enables interruption when buggy application or kernel code has disabled interrupts and then trapped. This code was prepared by Bruce Evans <bde@FreeBSD.org>.
Submitted by: Bruce Evans <bde@FreeBSD.org> Naofumi Honda <honda@kururu.math.sci.hokudai.ac.jp>
|
#
41362 |
|
26-Nov-1998 |
eivind |
Staticize.
|
#
40866 |
|
03-Nov-1998 |
msmith |
Remove the USERCONFIG_BOOT option. Userconfig script data is searched for in a loaded module of type "userconfig_script". The RB_CONFIG flag will always result in the user being left inside userconfig at the end of the script's execution, regardless of 'quit' commands in the script. If the RB_CONFIG flag is not specified, the user will never be left inside userconfig, even if the script does not have an explicit exit command.
Add the INTRO_USERCONFIG option. This option forces the userconfig 'intro' screen (after a script has optionally been executed). There is no longer a need to queue an 'intro' command.
|
#
40751 |
|
30-Oct-1998 |
msmith |
Add the ability to specify where on the at_shutdown queue a handler is installed.
Remove cpu_power_down, and replace it with an entry at the end of the SHUTDOWN_FINAL queue in the only place it's used (APM).
Submitted by: Some ideas from Bruce Walter <walter@fortean.com>
|
#
40152 |
|
09-Oct-1998 |
peter |
Relocate the preload module info from machdep specifically rather than trying to do it in locore. We also walk through the module table and relocate any MODINFO_ADDR pointers so that they become KVM relative rather than physical addresses. This means that hacks for adding 0xf0000000 in places like MFS go away.
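The relocation described above is, at its core, adding the kernel's virtual base to each physical address. The sketch below shows only that step on a flat array. The base 0xf0000000 is taken from the commit message; the function name and the flat-array layout are simplifications, since the real code walks the module metadata table and touches only MODINFO_ADDR records.

```c
#include <stddef.h>
#include <stdint.h>

/* Kernel virtual base mentioned in the commit message. */
#define SKETCH_KERNBASE 0xf0000000u

/* Turn an array of physical addresses into kernel-virtual addresses in place. */
void sketch_relocate_addrs(uint32_t *addrs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        addrs[i] += SKETCH_KERNBASE;
}
```

Once this is done in one place at startup, consumers no longer need to add the base themselves.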
|
#
40089 |
|
08-Oct-1998 |
msmith |
Initialise kernel environment and module metadata pointers.
|
#
39760 |
|
29-Sep-1998 |
abial |
Add sysctl 'machdep.msgbuf_clear'. Setting it to anything causes the kernel message buffer to be cleared. It comes in handy in situations when the only logging facility you have is the msgbuf.
Reviewed by: jkh
|
#
39648 |
|
25-Sep-1998 |
peter |
Goodbye BOUNCE_BUFFERS, for a hack it has served us well.
The last consumer of this code (the old SCSI system) has left us and the CAM code does its own bouncing. The isa dma system has been doing its own bouncing for a while too.
Reviewed by: core
|
#
39197 |
|
14-Sep-1998 |
jdp |
Add new functions fill_fpregs() and set_fpregs(), like fill_regs() and set_regs() but for the floating point register state. The code is stolen from procfs_machdep.c, and moved out of there into machdep.c.
These functions are needed for generating ELF core dumps.
|
#
39176 |
|
14-Sep-1998 |
abial |
This implements retrieving the contents of message buffer via sysctl(3) as "machdep.msgbuf". It's needed in case of using stripped kernels, where normal dmesg (which has to use kvm) doesn't work.
The buffer is unwound, meaning that the data will be linear, possibly with some leading NULLs.
Reviewed by: Jordan K. Hubbard <jkh@freebsd.org>
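"Unwound" here means the circular buffer is copied out starting at the oldest byte so the result reads in chronological order. A minimal sketch of that copy follows; the function name and parameters are hypothetical and do not reflect the actual msgbuf structure.

```c
#include <stddef.h>

/*
 * Copy a circular buffer of `size` bytes into `out` in chronological order.
 * `next` is the index where the next byte would be written, which is also
 * the position of the oldest byte once the buffer has wrapped.
 */
void sketch_unwind(const char *ring, size_t size, size_t next, char *out)
{
    size_t i;

    for (i = 0; i < size; i++)
        out[i] = ring[(next + i) % size];
}
```

If the buffer has not wrapped yet, the region from `next` to the end is still zero-filled and is copied first, which is where the leading NUL bytes mentioned above come from.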
|
#
38717 |
|
01-Sep-1998 |
kato |
- Fix style bug.
- hw.ispc98 -> machdep.ispc98.
Submitted by: Garrett Wollman (hw -> machdep)
|
#
38700 |
|
31-Aug-1998 |
luoqi |
Use 16bit register in inline asm code to set segment registers.
|
#
38673 |
|
31-Aug-1998 |
kato |
- hw.machine_arch returns cpu architecture type.
- Moved definition of MACHINE_ARCH from cpu.h to param.h, as on alpha.
- Added definitions of _MACHINE and _MACHINE_ARCH.
- Added hw.ispc98. The hw.ispc98 is 1 in PC98 kernel and is 0 in IBM-PC kernel.
Discussed with: John Birrell <jb@FreeBSD.ORG>
|
#
38422 |
|
18-Aug-1998 |
msmith |
Presently there is only one `currentldt' variable for all cpus in a SMP system. Unexpected things could happen if each cpu has a different ldt setting and one cpu tries to use value of currentldt set by another cpu.
The fix is to move currentldt to the per-cpu area. It includes patches I filed in PR i386/6219 which are also user ldt related.
PR: i386/7591, i386/6219 Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>
|
#
37555 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
37315 |
|
30-Jun-1998 |
phk |
Add 3 sysctl variables for future use by ps(1).
|
#
37099 |
|
21-Jun-1998 |
bde |
Removed unused includes. Ifdefed conditionally used includes.
|
#
37034 |
|
17-Jun-1998 |
bde |
Don't declare isa device structs or isa interrupt handlers in <sys/conf>, and don't depend on them being declared there. This will cause lots of warnings for a few minutes until config is updated. Interrupt handlers should never have been configured by config, and the machine generated declarations get in the way of changing the arg type from int to void *.
|
#
36735 |
|
07-Jun-1998 |
dfr |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days time.
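A driver that has to build on both sides of the ioctl argument change could select the command type with a preprocessor test, roughly as sketched below. The typedef, the command constant and the handler are hypothetical. The sketch also tests with >= rather than the == written in the commit message, on the assumption that later versions keep the u_long type. On systems where __FreeBSD_version is not defined the fallback branch is taken.

```c
#ifdef __FreeBSD__
#include <sys/param.h>  /* provides __FreeBSD_version */
#endif

/* Pick the ioctl command type depending on the kernel version. */
#if defined(__FreeBSD_version) && __FreeBSD_version >= 300003
typedef unsigned long sketch_ioctl_cmd_t;
#else
typedef int sketch_ioctl_cmd_t;
#endif

#define SKETCH_CMD_PING 1  /* made-up command number */

/* Toy ioctl handler: accepts only the one known command. */
int sketch_drv_ioctl(sketch_ioctl_cmd_t cmd)
{
    return cmd == SKETCH_CMD_PING ? 0 : -1;
}
```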
|
#
36605 |
|
03-Jun-1998 |
bde |
Ifdefed the netisr support.
PR: 6760 Reviewed by: joerg
|
#
36441 |
|
28-May-1998 |
phk |
Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.
Clean up (or if antipodic: down) some of the msgbuf stuff.
Use an inline function rather than a macro for timecounter delta.
Maintain process "on-cpu" time as 64 bits of microseconds to avoid needless second rollover overhead.
Avoid calling microuptime the second time in mi_switch() if we do not pass through _idle in cpu_switch()
This should reduce our context-switch overhead a bit, in particular on pre-P5 and SMP systems.
WARNING: Programs which muck about with struct proc in userland will have to be fixed.
Reviewed, but found imperfect by: bde
|
#
36179 |
|
19-May-1998 |
phk |
Make the size of the msgbuf (dmesg) a "normal" option.
|
#
36168 |
|
18-May-1998 |
tegge |
Disallow reading the current kernel stack. Only the user structure and the current registers should be accessible.
Reviewed by: David Greenman <dg@root.com>
|
#
35076 |
|
06-Apr-1998 |
peter |
clean up #ifdefs, define the variables that have to be per-cpu on SMP in globals.s only and use externs always.
|
#
34840 |
|
23-Mar-1998 |
jlemon |
Add the ability to make real-mode BIOS calls from the kernel. Currently, everything is contained inside #ifdef VM86, so this option must be present in the config file to use this functionality.
Thanks to Tor Egge, these changes should work on SMP machines. However, they may not be thoroughly SMP-safe.
Currently, the only BIOS calls made are memory-sizing routines at bootup, these replace reading the RTC values.
|
#
34197 |
|
07-Mar-1998 |
tegge |
The APs now reload the interrupt descriptor table pointer after f00f_hack has run.
Use the global r_idt descriptor in f00f_hack when in SMP mode, so the APs find the relocated interrupt descriptor table.
Submitted by: Partially from David A Adkins <adkin003@tc.umn.edu>
|
#
34057 |
|
05-Mar-1998 |
tegge |
Use t_idt instead of idt inside setidt() if f00f_hack() has relocated the IDT.
Submitted by: Bruce Evans <bde@zeta.org.au>
|
#
33983 |
|
02-Mar-1998 |
peter |
Update the ELF image activator to use some of the exec resources rather than rolling its own. This means that it now uses the "safe" exec_map_first_page() to get the ld.so headers rather than risking a panic on a page fault failure (eg: NFS server goes down). Since all the ELF tools go to a lot of trouble to make sure everything lives in the first page for executables, this is a win. I have not seen any ELF executable on any system where all the headers didn't fit in the first page with lots of room to spare. I have been running variations of this code for some time on my pure ELF systems.
|
#
33179 |
|
09-Feb-1998 |
eivind |
Remove warnings from f00f_hack.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
33051 |
|
03-Feb-1998 |
bde |
Ifdefed some SMP and VM86 code. Note that although VM86 is not a global option, the ifdef on it in a header works because only the name of the VM86 extension is hidden.
|
#
32884 |
|
30-Jan-1998 |
dyson |
Make the bounce buffer code a little more robust when space isn't available. If there isn't bounce space available, the bounce code is disabled. This will allow most large systems to run properly when the bounce space is mistakenly allocated above 16MB.
|
#
32765 |
|
25-Jan-1998 |
kato |
Even though the BIOS writer's guide recommends that the cpuid instruction of the Cyrix 6x86MX CPU be enabled (the BIOS should not disable it), some BIOSes disable it via CCR4. In this case, the cpu variable becomes CPU_486 and identblue() is called. Because the Cyrix 6x86MX has MSRs but doesn't have MSR1002, the wrmsr instruction generates a general protection fault.
Tested by: Simon Coggins <chaos@ultra.net.au>
|
#
32724 |
|
24-Jan-1998 |
dyson |
Add better support for larger I/O clusters, including larger physical I/O. The support is not mature yet, and some of the underlying implementation needs help. However, support does exist for IDE devices now.
|
#
32702 |
|
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
#
32464 |
|
12-Jan-1998 |
dyson |
Adjust upwards the size of exec map in order to take into account the additional PAGE_SIZE needed for exec operation.
|
#
32010 |
|
27-Dec-1997 |
peter |
#include "opt_user_ldt.h" so that the #ifdef USER_LDT checks can work, as commented about at length in the PR audit trail.
PR: 2412
|
#
31709 |
|
14-Dec-1997 |
dyson |
After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased much more than significantly with these changes. Since a TLB flush appears to cost minimally approx 80 cycles, this is a "nice" enhancement, equiv to eliminating between 40 and 160 instructions per TLB flush.
Changes include making sure that kernel threads all use the same PTD, and eliminating unneeded PTD switches at context switch time.
|
#
31544 |
|
04-Dec-1997 |
jmg |
document and make the NO_F00F_HACK a proper option...
also, sort some option includes while I'm here..
Forgotten by: sef
|
#
31535 |
|
04-Dec-1997 |
jkh |
After consultation with David, change #ifndef NO_F00F_HACK to #if defined(I586_CPU) && !defined(NO_F00F_HACK)
|
#
31515 |
|
03-Dec-1997 |
sef |
Make has_f00f_bug extern, and get rid of some unused code in the f00f code.
Submitted by: Mikael Karpberg & Cy Schubert
|
#
31507 |
|
03-Dec-1997 |
sef |
Work around for the Intel Pentium F00F bug; this is Intel's recommended workaround. Note that this currently eats up two pages extra in the system; this could be alleviated by aligning idt correctly, and then only dealing with that (as opposed to the current method of allocating two pages and copying the IDT table to that, and then setting that to be the IDT table).
|
#
31397 |
|
24-Nov-1997 |
bde |
Fixed multiple definitions of boothowto.
|
#
31337 |
|
21-Nov-1997 |
bde |
Fixed setting of `safepri'. It should be SWI_AST_MASK most of the time, but was left at 0. This caused the "can't happen" case in splz_swi to happen for panics when tsleep() calls splx(safepri) and there is a SWI_AST pending. This was harmless because the error handling happens to be right. Debugging this was tricky because debugger traps force SWI_AST_MASK on in `cpl'.
|
#
31321 |
|
20-Nov-1997 |
bde |
Moved some extern declarations to header files (unused ones to /dev/null).
|
#
31017 |
|
07-Nov-1997 |
phk |
Rename some local variables to avoid shadowing other local variables.
Found by: -Wshadow
|
#
30994 |
|
06-Nov-1997 |
phk |
Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead.
This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources.
I have not removed the /*ARGSUSED*/, they will require some looking at.
libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.
|
#
30354 |
|
12-Oct-1997 |
phk |
Last major round (unless Bruce thinks of something :-) of malloc changes.
Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them.
A couple of finer points by: bde
|
#
30275 |
|
10-Oct-1997 |
peter |
Compensate for pcb.h tweaks.
(Bruce pointed out the nesting)
|
#
30265 |
|
10-Oct-1997 |
peter |
Convert the VM86 option from a global option to an option only depended on by the files that use it. Changing the VM86 option now only causes a recompile of a dozen files or so rather than the entire kernel.
|
#
29851 |
|
25-Sep-1997 |
dg |
Fix a bug where the speculative memory probe wouldn't occur on systems that report slightly more than 64MB of total memory. This can happen due to the total being the sum of both base and extended memory.
Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
29675 |
|
21-Sep-1997 |
gibbs |
autoconf.c: Add cpu_rootconf and cpu_dumpconf so that configuring these two devices can be better controlled by the MI configuration code.
machdep.c: MD initialization code for the new callout interface.
trap.c: Add support for printing out whether cam interrupts are masked during a panic.
|
#
29663 |
|
21-Sep-1997 |
peter |
Implement the parts needed for VM86 under SMP.
|
#
29110 |
|
04-Sep-1997 |
dg |
Cosmetic change to last commit: speculative_mtest -> speculative_mprobe.
|
#
29109 |
|
04-Sep-1997 |
dg |
Changed the memory sizing code so that if the following conditions are met:
1) The BIOS indicates that there is exactly 64MB of RAM, and
2) The memory size isn't specified with the MAXMEM option or the npx0 msize hack,
...then do a speculative memory probe beyond the 64MB's until the first bad page is encountered. This is an admitted hack, but should nonetheless deal with detecting the correct amount of memory in nearly all of the modern systems with >64MB of RAM. Also made a change that will cause the list of detected memory chunks to be printed if bootverbose is set.
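
The probe described above can be sketched roughly as follows. This is only an illustration of the idea, not the kernel code: the names `ex_should_probe`, `ex_probe_past_64mb`, `ex_page_ok_fn` and the 4096-byte page size are assumptions made for the example, and a simulated 96MB machine stands in for real hardware.

```c
#include <assert.h>
#include <stddef.h>

#define EX_PAGE_SIZE	4096UL		/* assumed i386 page size */
#define EX_64MB_PAGES	(64UL * 1024 * 1024 / EX_PAGE_SIZE)

/* Hypothetical callback: nonzero if the page with this index tests good. */
typedef int (*ex_page_ok_fn)(unsigned long page_index);

/* Probe only if the BIOS said exactly 64MB and nothing overrode the size. */
int
ex_should_probe(unsigned long bios_pages, int size_overridden)
{
	return (bios_pages == EX_64MB_PAGES && !size_overridden);
}

/*
 * Walk upward from the 64MB boundary and stop at the first bad page or at
 * limit_pages, whichever comes first. Returns the index of the page where
 * the walk stopped, which is also the number of usable pages.
 */
unsigned long
ex_probe_past_64mb(unsigned long limit_pages, ex_page_ok_fn page_ok)
{
	unsigned long page;

	assert(page_ok != NULL);
	for (page = EX_64MB_PAGES; page < limit_pages; page++) {
		if (!page_ok(page))
			break;
	}
	return (page);
}

/* Simulated machine with 96MB of working RAM, for illustration only. */
int
ex_fake_96mb_page_ok(unsigned long page_index)
{
	return (page_index < 96UL * 1024 * 1024 / EX_PAGE_SIZE);
}
```

Passing the page test as a callback keeps the sketch testable in user space; real code would write and read back test patterns on each page instead.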
|
#
29041 |
|
02-Sep-1997 |
bde |
Removed unused #includes.
|
#
28984 |
|
31-Aug-1997 |
bde |
Move closer to supporting VM86 under SMP.
LINT now compiles but doesn't link. Other link-time breakage for LINT is now visible (SMP is incompatible with SIMPLELOCK_DEBUG).
Submitted by: jlemon
|
#
28976 |
|
31-Aug-1997 |
bde |
Fixed options SHOW_BUSYBUFS and PANIC_REBOOT_WAIT_TIME which were broken by incomplete cutting and pasting from machdep.c to kern_shutdown.c.
PR: 3953
|
#
28808 |
|
26-Aug-1997 |
peter |
Clean up the SMP AP bootstrap and eliminate the wretched idle procs.
- We now have enough per-cpu idle context, the real idle loop has been revived (cpu's halt now with nothing to do).
- Some preliminary support for running some operations outside the global lock (eg: zeroing "free but not yet zeroed pages") is present but appears to cause problems. Off by default.
- the smp_active sysctl now behaves differently. It's merely a 'true/false' option. Setting smp_active to zero causes the AP's to halt in the idle loop and stop scheduling processes.
- bootstrap is a lot safer. Instead of sharing a statically compiled in stack a number of times (which has caused lots of problems) and then abandoning it, we use the idle context to boot the AP's directly. This should help >2 cpu support since the bootlock stuff was in doubt.
- print physical apic id in traps.. helps identify private pages getting out of sync. (You don't want to know how much hair I tore out with this!)
More cleanup to follow, this is more of a checkpoint than a 'finished' thing.
|
#
28496 |
|
21-Aug-1997 |
charnier |
Revert my previous commit about using CS_SECURE macro. Requested by: Bruce.
|
#
28359 |
|
18-Aug-1997 |
charnier |
Use CS_SECURE macro. Reviewed by: John Dyson
|
#
27993 |
|
08-Aug-1997 |
dyson |
VM86 kernel support. Work done by BSDI, Jonathan Lemon <jlemon@americantv.com>, Mike Smith <msmith@gsoft.com.au>, Sean Eric Fagan <sef@kithrup.com>, and probably a lot of others.
Submitted by: Jonathan Lemon <jlemon@americantv.com>
|
#
27899 |
|
04-Aug-1997 |
dyson |
Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.
|
#
27535 |
|
20-Jul-1997 |
bde |
Removed unused #includes.
|
#
26994 |
|
27-Jun-1997 |
fsmp |
Removed '#include <machine/smptests.h>' line, no longer needed.
|
#
26812 |
|
22-Jun-1997 |
peter |
Preliminary support for per-cpu data pages.
This eliminates a lot of #ifdef SMP type code. Things like _curproc reside in a data page that is unique on each cpu, eliminating the expensive macros like:
#define curproc (SMPcurproc[cpunumber()])
There are some unresolved bootstrap and address space sharing issues at present, but Steve is waiting on this for other work. There is still some strictly temporary code present that isn't exactly pretty.
This is part of a larger change that has run into some bumps, this part is standalone so it should be safe. The temporary code goes away when the full idle cpu support is finished.
Reviewed by: fsmp, dyson
|
#
26811 |
|
22-Jun-1997 |
peter |
Kill some stale leftovers from the earlier attempts at SMP per-cpu pages
|
#
26659 |
|
15-Jun-1997 |
wollman |
Fix another power down braino.
|
#
26657 |
|
15-Jun-1997 |
wollman |
When APM is configured, turn off the power when halting for good.
|
#
26494 |
|
07-Jun-1997 |
bde |
Preserve %fs and %gs across context switches. This has a relatively low cost since it is only done in cpu_switch(), not for every exception. The extra state is kept in the pcb, and handled much like the npx state, with similar deficiencies (the state is not preserved across signal handlers, and error handling loses state).
|
#
26373 |
|
02-Jun-1997 |
dfr |
Move interrupt handling code from isa.c to a new file. This should make isa.c (slightly) more portable and will make my life developing the really portable version much easier.
Reviewed by: peter, fsmp
|
#
26171 |
|
26-May-1997 |
fsmp |
Fix breakage from my last commit where mp_start() was missing from UP builds.
|
#
26155 |
|
26-May-1997 |
fsmp |
Added a test called 'LATE_START'.
This is now the default, it delays most of the MP startup to the function machdep.c:cpu_startup(). It should be possible to move the 2 functions found there (mp_start() & mp_announce()) even further down the path once we know exactly where that should be...
Help from: Peter Wemm <peter@spinner.dialix.com.au>
|
#
26102 |
|
24-May-1997 |
fsmp |
Delay mp_start() till after the msgbuf is mapped. We really want to delay it till even later but tss setup prevents that right now...
|
#
25985 |
|
21-May-1997 |
jdp |
This commit affects ELF kernels only.
Remove "setdefs.h" and arrange to generate it automatically at ELF kernel build time.
"gensetdefs.c" is a utility which scans a set of ELF object files and outputs a line ``DEFINE_SET(name, length);'' for each linker set that it finds. When generating an ELF kernel, this is run just before the final link to generate "setdefs.h".
Remove the init_sets() function from "setdef0.c", and its call from "machdep.c". Since "gensetdefs.c" calculates the length of each set, it is no longer necessary in an ELF kernel to count the set elements at kernel initialization time. Also remove "set_of_sets" which was used for this purpose.
Link "setdef0" and "setdef1" into the kernel only if building for ELF. Since init_sets() is no longer used, there is no need to link them into an a.out kernel.
|
#
25711 |
|
11-May-1997 |
bde |
Fixed initialization of ldt[]. Unused entries were garbage. A comment was stale.
Fixed initialization of gdt[] for the BDE_DEBUGGER case. APM entries clobbered debugger entries if the debugger was loaded (APM is incompatible with BDE_DEBUGGER) and unused entries were garbage if the debugger wasn't loaded.
|
#
25556 |
|
07-May-1997 |
peter |
md_regs is struct trapframe * now, rather than int []. Remove the TF_REGP() macro and its uses. The original reason (address space problems due to having UPAGES mapped into user space) is gone. It looks cleaner without it.
|
#
25164 |
|
26-Apr-1997 |
peter |
Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to come over the next few days.
The kernel should run pretty much "as before" without the options to activate SMP mode.
There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment.
This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
|
#
25083 |
|
22-Apr-1997 |
jdp |
Make the necessary changes so that an ELF kernel can be built. I have successfully built, booted, and run a number of different ELF kernel configurations, including GENERIC. LINT also builds and links cleanly, though I have not tried to boot it.
The impact on developers is virtually nil, except for two things. All linker sets that might possibly be present in the kernel must be listed in "sys/i386/i386/setdefs.h". And all C symbols that are also referenced from assembly language code must be listed in "sys/i386/include/asnames.h". It so happens that failure to do these things will have no impact on the a.out kernel. But it will break the build of the ELF kernel.
The ELF bootloader works, but it is not ready to commit quite yet.
|
#
24852 |
|
13-Apr-1997 |
dyson |
Decrease the amount of memory allocated for bouncing. This will allow large systems to boot successfully with bounce buffers compiled in. We are now limiting bounce space to 512K. The 8MB allocated for a 512MB system is very bogus -- and that is now fixed.
|
#
24691 |
|
07-Apr-1997 |
peter |
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!)
Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit.
Context switch the tss_esp0 pointer in the common tss. This is a lot simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss, since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset.
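
To illustrate why updating esp0 is the simpler operation, here is a small sketch that contrasts the two. It assumes the standard x86 descriptor layout, where the base is stored as bits 15:0, 23:16 and 31:24 in separate places. The types `ex_raw_descriptor` and `ex_mini_tss` and the function names are hypothetical stand-ins, not the kernel's own structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw 8-byte x86 segment descriptor, viewed as bytes (illustrative only). */
struct ex_raw_descriptor {
	uint8_t b[8];
};

/* Stand-in TSS that holds only the ring 0 stack pointer. */
struct ex_mini_tss {
	uint32_t esp0;
};

/*
 * The base is scattered over three fields: bits 15:0 in bytes 2-3,
 * bits 23:16 in byte 4 and bits 31:24 in byte 7.
 */
void
ex_set_descriptor_base(struct ex_raw_descriptor *d, uint32_t base)
{
	assert(d != NULL);
	d->b[2] = (uint8_t)(base & 0xff);
	d->b[3] = (uint8_t)((base >> 8) & 0xff);
	d->b[4] = (uint8_t)((base >> 16) & 0xff);
	d->b[7] = (uint8_t)((base >> 24) & 0xff);
}

/* Reassemble the base from its three fields. */
uint32_t
ex_get_descriptor_base(const struct ex_raw_descriptor *d)
{
	assert(d != NULL);
	return ((uint32_t)d->b[2] | ((uint32_t)d->b[3] << 8) |
	    ((uint32_t)d->b[4] << 16) | ((uint32_t)d->b[7] << 24));
}

/* By contrast, switching the kernel stack is a single 32-bit store. */
void
ex_set_kernel_stack(struct ex_mini_tss *tss, uint32_t esp0)
{
	assert(tss != NULL);
	tss->esp0 = esp0;
}
```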
The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS).
Simplify the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measurably speed up fork().
The following parts came from John Dyson:
Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out.
Move the upages object (upobj) from the vmspace to the proc structure.
Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
|
#
24690 |
|
07-Apr-1997 |
peter |
No longer use an i386tss as the basis of our pcb - it wasn't particularly convenient and makes life difficult for my next commit. We still need an i386tss to point to for the tss slot in the gdt, so we use a common tss shared between all processes.
Note that this is going to break debugging until this series of commits is finished. core dumps will change again too. :-( we really need a more modern core dump format that doesn't depend on the pcb/upages.
This change makes VM86 mode harder, but the following commits will remove a lot of constraints for the VM86 system, including the possibility of extending the pcb for an IO port map etc.
Obtained from: bde
|
#
24437 |
|
31-Mar-1997 |
dg |
Changed the way that the exec image header is read to be filesystem-centric rather than VM-centric to fix a problem with errors not being detectable when the header is read. Killed exech_map as a result of these changes. There appears to be no performance difference with this change.
|
#
24342 |
|
28-Mar-1997 |
joerg |
Something long overdue: compile inb() and outb() into the kernel as functions if DDB is available. The remaining occurrences are usually only inlined and thus not available in DDB.
I'm sure Bruce will have 23 additions to these 30 lines of code, but at least it's a starting point. ;-)
|
#
24283 |
|
25-Mar-1997 |
mpp |
Change sigreturn() to return EFAULT if it is passed an address outside of the process's address space. Now it matches its man page :-). Closes PR# 2682.
Discussed with: bde
Submitted by: Jonathan Lemon <jlemon@americantv.com>
|
#
24203 |
|
24-Mar-1997 |
bde |
Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include it when it is not used. In most cases, the reasons for including it went away when the special ioctl headers became self-sufficient.
|
#
24112 |
|
22-Mar-1997 |
kato |
Improved CPU identification and initialization routines. This supports all Cyrix CPUs, the IBM Blue Lightning CPU and the NexGen (now AMD) Nx586 CPU, and initializes special registers of Cyrix CPUs and the msr of the IBM Blue Lightning CPU.
If revision of Cyrix 6x86 CPU < 2.7, CPU cache is enabled in write-through mode. This can be disabled by kernel configuration options.
Reviewed by: Bruce Evans <bde@freebsd.org> and Jordan K. Hubbard <jkh@freebsd.org>
|
#
23070 |
|
23-Feb-1997 |
alex |
Typo police.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21975 |
|
24-Jan-1997 |
bde |
Initialize CR0_MP in setregs() in case npx0 is disabled or not configured. Disabling npx0 works right now.
Don't reference `npxdriver' if npx0 is not configured. Not configuring npx0 doesn't quite work yet.
Don't clear potential non-npx pcb flags in setregs().
|
#
21737 |
|
15-Jan-1997 |
dg |
Fix bug related to map entry allocations where a sleep might be attempted when allocating memory for network buffers at interrupt time. This is due to inadequate checking for the new mcl_map. Fixed by merging mb_map and mcl_map into a single mb_map.
Reviewed by: wollman
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
20998 |
|
29-Dec-1996 |
dyson |
Superficial clean-up of useracc calls. (The useracc usage of B_READ/B_WRITE is bogus anyway.) Might as well make the call prettier anyway.
|
#
20641 |
|
18-Dec-1996 |
bde |
Moved the printing of the BIOS geometries from cpu_startup() into configure() where it always belonged. It was originally slightly misplaced after configure(). Rev.138 left it completely misplaced before the DEVFS, DRIVERS and CONFIGURE sysinits by not moving it together with configure().
Restored the printing of bootinfo.bi_n_bios_used now that it can be nonzero.
|
#
20578 |
|
17-Dec-1996 |
dg |
Fix nbuf calculation /4 -> /8. 2.2 already has it this way.
Reviewed by: dyson
|
#
20471 |
|
14-Dec-1996 |
jkh |
Make the USERCONFIG_BOOT semantics closer to what was originally intended.
|
#
20348 |
|
12-Dec-1996 |
dg |
Fix allocation for exech_map to be 16*PAGE_SIZE rather than 32*PAGE_SIZE so that it is scaled the same as exec_map (16 concurrent exec'ers).
|
#
20313 |
|
11-Dec-1996 |
dyson |
One minor mod to set the limit of nbufs to 2048 from 1536. More important fix to exech_map, it used 32*ARG_MAX, and it should use 32*PAGE_SIZE.
|
#
20146 |
|
05-Dec-1996 |
dyson |
Clean-up of the new buffer kva allocation code. Also, there was an error in the !BOUNCE_BUFFERS case.
|
#
20070 |
|
01-Dec-1996 |
bde |
Removed all references to b_cylinder (aka b_cylin). It was evil and hasn't been used for a year or two since disksort() started sorting on b_pblkno.
|
#
20068 |
|
01-Dec-1996 |
dyson |
Fix a problem with the new buffer_map management code. Additionally, decrease the size of buffer_map to approx 2/3 of what it used to be (buffer_map can be smaller now.) The original commit of these changes increased the size of buffer_map to the point where the system would not boot on large systems -- now large systems with large caches will have even fewer problems than before.
|
#
20017 |
|
29-Nov-1996 |
bde |
Don't print bootinfo.bi_n_bios_used in cpu_startup() since it is always zero because no drivers have had a chance to change it.
|
#
20016 |
|
29-Nov-1996 |
bde |
Don't clobber the SIGCONT bit in the signal mask in sigreturn(). Use the `sigcantmask' macro to get the correct set of unmaskable signals.
Found by: NIST-PCTS.
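
A minimal sketch of the masking fix, assuming the macros mirror the classic BSD definitions (a 32-bit mask with one bit per signal, and an unmaskable set of just SIGKILL and SIGSTOP). The `ex_` names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <signal.h>

/* Assumed to mirror the traditional BSD macros; illustrative only. */
#define	ex_sigmask(n)	(1U << ((n) - 1))
#define	ex_sigcantmask	(ex_sigmask(SIGKILL) | ex_sigmask(SIGSTOP))

/*
 * Restore a saved signal mask the way a sigreturn()-style path might:
 * strip only the signals that can never be blocked, and leave every
 * other bit, including SIGCONT, exactly as the caller saved it.
 */
unsigned int
ex_restore_signal_mask(unsigned int saved_mask)
{
	return (saved_mask & ~ex_sigcantmask);
}
```

Deriving the stripped set from one shared macro, rather than listing signals by hand at each call site, is what keeps SIGCONT from being cleared by accident.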
|
#
19828 |
|
17-Nov-1996 |
dyson |
Improve the caching of small files like directories, while not substantially increasing buffer space. Specifically, we double the number of buffers, but allocate only half the amount of memory per buffer. Note that VDIR files aren't cached unless instantiated in a buffer. This will significantly improve caching.
|
#
19653 |
|
11-Nov-1996 |
bde |
Replaced I586_OPTIMIZED_BCOPY and I586_OPTIMIZED_BZERO with boot-time negative-logic flags (flags 0x01 and 0x02 for npx0, defaulting to unset = on). This changes the default from off to on. The options have been in current for several months with no problems reported.
Added a boot-time negative-logic flag for the old I586_FAST_BCOPY option which went away too soon (flag 0x04 for npx0, defaulting to unset = on).
Added a boot-time way to set the memory size (iosiz in config, iosize in userconfig for npx0).
LINT: Removed old options. Documented npx0's flags and iosiz.
options.i386: Removed old options.
identcpu.c: Don't set the function pointers here. Setting them has to be delayed until after userconfig has had a chance to disable them and until after a good npx0 has been detected.
machdep.c: Use npx0's iosize instead of MAXMEM if it is nonzero.
support.s: Added vectors and glue code for copyin() and copyout(). Fixed ifdefs for i586_bzero(). Added ifdefs for i586_bcopy().
npx.c: Set the function pointers here. Clear hw_float when an npx exists but is too broken to use. Restored style from a year or three ago in npxattach().
|
#
19503 |
|
07-Nov-1996 |
joerg |
Fix the message buffer mapping. This actually makes it possible to increase the message buffer size in <sys/msgbuf.h>.
Reviewed by: davidg,joerg
Submitted by: bde
|
#
19274 |
|
30-Oct-1996 |
julian |
Further improved version of handling a HALT when there is no console.
|
#
19064 |
|
20-Oct-1996 |
phk |
Removing old isdn stuff.
|
#
18702 |
|
05-Oct-1996 |
jkh |
Multiple changes stacked as one commit since they all depend on one another.
First, change sysinstall and the Makefile rules to not build the kernel nlist directly into sysinstall now. Instead, spit it out as an ascii file in /stand and parse it from sysinstall later. This solves the chicken-n-egg problem of building sysinstall into the fsimage before BOOTMFS is built and can have its symbols extracted. Now we generate the symbol file in release.8.
Second, add Poul-Henning's USERCONFIG_BOOT changes. These have two effects:
1. Userconfig is always entered, rather than only after a -c (don't scream yet, it's not as bad as it sounds).
2. Userconfig reads a message string which can optionally be written just past the boot blocks. This string "preloads" the userconfig input buffer and is parsed as user input. If the first command is not "USERCONFIG", userconfig will treat this as an implied "quit" (which is why you don't need to scream - you never even know you went through userconfig and back out again if you don't specifically ask for it), otherwise it will read and execute the following commands until a "quit" is seen or the end is reached, in which case the normal userconfig command prompt will then be presented.
How to create your own startup sequences, using any boot.flp image from the next snap forward (not yet, but soon):
% dd of=/dev/rfd0 seek=1 bs=512 count=1 conv=sync <<WAKKA_WAKKA_DOO
USERCONFIG
irq ed0 10
iomem ed0 0xcc000
disable ed1
quit
WAKKA_WAKKA_DOO
Third, add an intro screen to UserConfig so that users aren't just thrown into this strange screen if userconfig is auto-launched. The default boot.flp startup sequence is now, in fact, this:
USERCONFIG
intro
visual
(Since visual never returns, we don't need a following "quit").
Submitted-By: phk & jkh
|
#
18548 |
|
28-Sep-1996 |
dyson |
Essentially rename pmap_update to be invltlb. It is a very machine dependent operation, and not really a correct name. invltlb and invlpg are more descriptive, and in the case of invlpg, a real opcode.
Additionally, fix the tlb management code for 386 machines.
|
#
18514 |
|
27-Sep-1996 |
peter |
part 2 of the bsdi compat tweak attempt. I believe that BSDI use both lcall 7,0 (ie: ldt slot 0) and lcall 0x87,0 (ldt slot 16, it's shifted three bits to the left). I was fiddling with this so long ago, I don't recall the specifics.
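
The selector arithmetic mentioned above can be checked with a short sketch. It assumes the standard x86 selector layout (bits 1:0 are the requested privilege level, bit 2 selects the LDT, and the remaining bits are the table index). The `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a 16-bit x86 segment selector. */
struct ex_selector_fields {
	unsigned int index;	/* descriptor table slot */
	unsigned int uses_ldt;	/* 1 = LDT, 0 = GDT */
	unsigned int rpl;	/* requested privilege level, 0-3 */
};

/* Split a selector into its three fields. */
struct ex_selector_fields
ex_decode_selector(uint16_t sel)
{
	struct ex_selector_fields f;

	f.index = (unsigned int)sel >> 3;
	f.uses_ldt = ((unsigned int)sel >> 2) & 1U;
	f.rpl = (unsigned int)sel & 3U;
	return (f);
}

/* Build a selector from its fields: the index is shifted left three bits. */
uint16_t
ex_make_selector(unsigned int index, unsigned int uses_ldt, unsigned int rpl)
{
	assert(index < 8192U && uses_ldt < 2U && rpl < 4U);
	return ((uint16_t)((index << 3) | (uses_ldt << 2) | rpl));
}
```

With this layout, selector 7 decodes to LDT slot 0 and selector 0x87 to LDT slot 16, both at privilege level 3, which matches the slots named in the message.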
|
#
18252 |
|
11-Sep-1996 |
phk |
Make userconfig two (default: on) options: USERCONFIG to enable, VISUAL_USERCONFIG to get the gui stuff too.
Requested by: pst
|
#
18232 |
|
10-Sep-1996 |
bde |
Removed bogus LARGMEM code and option. The code panicked when biosextmem > 65536, but biosextmem is a 16-bit quantity so it is guaranteed to be < 65536. Related cruft for biosbasemem was mostly cleaned up in rev.1.26.
|
#
18084 |
|
06-Sep-1996 |
phk |
Remove devconf, it never grew up to be of any use.
|
#
18023 |
|
03-Sep-1996 |
nate |
Cleaned up version of my 'extended BIOS' patch. This one is commented better and much simpler to understand, and works just as well (better) as a bonus.
Submitted by: bde
|
#
17983 |
|
01-Sep-1996 |
nate |
If the basemem value supplied by the bootblocks differs from the value returned by the RTC, use the bootblock supplied value. Also, map the 'stolen by BIOS' memory in the same manner as the ISA-hole memory, since it is really an extension of the BIOS. This is necessary for 32-bit BIOS functions such as APM support on laptops, and the loss of memory for non-necessary functions seems to be at most 4k.
Reviewed by: phk
Obtained from: email conversation with jtk@atria.com
|
#
17677 |
|
19-Aug-1996 |
julian |
Collect all the functions concerned with rebooting into one place. Also add the at_shutdown callout list, and change the one user of the present (broken) method (the vn driver) to use the new scheme.
|
#
17559 |
|
12-Aug-1996 |
wollman |
Back out mistaken local change that sneaked in on the last commit.
|
#
17558 |
|
12-Aug-1996 |
wollman |
Don't declare the user_ldt functions unless USER_LDT is defined. Eliminates an obnoxious warning.
|
#
17521 |
|
11-Aug-1996 |
dg |
Add support for i686 machine check trap.
|
#
17118 |
|
12-Jul-1996 |
bde |
Export `dumpmag' to utilities but not to the kernel.
Restored a truncated comment.
|
#
17014 |
|
08-Jul-1996 |
wollman |
Fix something that's been bugging me for a long time: move the CPU type identification code out of machdep.c and into a new file of its own. Hopefully other grot can be moved out of machdep.c as well (by other people) into more descriptively-named files.
|
#
16471 |
|
17-Jun-1996 |
bde |
Removed unused #includes of <i386/isa/icu.h> and <i386/isa/isa.h>. icu.h is only used by the icu support modules and by a few drivers that know too much about the icu (most only use it to convert `n' to `IRQn'). isa.h is only used by ioconf.c and by a few drivers that know too much about isa addresses (a few have to, because config is deficient).
|
#
16215 |
|
08-Jun-1996 |
bde |
Stop using the alias `pcb_ptd' for `pcb_tcc.tss_cr3'. Use the (existing) alias `pcb_cr3' instead. That is still one alias too many, but is convenient for me since I've replaced the tss in the pcb by a few scalar variables in the pcb.
|
#
15809 |
|
18-May-1996 |
dyson |
This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:
More usage of the TAILQ macros.
Additional minor fix to queue.h.
Performance enhancements to the pageout daemon.
Addition of a wait in the case that the pageout daemon has to run immediately.
Slightly modify the pageout algorithm.
Significant revamp of the pmap/fork code:
  1) PTE's and UPAGES's are NO LONGER in the process's map.
  2) PTE's and UPAGES's reside in their own objects.
  3) TOTAL elimination of recursive page table pagefaults.
  4) The page directory now resides in the PTE object.
  5) Implemented pmap_copy, thereby speeding up fork time.
  6) Changed the pv entries so that the head is a pointer and not an entire entry.
  7) Significant cleanup of pmap_protect, and pmap_remove.
  8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module.
  9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed.
Performance and code cleanups in vm_map:
  1) Improved and simplified allocation of map entries.
  2) Improved vm_map_copy code.
  3) Corrected some minor problems in the simplify code.
Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh.
Improved the speed of and simplified kmem_malloc.
Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existent condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.)
Minor reorg of source to perhaps improve locality of ref.
|
#
15722 |
|
10-May-1996 |
wollman |
Allocate mbufs from a separate submap so that NMBCLUSTERS works as expected.
|
#
15583 |
|
03-May-1996 |
phk |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
15565 |
|
02-May-1996 |
phk |
Move atdevbase out of locore.s and into machdep.c
Macroize locore.s' page table setup even more, now it's almost readable.
Rename PG_U to PG_A (so that I can...)
Rename PG_u to PG_U. "PG_u" was just too ugly...
Remove some unused vars in pmap.c
Remove PG_KR and PG_KW
Remove SSIZE
Remove SINCR
Remove BTOPKERNBASE
This concludes my spring cleaning, modulo any bug fixes for messes I have made on the way.
(Funny to be back here in pmap.c, that's where my first significant contribution to 386BSD was... :-)
|
#
15543 |
|
02-May-1996 |
phk |
removed: CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei() ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new: NPDEPG
Major macro cleanup.
|
#
15508 |
|
01-May-1996 |
bde |
Added calibration of the i8254 and the i586 clocks against the RTC at boot time. The results are currently ignored unless certain temporary options are used.
Added sysctls to support reading and writing the clock frequency variables (not the frequencies themselves). Writing is supposed to atomically adjust all related variables.
machdep.c: Fixed spelling of a function name in a comment so that I can log this message which should have been with the previous commit.
Initialize `cpu_class' earlier so that it can be used in startrtclock() instead of in calibrate_cyclecounter() (which no longer exists).
Removed range checking of `cpu'. It is always initialized to CPU_XXX so it is less likely to be out of bounds than most variables.
clock.h: Removed I586_CYCLECTR(). Use rdtsc() instead.
clock.c: TIMER_FREQ is now a variable timer_freq that defaults to the old value of TIMER_FREQ. #define'ing TIMER_FREQ should still work and may be the best way of setting the frequency.
Calibration involves counting cycles while watching the RTC for one second. This gives values correct to within (a few ppm) + (the inaccuracy of the RTC) on my systems.
|
#
15507 |
|
01-May-1996 |
bde |
i386/machdep.c include/clock.h isa/clock.c
|
#
15392 |
|
26-Apr-1996 |
phk |
A significant debogofication of locore.s. I haven't found any actual bugs, but it is a lot easier to navigate this twisted code now.
|
#
15379 |
|
25-Apr-1996 |
phk |
Fix cpu_fork for real.
Suggested by: bde
|
#
15304 |
|
19-Apr-1996 |
phk |
savectx returns through cpu_switch in case of the child, so it must return void just like cpu_switch. Fix prototype and usage from machdep.c
|
#
15065 |
|
05-Apr-1996 |
dg |
Switch 586/686 back to generic_bzero and #if 0'd the "optimized" code. It turns out that it actually reduces performance in real-world cases.
Noticed by: bde
|
#
15045 |
|
05-Apr-1996 |
ache |
Add wall_cmos_clock sysctl variable, needed to manage adjkerntz even for UTC cmos clocks (needed for Local Timezone FSes)
|
#
14825 |
|
26-Mar-1996 |
wollman |
Add support for Pentium and Pentium Pro performance counters. (This code is as yet untested; to come after man page is written.) This also adds inlines to cpufunc.h for the RDTSC, RDMSR, WRMSR, and RDPMC instructions. The user-mode interface is via a subdevice of mem.c; there is also a kernel-side interface which might be used to aid profiling.
|
#
14503 |
|
11-Mar-1996 |
hsu |
Change type of code argument to sendsig from unsigned to u_long to make it consistent w/ signalvar.h and kern_sig.c.
Reviewed by: davidg & bde
|
#
14348 |
|
02-Mar-1996 |
jkh |
USER_LDT changes for the Willows TwinXPDK toolkit. Only tested with WINE since that's the only other USER_LDT using code that I know of.
Submitted by: Gary Jennejohn <Gary.Jennejohn@munich.netsurf.de>
Obtained from: {Origin of diffs may be someone else - I only rec'd them from Gary}
|
#
14331 |
|
02-Mar-1996 |
peter |
Mega-commit for Linux emulator update.. This has been stress tested under netscape-2.0 for Linux running all the Java stuff. The scrollbars are now working, at least on my machine. (whew! :-)
I'm uncomfortable with the size of this commit, but it's too interdependent to separate out easily.
The main changes:
COMPAT_LINUX is *GONE*. Most of the code has been moved out of the i386 machine dependent section into the linux emulator itself. The int 0x80 syscall code was almost identical to the lcall 7,0 code and a minor tweak allows them to both be used with the same C code. All kernels can now just modload the lkm and it'll DTRT without having to rebuild the kernel first. Like IBCS2, you can statically compile it in with "options LINUX".
A pile of new syscalls implemented, including getdents(), llseek(), readv(), writev(), msync(), personality(). The Linux-ELF libraries want to use some of these.
linux_select() now obeys Linux semantics, ie: returns the time remaining of the timeout value rather than leaving it the original value.
Quite a few bugs removed, including incorrect arguments being used in syscalls.. eg: mixups between passing the sigset as an int, vs passing it as a pointer and doing a copyin(), missing return values, unhandled cases, SIOC* ioctls, etc.
The build for the code has changed. i386/conf/files now knows how to build linux_genassym and generate linux_assym.h on the fly.
Supporting changes elsewhere in the kernel:
The user-mode signal trampoline has moved from the U area to immediately below the top of the stack (below PS_STRINGS). This allows the different binary emulations to have their own signal trampoline code (which gets rid of the hardwired syscall 103 (sigreturn on BSD, syslog on Linux)) and so that the emulator can provide the exact "struct sigcontext *" argument to the program's signal handlers.
The sigstack's "ss_flags" now uses SS_DISABLE and SS_ONSTACK flags, which have the same values as the re-used SA_DISABLE and SA_ONSTACK which are intended for sigaction only. This enables the support of a SA_RESETHAND flag to sigaction to implement the gross SYSV and Linux SA_ONESHOT signal semantics where the signal handler is reset when it's triggered.
makesyscalls.sh no longer appends the struct sysentvec on the end of the generated init_sysent.c code. It's a lot saner to have it in a separate file rather than trying to update the structure inside the awk script. :-)
At exec time, the dozen bytes or so of signal trampoline code are copied to the top of the user's stack, rather than obtaining the trampoline code the old way by getting a clone of the parent's user area. This allows Linux and native binaries to freely exec each other without getting trampolines mixed up.
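The linux_select() change listed above (returning the time remaining instead of leaving the timeout untouched) can be sketched roughly as follows. This is only an illustration under assumptions: the helper name timeout_remaining() is hypothetical and is not taken from the emulator source, and the real code would measure the elapsed time itself.

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Hypothetical helper: given the timeout the caller passed in and the
 * time that elapsed while waiting, compute the value that Linux
 * semantics would write back, i.e. the time remaining, clamped at zero.
 */
struct timeval
timeout_remaining(struct timeval timeout, struct timeval elapsed)
{
	struct timeval left;

	/* Both inputs are expected to be normalized. */
	assert(timeout.tv_usec >= 0 && timeout.tv_usec < 1000000);
	assert(elapsed.tv_usec >= 0 && elapsed.tv_usec < 1000000);

	left.tv_sec = timeout.tv_sec - elapsed.tv_sec;
	left.tv_usec = timeout.tv_usec - elapsed.tv_usec;
	if (left.tv_usec < 0) {
		/* Borrow one second to keep tv_usec in range. */
		left.tv_sec--;
		left.tv_usec += 1000000;
	}
	if (left.tv_sec < 0) {
		/* The wait ran past the timeout: nothing remains. */
		left.tv_sec = 0;
		left.tv_usec = 0;
	}
	return (left);
}
```

A caller following BSD semantics would simply skip the write-back and leave the original timeout in place.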
|
#
14328 |
|
02-Mar-1996 |
peter |
Add more options into the conf/options and i386/conf/options.i386 files and the #include hooks so that 'make depend' is more useful. This covers most of the options I regularly use (but not all) and some other easy ones.
|
#
14081 |
|
13-Feb-1996 |
phk |
Correct & Update the printing of CPU features. We have printed rubbish since version 1.117 when Garrett made the switch to %b. Updated to reflect Intel AP-485 (241618-004).
|
#
13646 |
|
27-Jan-1996 |
bde |
Allocate DMA bounce buffers only when requested by drivers. Only the fd and wt drivers need bounce buffers, so this normally saves 32K-1K of kernel memory.
Keep track of which DMA channels are busy. isa_dmadone() must now be called when DMA has finished or been aborted.
Panic for unallocated and too-small (required) bounce buffers.
fd.c: There will be new warnings about isa_dmadone() not being called after DMA has been aborted.
sound/dmabuf.c: isa_dmadone() needs more parameters than are available, so temporarily use a new interface isa_dmadone_nobounce() to avoid having to worry about panics for fake parameters. Untested.
|
#
13580 |
|
23-Jan-1996 |
dg |
Simplified savectx() a little and fixed a bug that caused it to return garbage in the child process rather than "1" like it is supposed to.
Reviewed by: bde
|
#
13543 |
|
21-Jan-1996 |
joerg |
Initialize the cpu_class variable. This prevents i386 machines from panicking with a privileged instruction fault early at boot time.
Submitted by: rock@wurzelausix.CS.Uni-SB.DE (D. Rock)
|
#
13490 |
|
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
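To illustrate why turning the page queue flags into queue indices can simplify queue handling, here is a minimal sketch. The PQ_* names come from the message above; the numeric values, the struct layouts and the helper names are assumptions made for this example and do not reproduce the VM code.

```c
#include <assert.h>
#include <stddef.h>

/* Queue indices; names follow the commit message, values are assumed. */
enum { PQ_NONE, PQ_FREE, PQ_INACTIVE, PQ_ACTIVE, PQ_CACHE, PQ_COUNT };

struct page {
	int queue;		/* index of the queue this page is on */
	struct page *next;
};

struct pagequeue {
	struct page *head;
	int count;
};

struct pagequeue page_queues[PQ_COUNT];

/* Put a page that is on no queue onto the queue named by the index. */
void
page_enqueue(struct page *p, int queue)
{
	assert(p->queue == PQ_NONE);
	assert(queue > PQ_NONE && queue < PQ_COUNT);
	p->queue = queue;
	p->next = page_queues[queue].head;
	page_queues[queue].head = p;
	page_queues[queue].count++;
}

/*
 * Remove a page from whatever queue it is on. The index selects the
 * queue directly, so no chain of per-flag tests is needed.
 */
void
page_dequeue(struct page *p)
{
	struct pagequeue *pq;
	struct page **pp;

	if (p->queue == PQ_NONE)
		return;
	pq = &page_queues[p->queue];
	for (pp = &pq->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			pq->count--;
			break;
		}
	}
	p->next = NULL;
	p->queue = PQ_NONE;
}
```

With separate flag bits, the equivalent of page_dequeue() would have to test each flag in turn to find the right list; with an index, one array lookup is enough.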
|
#
13265 |
|
05-Jan-1996 |
wollman |
Convert BOUNCE_BUFFERS and BOUNCEPAGES to new option scheme.
|
#
13228 |
|
04-Jan-1996 |
wollman |
Convert DDB to new-style option.
|
#
13226 |
|
04-Jan-1996 |
wollman |
Convert SYSV IPC to new-style options. (I hope I got everything...) The LKMs will need an extra file, to come later.
|
#
13125 |
|
30-Dec-1995 |
dg |
In memory test, cast pointer as "volatile int *", not "int *" to make sure that gcc doesn't cache the value used in the test. Pointed out by Erich Boleyn <erich@uruk.org>.
|
#
13085 |
|
28-Dec-1995 |
dg |
Made bzero a function vector and added a 586/686 optimized version of bzero. Deprecated blkclr (removed it). Removed some old cruft from cpufunc.h.
The optimized bzero was submitted by Torbjorn Granlund <tege@matematik.su.se>. The kernel adaptation and other changes are by me.
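The "function vector" idea can be sketched as a function pointer that is pointed at a CPU-specific routine once the CPU class is known. Everything below is illustrative: the names carry a _sketch suffix or are otherwise hypothetical, and memset() merely stands in for the hand-tuned 586/686 routine, which is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU classes for the sketch. */
enum cpu_class_sketch { CLASS_386, CLASS_486, CLASS_586, CLASS_686 };

/* Portable byte-at-a-time fallback. */
void
generic_bzero_sketch(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}

/* Stand-in for a tuned routine; here it just defers to memset(). */
void
fast_bzero_sketch(void *buf, size_t len)
{
	memset(buf, 0, len);
}

/* The vector: callers always go through this pointer. */
void (*bzero_vector)(void *, size_t) = generic_bzero_sketch;

/* Called once after CPU identification to pick the implementation. */
void
select_bzero(enum cpu_class_sketch cls)
{
	if (cls == CLASS_586 || cls == CLASS_686)
		bzero_vector = fast_bzero_sketch;
	else
		bzero_vector = generic_bzero_sketch;
}
```

Switching back to the generic routine later only requires changing which function the pointer is set to; callers are unaffected.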
|
#
13004 |
|
24-Dec-1995 |
dg |
Fix typo in CPUCLASS.
|
#
13000 |
|
24-Dec-1995 |
dg |
Add Pentium Pro CPU detection and special handling. For now, all the optimizations we have for 586s also apply to 686s...this will be fine-tuned in the future as appropriate.
|
#
12977 |
|
22-Dec-1995 |
bde |
Increased the double fault stack size from 512 to PAGE_SIZE. This is wasteful, but better than clobbering the variables below the stack. About 300 bytes of variables were clobbered when I examined double faults using ddb. Perhaps a page that is known not to be accessed by the double fault handler could be used. Such pages are not easy to find, since the double fault handler calls panic() which calls sync() and possibly dumpsys().
|
#
12929 |
|
19-Dec-1995 |
dg |
Implemented a (sorely needed for years) double fault handler to catch stack overflows. It sure would be nice if there was an unmapped page between the PCB and the stack (and that the size of the stack was configurable!). With the way things are now, the PCB will get clobbered before the double fault handler gets control, making somewhat of a mess of things. Despite this, it is still fairly easy to poke around in the overflowed stack to figure out the cause.
|
#
12889 |
|
16-Dec-1995 |
peter |
Catch a couple more null devsw dereferences...
|
#
12827 |
|
14-Dec-1995 |
peter |
GENERIC/LINT: Remove redundant quoting on some option lines.
LINT: add a couple of new/missing/undocumented options.
files.i386: add linux code so that you can compile a kernel with static linux emulation ("options LINUX").
i386/*: use #if defined(COMPAT_LINUX) || defined(LINUX) to enable static support of linux emulation (just like "IBCS2" makes ibcs2 static).
The main thing this is going to make obvious is that the LINUX code (when compiled from LINT) has a lot of warnings, some of which don't look too pleasant.
|
#
12813 |
|
13-Dec-1995 |
julian |
devsw tables are now arrays of POINTERS to struct [cb]devsw. This seems to work here just fine, though I can't check every file that changed due to limited h/w; however, I've checked enough to be pretty happy with the code.
WARNING... struct lkm[mumble] has changed so it might be an idea to recompile any lkm related programs
|
#
12722 |
|
10-Dec-1995 |
phk |
Staticize and cleanup. remove a TON of #includes from machdep.
|
#
12701 |
|
09-Dec-1995 |
phk |
Move sysctl machdep.consdev to cons.c
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12623 |
|
04-Dec-1995 |
phk |
A major sweep over the sysctl stuff.
Move a lot of variables home to their own code (In good time before xmas :-)
Introduce the string description of format.
Add a couple more functions to poke into these marvels, while I try to decide what the correct interface should look like.
Next is adding vars on the fly, and sysctl looking at them too.
Removed a tiny bit of defunct and #ifdefed notused code in swapgeneric.
|
#
12533 |
|
29-Nov-1995 |
wollman |
Fix Pentium CPU rate diagnosis:
- Don't print out meaningless iCOMP numbers, those are for droids.
- Use a shorter wait to determine clock rate to avoid deficiencies in DELAY().
- Use a fixed-point representation with 8 bits of fraction to store the rate and rationalize the variable name. It would be possible to use even more fraction if it turns out to be worthwhile (I rather doubt it).
The question of source code arrangement remains unaddressed.
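The fixed-point rate representation mentioned above (8 bits of fraction) can be illustrated with a short sketch. The function names and the choice of megahertz as the unit are assumptions for this example; only the 8-bit fraction comes from the message.

```c
#include <assert.h>
#include <stdint.h>

#define RATE_FRAC_BITS	8	/* 8 bits of fraction, as in the message */

/*
 * Convert "cycles counted during usec microseconds" into a rate in MHz,
 * stored as fixed point with RATE_FRAC_BITS bits of fraction.
 */
uint32_t
rate_fixed(uint64_t cycles, uint64_t usec)
{
	assert(usec > 0);
	return ((uint32_t)((cycles << RATE_FRAC_BITS) / usec));
}

/* Whole-MHz part of a fixed-point rate. */
uint32_t
rate_whole(uint32_t rate)
{
	return (rate >> RATE_FRAC_BITS);
}

/* Fractional part of a fixed-point rate, in hundredths of a MHz. */
uint32_t
rate_hundredths(uint32_t rate)
{
	uint32_t frac = rate & ((1u << RATE_FRAC_BITS) - 1);

	return ((frac * 100) >> RATE_FRAC_BITS);
}
```

For instance, 90,500,000 cycles counted over 1,000,000 microseconds gives 90.5 MHz, stored as 90.5 * 256 = 23168; a shorter measurement interval works the same way because only the ratio matters.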
|
#
12429 |
|
20-Nov-1995 |
phk |
Mega commit for sysctl. Convert the remaining sysctl stuff to the new way of doing things. The devconf stuff is the reason for the large number of files. Cleaned up some compiler warnings while I was there.
|
#
12290 |
|
14-Nov-1995 |
phk |
Fix a couple of printfs.
|
#
12243 |
|
12-Nov-1995 |
phk |
The entire sysctl callback to read/write version. I haven't tested this as much as I'd like to, but the malloc stunt I tried for an interim for sure does worse. Now we can read and write from any kind of address-space, not only user and kernel, using callbacks. This may be over-generalization for now, but it's actually simpler.
|
#
12186 |
|
10-Nov-1995 |
phk |
convert more sysctl variables.
|
#
12078 |
|
04-Nov-1995 |
markm |
Remove the #ifdef DEVRANDOM's, as promised.
/dev/random is now a part of the kernel! You will need to make the device in /dev: sh MAKEDEV random and take a look at some test code in src/tools/test/random.
|
#
12008 |
|
02-Nov-1995 |
peter |
When the sync-on-shutdown fails to clear all buffers, this bit of code can print them out. I have seen that MFS can leave BUSY buffers, preventing a clean reboot...
|
#
11963 |
|
31-Oct-1995 |
peter |
Add a simplistic netisr register routine - I need this now for ppp-2.2.
|
#
11875 |
|
28-Oct-1995 |
markm |
Theodore Ts'o's random number generator for Linux, ported by me. This code will only be included in your kernel if you have 'options DEVRANDOM', but that will fall away in a couple of days.
Obtained from: Theodore Ts'o, Linux
|
#
11390 |
|
10-Oct-1995 |
bde |
Include <sys/sysproto.h> so that machdep.c compiles cleanly again (the prototype for sync() moved).
KNFize and otherwise clean up printing of BIOS geometries.
Add prototypes.
Continue cleaning up new init stuff.
|
#
10782 |
|
15-Sep-1995 |
dg |
1) Killed 'BSDVM_COMPAT'.
2) Killed i386pagesperpage as it is not used by anything.
3) Fixed benign miscalculations in pmap_bootstrap().
4) Moved allocation of ISA DMA memory to machdep.c.
5) Removed bogus vm_map_find()'s in pmap_init() - the entire range was already allocated by kmem_init().
6) Added some comments.
virtual_avail is still miscalculated NKPT*NBPG too large, but fixing this properly requires moving the variable initialization into locore.s. Some other day.
|
#
10666 |
|
10-Sep-1995 |
bde |
Make pcvt and syscons live in the same kernel. If both are enabled, then the first one in the config has priority. They can be switched using userconfig().
i386/i386/conf.c: Initialize the shared syscons/pcvt cdevsw entry to `nx'.
Add cdevsw registration functions.
Use devsw functions of the correct type if they exist.
i386/i386/cons.c: Add renamed syscons entry points to constab.
i386/i386/cons.h: Declare the renamed syscons entry points.
i386/i386/machdep.c: Repeat console initialization after userconfig() in case the current console has become wrong. This depends on cn functions not wiring down anything important.
sys/conf.h: Declare new functions.
i386/isa/isa.[ch]: Add a function to decide which display driver has priority. Should be done better.
i386/isa/syscons.c: Rename pccn* -> sccn*.
Initialize CRTC start address in case the previous driver has moved it.
i386/isa/syscons.c, i386/isa/pcvt/* Initialize the bogusly shared variable Crtat dynamically in case the stored value was changed by the previous driver.
Initialize cdevsw table from a template.
Don't grab the console if another display driver has priority.
i386/isa/syscons.h, i386/isa/pcvt/pcvt_hdr.h: Don't externally declare now-static cdevsw functions.
i386/isa/pcvt/pcvt_hdr.h: Set the sensitive hardware flag so that pcvt doesn't always have lower priority than syscons. This also fixes the "stupid" detection of the display after filling the display with text.
i386/isa/pcvt/pcvt_out.c: Don't be confused by the off-screen cursor offset 0xffff set by syscons.
kern/subr_xxx.c: Add enough nxio/nodev/null devsw functions of the correct type for syscons and pcvt.
|
#
10653 |
|
09-Sep-1995 |
dg |
Fixed init functions argument type - caddr_t -> void *. Fixed a couple of compiler warnings.
|
#
10616 |
|
08-Sep-1995 |
dg |
1) Really print 'real' memory - use Maxmem, not physmem.
2) Output K bytes instead of pages as this means something to more people.
3) Moved printf of avail memory to after vm_bounce_init() call so that bounce buffers are included in the figure.
4) Killed initcpu(); it's an unused vestige from the VAX.
|
#
10594 |
|
06-Sep-1995 |
wpaul |
Put back the "real memory =" printf() that vanished when the code to handle holes in memory was added.
|
#
10537 |
|
03-Sep-1995 |
julian |
devfs changes.
Changes to allow devices that don't probe (e.g. /dev/mem) to create devfs entries. This required giving 'configure' its own SYSINIT entry so we could duck in just before it with a DEVFS init and some device inits.
My devfs now looks like:
./misc
./misc/speaker
./misc/mem
./misc/kmem
./misc/null
./misc/zero
./misc/io
./misc/console
./misc/pcaudio
./misc/pcaudioctl
./disks
./disks/rfloppy
./disks/rfloppy/fd0.1440
./disks/rfloppy/fd1.1200
./disks/floppy
./disks/floppy/fd0.1440
./disks/floppy/fd1.1200
Also some slight cleanups. DEVFS needs a lot of work but I'm getting back to it.
|
#
10358 |
|
28-Aug-1995 |
julian |
Reviewed by: julian with quick glances by bruce and others
Submitted by: terry (terry lambert)
This is a composite of 3 patch sets submitted by terry. They are:
New low-level init code that supports loadable modules better.
Some cleanups in the namei code to help terry in 16-bit character support.
Some changes to the mount-root code to make it a little more modular.
NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able to test those cases.
Certainly mounting root off disk still works just fine. mfs should work but is untested (tomorrow's task).
The low level init stuff includes a total rewrite of init_main.c to make it possible for new modules to have an init phase by simply adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can be added to the kernel without editing any other files other than the 'files' file.
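The init-phase mechanism described above can be approximated with an ordered table of entries that a startup routine walks. In the kernel the table is gathered by the linker from entries scattered across object files; the sketch below uses a hand-written array instead, and all names in it (init_entry, run_init_entries, record_tag and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One init-phase entry: when to run, what to call, and its argument. */
struct init_entry {
	unsigned int order;
	void (*func)(void *);
	void *arg;
};

/* Small trace buffer so the example can show the resulting order. */
struct init_trace {
	char tags[8];
	size_t len;
};

struct init_tag {
	struct init_trace *trace;
	char tag;
};

/* Example init routine: append this subsystem's tag to the trace. */
void
record_tag(void *arg)
{
	struct init_tag *t = arg;

	assert(t->trace->len < sizeof(t->trace->tags));
	t->trace->tags[t->trace->len++] = t->tag;
}

int
init_entry_cmp(const void *a, const void *b)
{
	const struct init_entry *x = a, *y = b;

	return ((x->order > y->order) - (x->order < y->order));
}

/*
 * Sort the entries by order and call each one. Entries with equal
 * order run in an unspecified relative order, since qsort() is not
 * guaranteed to be stable.
 */
void
run_init_entries(struct init_entry *tab, size_t n)
{
	size_t i;

	qsort(tab, n, sizeof(tab[0]), init_entry_cmp);
	for (i = 0; i < n; i++)
		tab[i].func(tab[i].arg);
}
```

The point of the design is that a new module only contributes its own entry; the walker itself never needs editing.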
|
#
10126 |
|
20-Aug-1995 |
dg |
Fixed a few bugs and annoyances with boot():
1) deal with cold flag better
2) check for key input more often
3) get rid of unused variables
4) minor formatting improvements
|
#
9759 |
|
29-Jul-1995 |
bde |
Eliminate sloppy common-style declarations. There should be none left for the LINT configuration.
|
#
9744 |
|
28-Jul-1995 |
dg |
Fixed bug I introduced with the memory-size code rewrite that broke floppy DMA buffers...use avail_start not "first". Removed duplicate (and wrong) declaration of phys_avail[].
Submitted by: Bruce Evans, but fixed differently by me.
|
#
9578 |
|
19-Jul-1995 |
dg |
Rewrote memory sizing code to generally deal with holes in extended memory. This code change should allow certain Compaq machines with a 128K hole at 16MB to work.
|
#
9546 |
|
16-Jul-1995 |
phk |
Make the bootinfo structure visible from sysctl. This can be used in libdisk to guess a better bios-geometry.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although it will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #13 and #14 were done to simplify the code and improve readability and maintainability. (As were most all of these changes)
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
9345 |
|
28-Jun-1995 |
dg |
Killed redundant vnode_pager_umount() call. This is already done at FS unmount time.
|
#
9326 |
|
26-Jun-1995 |
bde |
Partially fix `sysctl machdep.console_device'. The fix will be complete when syscons stops mapping the console to minor MAXCONS. There is usually no corresponding device in /dev, and the correct device has minor 0.
cons.c: Initialize cn_tty properly, so that CPU_CONSDEV can work. Comment about too many variants of the console tty pointer.
machdep.c: Return device NODEV and not error EFAULT when there is no console device.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8748 |
|
25-May-1995 |
dg |
Made "NMBCLUSTERS" calculation dynamic and fixed bogus use of "NMBCLUSTERS" in machdep.c (it should use the global nmbclusters). Moved the calculation of nmbclusters into conf/param.c (same place where nmbclusters has always been assigned), and made the calculation include an extra amount based on "maxusers". NMBCLUSTERS can still be overridden in the kernel config file as always, but this change will make that generally unnecessary. This fixes the "bug" reports from people who have misconfigured kernels seeing the network hang when the mbuf cluster pool runs out.
Reviewed by: John Dyson
|
#
8481 |
|
12-May-1995 |
wollman |
The death of `options NODUMP'. Now the dump area can be dynamically configured (and unconfigured) on the fly. A sysctl(3) MIB variable is provided to inspect and modify the dump device setting.
|
#
8456 |
|
11-May-1995 |
rgrimes |
Fix -Wformat warnings from LINT kernel.
|
#
8427 |
|
10-May-1995 |
wollman |
Delete two debugging printfs that mistakenly crept in.
|
#
8426 |
|
10-May-1995 |
wollman |
Make networking domains drop-ins, through the magic of GNU ld. (Some day, there may even be LKMs.) Also, change the internal name of `unixdomain' to `localdomain' since AF_LOCAL is now the preferred name of this family. Declare netisr correctly and in the right place.
|
#
7994 |
|
22-Apr-1995 |
wpaul |
Tiny printf formatting change: if we have no cpu_vendor or cpu_id info, don't generate a newline. (Yeah, I'm picking nits, but that empty line I get on my 386 just looks dumb, okay? :)
|
#
7930 |
|
18-Apr-1995 |
rgrimes |
Reapply my fix for this: Output the CPU features line during the probe on a separate line; for folks with lots of features the output used to wrap and look ugly.
|
#
7908 |
|
17-Apr-1995 |
phk |
Print the BIOS geometries in a human-readable format.
|
#
7814 |
|
14-Apr-1995 |
wpaul |
Hopefully I won't get flamed for this: insert a few more #if defined(I486_CPU) and #if defined (I586_CPU) thingies into identifycpu() so that we only compile in what's actually needed for a given CPU. So far as I can tell, none of my 386 machines generate a cpu_vendor code, so I made the extra vendor and feature line conditional on I486_CPU and I586_CPU. (Otherwise we print out a blank line which looks silly.)
|
#
7792 |
|
13-Apr-1995 |
wpaul |
This is a subtle reminder to people that not everybody compiles their kernels with 'options I586_CPU.'
The declaration for pentium_mhz is hidden inside an #ifdef I586_CPU, but machdep.c refers to it whether I586_CPU is defined or not. This temporary hack puts the offending code inside an #ifdef I586_CPU as well so that a kernel without it will successfully compile.
I must emphasize the word 'temporary:' somebody needs to seriously beat on the identifycpu() function with an #ifdef stick so that I386_CPU, I486_CPU and I586_CPU will do the right things.
|
#
7780 |
|
12-Apr-1995 |
wollman |
Add a class field to devconf and most drivers. For those where it was easy, drivers were also fixed to call dev_attach() during probe rather than attach (in keeping with the new design articulated in a mail message five months ago). For a few that were really easy, correct state tracking was added as well. The `fd' driver was fixed to correctly fill in the description. The CPU identify code was fixed to attach a `cpu' device. The code was also massively reordered to fill in cpu_model with something remotely resembling what identifycpu() prints out. A few bytes saved by using %b to format the features list rather than lots of ifs.
|
#
7644 |
|
06-Apr-1995 |
rgrimes |
Output the CPU features line during the probe on a separate line; for folks with lots of features the output used to wrap and look ugly.
Reviewed by: phk
|
#
7103 |
|
17-Mar-1995 |
dg |
Call dev_shutdownall() just after unmounting filesystems.
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
6949 |
|
07-Mar-1995 |
dg |
Increased number of buffers to 1/12 of (page_count - 1024). This makes the cache minimum closer to 10% in the usual case.
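The sizing rule in this message is simple arithmetic, sketched below. The function name and the low_limit parameter are assumptions added for illustration (the message only gives the 1/12 and 1024 figures); the actual lower bound used by the kernel is not stated here.

```c
#include <assert.h>

/*
 * Number of buffers as 1/12 of (page_count - 1024), never dropping
 * below a caller-supplied low limit (hypothetical parameter).
 */
long
buffer_count(long page_count, long low_limit)
{
	long n = 0;

	assert(page_count >= 0 && low_limit >= 0);
	if (page_count > 1024)
		n = (page_count - 1024) / 12;
	if (n < low_limit)
		n = low_limit;
	return (n);
}
```

With 4096 pages this yields (4096 - 1024) / 12 = 256 buffers; with 2048 pages it yields 85, and machines with 1024 pages or fewer fall back to the low limit.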
|
#
6846 |
|
02-Mar-1995 |
dg |
Use copyout to install the sigframe rather than directly writing to the user's stack.
|
#
6439 |
|
15-Feb-1995 |
dg |
Use proc0's proc struct rather than curproc's when calling sync.
|
#
6380 |
|
14-Feb-1995 |
sos |
First attempt to run linux binaries. This is only the changes needed to the generic kernel. The actual emulator is a separate LKM. (not finished yet, sorry).
Submitted by: sos@freebsd.org & sef@kithrup.com
|
#
6327 |
|
12-Feb-1995 |
dg |
Carefully choose the low limit for number of buffers to achieve the best performance on small memory machines.
|
#
6308 |
|
11-Feb-1995 |
phk |
Intel's App Note AP-485 applied. We will now tell a good deal more about the CPU if Intel made it.
What is a i486DX2 Write-Back Enhanced CPU ?
|
#
6301 |
|
10-Feb-1995 |
dg |
Changed extended memory test so that it's non-destructive and not a complete test (it never was "complete", which is why it was bogus). Now only a single longword is checked in each page.
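A non-destructive probe that touches a single word per page might look like the sketch below. It is an illustration only: the page size, the test patterns and the function names are assumptions, and the pointer is volatile-qualified so that the compiler cannot satisfy the read-back from a cached value instead of from memory.

```c
#include <assert.h>
#include <stddef.h>

#define PROBE_PAGE_SIZE	4096	/* assumed page size */
#define WORDS_PER_PAGE	(PROBE_PAGE_SIZE / sizeof(unsigned int))

/*
 * Test one word without destroying it: save the old value, write and
 * read back two patterns, then restore the old value.
 */
int
probe_word(volatile unsigned int *p)
{
	unsigned int saved = *p;
	int ok;

	*p = 0xaaaaaaaaU;
	ok = (*p == 0xaaaaaaaaU);
	*p = 0x55555555U;
	ok = ok && (*p == 0x55555555U);
	*p = saved;
	return (ok);
}

/*
 * Probe the first word of each page and return the number of
 * consecutive pages that responded, stopping at the first failure.
 */
size_t
probe_pages(volatile unsigned int *base, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		if (!probe_word(base + i * WORDS_PER_PAGE))
			break;
	}
	return (i);
}
```

Checking one word per page is enough to find where memory ends, which is the goal of a sizing pass; it makes no claim to verify that the rest of each page is good.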
|
#
6299 |
|
10-Feb-1995 |
dg |
Removed obsolete and unused vmtime() function.
|
#
5999 |
|
28-Jan-1995 |
ats |
Correct a name of one structure member in the sigaltstack structure. Now it matches the man page and also the only other commercial implementation I have found so far (Solaris 2.x). Changed the name from ss_base to ss_sp.
|
#
5908 |
|
25-Jan-1995 |
bde |
Load the kernel symbol table in the boot loader and not at compile time. (Boot with the -D flag if you want symbols.)
Make it easier to extend `struct bootinfo' without losing either forwards or backwards compatibility.
ddb_aout.c: Get the symbol table from wherever the loader put it. Nuke db_symtab[SYMTAB_SPACE].
boot.c: Enable loading of symbols. Align them on a page boundary. Add printfs about the symbol table sizes. Pass the memory sizes to the kernel. Fix initialization of `unit' (it got moved out of the loop). Fix adding the bss size (it got moved inside an ifdef). Initialize serial port when RB_SERIAL is toggled on. Fix comments. Clean up formatting of recently added code.
io.c: Clean up formatting of recently added code.
netboot/main.c, machdep.c, wd.c: Change names of bootinfo fields.
LINT: Nuke SYMTAB_SPACE. Fix comment about DODUMP.
Makefile.i386: Nuke use of dbsym. Exclude gcc symbols from kernel unless compiling with -g. Remove unused macro. Fix comments and formatting.
genassym.c: Generate defines for some new bootinfo fields. Change names of old ones.
locore.s: Copy only the valid part of the `struct bootinfo' passed by the loader. Reserve space for symbol table, if any.
machdep.c: Check the memory sizes passed by the loader, if any. Don't use them yet.
bootinfo.h: Add a size field so that we can resolve some mismatches between the loader bootinfo and the kernel boot info. The version number is not so good for this because of historical botches and because it's harder to maintain. Add memory size and symbol table fields. Change the names of everything.
Hacks to save a few bytes:
asm.S, boot.c, boot2.S: Replace `ouraddr' by `(BOOTSEG << 4)'.
boot.c: Don't statically initialize `loadflags' to 0. Disable the "REDUNDANT" code that skips the BIOS variables. Eliminate `total'. Combine some more printfs.
boot.h, disk.c, io.c, table.c: Move all statically initialized data to table.c.
io.c: Don't put the A20 gate bits in a variable.
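The size field added to `struct bootinfo' lets the kernel copy only the part of the structure that the loader actually filled in. A rough sketch of that idea follows; the structure layout, field names and function name are hypothetical and much reduced compared with the real bootinfo.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced layout; field names are illustrative only. */
struct bootinfo_sketch {
	unsigned int size;	/* bytes the loader filled in */
	unsigned int version;
	unsigned int basemem;
	unsigned int extmem;
	unsigned int symtab;	/* newer field; older loaders omit it */
};

/*
 * Copy only the valid part of what the loader passed. Fields the
 * loader did not know about stay zero; anything beyond what the
 * kernel knows about is ignored. Returns the number of bytes copied.
 */
size_t
bootinfo_import(struct bootinfo_sketch *dst, const void *src, size_t claimed)
{
	size_t n = claimed < sizeof(*dst) ? claimed : sizeof(*dst);

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, n);
	return (n);
}
```

Taking the smaller of the two sizes is what gives compatibility in both directions: an old loader with a new kernel leaves the new fields zeroed, and a new loader with an old kernel has its extra fields dropped.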
|
#
5837 |
|
24-Jan-1995 |
dg |
Changed buffer allocation policy (machdep.c).
Moved various pmap 'bit' test/set functions back into real functions; gcc generates better code at the expense of more of it (pmap.c).
Fixed a deadlock problem with pv entry allocations (pmap.c).
Added a new, optional function 'pmap_prefault' that does clustered page table preloading (pmap.c).
Changed the way that page tables are held onto (trap.c).
Submitted by: John Dyson
|
#
5675 |
|
16-Jan-1995 |
bde |
The %eflags checking introduced in the previous commit was too zealous. sigreturn() sometimes failed for ordinary returns from signal handlers. Failures of ordinary returns "can't happen" and are badly handled. "Temporary" fix: allow users to corrupt PSL_RF. This is fairly harmless. A correct fix would involve saving the old %eflags (and perhaps the old segment registers) where the user can't get at them.
|
#
5603 |
|
14-Jan-1995 |
bde |
Fix security holes in sigreturn(), ptrace() and procfs. sigreturn() attempted to check for insecure and fatal eflags and segment selectors, but missed many cases and got the IOPL check back to front. The other syscalls didn't check at all.
sys_process.c, machdep.c: Only allow PT_WRITE_U to write to the registers (ordinary and FP).
psl.h, locore.s, machdep.c: Eliminate PSL_MBZ, PSL_MBO and PSL_USERCLR. We are not supposed to assume anything about the reserved bits. Use PSL_USERCHANGE and PSL_KERNEL instead. Rename PSL_USERSET to PSL_USER.
exception.s: Define a private label for use by doreti when returning to user mode fails.
machdep.c: In syscalls, allow changing only the eflags that can be changed on 486's in user mode (no longer attempt to allow benign IOPL changes; allow changing the nasty PSL_NT; don't allow changing the i586 bits).
Don't attempt to check all the cases involving invalid selectors and %eip's. Just check for privilege violations and let the invalid things cause a trap.
procfs_machdep.c: Call the ptrace register functions to do all the work for reading and writing ordinary registers and for single stepping.
trap.c: Ignore traps caused by PSL_NT being set. Previously, users could cause a fatal trap in user mode by setting PSL_NT and executing an iret, and a fatal trap in kernel mode by setting PSL_NT and making a syscall. PSL_NT was cleared too late and not in enough modes to fix the problem.
Make all traps in user mode (except T_NMI) nonfatal.
Recover from traps caused by attempting to load invalid user registers in doreti by restarting the traps so that they appear to occur in user mode. ---
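The eflags rule described above (only the user-changeable bits may differ between the saved and the requested value) can be sketched as follows. This is a minimal illustration, not the committed code: the bit values are the architectural x86 EFLAGS positions, while the exact composition of `PSL_USERCHANGE` and the helper name `eflags_secure` are assumptions made for the example.

```c
#include <stdint.h>

/* Architectural x86 EFLAGS bit positions; names follow psl.h. */
#define PSL_C    0x00000001u  /* carry */
#define PSL_PF   0x00000004u  /* parity */
#define PSL_AF   0x00000010u  /* auxiliary carry */
#define PSL_Z    0x00000040u  /* zero */
#define PSL_N    0x00000080u  /* sign */
#define PSL_T    0x00000100u  /* trace */
#define PSL_I    0x00000200u  /* interrupt enable */
#define PSL_D    0x00000400u  /* direction */
#define PSL_V    0x00000800u  /* overflow */
#define PSL_IOPL 0x00003000u  /* I/O privilege level */
#define PSL_NT   0x00004000u  /* nested task */
#define PSL_AC   0x00040000u  /* alignment check (486 and later) */

/* Assumed mask: the bits a 486 lets user mode change by itself. */
#define PSL_USERCHANGE (PSL_C | PSL_PF | PSL_AF | PSL_Z | PSL_N | \
                        PSL_T | PSL_D | PSL_V | PSL_NT | PSL_AC)

/*
 * Hypothetical helper: return nonzero if going from oldef to newef
 * touches only user-changeable bits.
 */
static int
eflags_secure(uint32_t newef, uint32_t oldef)
{
    return (((newef ^ oldef) & ~PSL_USERCHANGE) == 0);
}
```

Comparing against the old value rather than against fixed "must be zero" and "must be one" masks is what lets the reserved bits stay unexamined, as the log entry asks.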
Fix bogons that I noticed while fixing the above:
psl.h: Fix some comments.
Uniformize idempotency ifdef.
exception.s, machdep.c: Remove rsvd[0-14]. rsvd0 hasn't been reserved since the 486 came out. Replace rsvd0 by `align'. rsvd[0-11] used wrong (magic non-unique) trap numbers. Replace rsvd[1-14] by rsvd.
locore.s: Enable alignment check flag on 486's and 586's.
machdep.c: Use a better type for kstack[].
Use TFREGP() to find the registers.
Reformat ptrace functions from SEF to something closer to KNF.
procfs_machdep.c: The wrong pointer to the registers got fixed as a side effect.
Implement reading and writing of FP registers.
/proc/*/*regs now work (only) for processes that are in memory.
Clean up comments.
trap.c, trap.h: Remove unused trap types.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
5413 |
|
05-Jan-1995 |
se |
Submitted by: Wolfgang Stanglmeier <wolf@dentaro.GUN.de> Reviewed by: <wollman> First hooks and defines for the ISDN driver, that soon will see the light ...
|
#
5037 |
|
11-Dec-1994 |
dg |
Removed inappropriate comment.
|
#
5036 |
|
11-Dec-1994 |
dg |
Add additional comment.
|
#
5035 |
|
11-Dec-1994 |
dg |
Fix bogus comment.
|
#
4829 |
|
26-Nov-1994 |
phk |
I made a syntax error yesterday. Submitted by: John Capo
|
#
4819 |
|
26-Nov-1994 |
phk |
Set the bootverbose if so desired. if (bootverbose) Print the geometries the bios passes to us (through the bootblocks).
|
#
4517 |
|
15-Nov-1994 |
dg |
Allow MAXMEM to be larger than the detected physical memory. This change was supposed to have already been made, but got botched somewhere. Don't clobber the last page of memory (where the message buffer is). Some BIOS don't gratuitously wipe it out on reboot.
|
#
4501 |
|
15-Nov-1994 |
bde |
Make gdt_segs[] public again for APM.
Make ldt[] public again and restore currentldt and _default_ldt for USER_LDT.
|
#
4476 |
|
14-Nov-1994 |
bde |
Oops, the previous commit got the diff for the log message instead of the following.
Move declarations to and from <machine/segments.h>. Make segment stuff static if possible.
Remove unused (although initialized) global variables _default_ldt, currentldt, _gsel_tss (rename the latter to the auto variable gtel_tss).
Use "correct" and consistent types for interrupt handlers.
Remove a mailing address from the code.
Fix type mismatches found by adding prototypes.
|
#
4475 |
|
14-Nov-1994 |
bde |
(Bogus several hundred line diff for a log message deleted. See rev 1.91 for the intended log message. -DG)
|
#
4463 |
|
14-Nov-1994 |
bde |
Undo a previous change. <sys/disklabel.h> was broken, not these files.
|
#
4221 |
|
07-Nov-1994 |
phk |
Added a kernel variable, "dodump" defaulting to zero, which disables dumps. Somebody should make a mib variable for it. Just now it is pointless to dump the kernel, since we have nothing which can read the dump. Furthermore it should never be the default to dump. options DODUMP will enable dumps.
|
#
4201 |
|
06-Nov-1994 |
dg |
Do a better job at preparing registers for the new process in setregs() by setting them all to a known state.
|
#
4193 |
|
05-Nov-1994 |
bde |
Nuke the losing version of microtime. The assembler version now works for all reasonable HZ's. HZ > 1000 doesn't work because of sloppy conversions in hzto() (division by (tick / 1000) == 0). This was fixed in 1.1.5.
Eliminate some extern declarations by including the appropriate header files that now contain appropriate declarations.
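The HZ > 1000 failure mentioned above comes down to integer division: `tick` is the number of microseconds per clock tick, so `tick / 1000` truncates to zero once HZ exceeds 1000. A small sketch of the arithmetic follows; the function names are hypothetical and the guarded conversion is one possible fix, not necessarily the one used in 1.1.5.

```c
/* Microseconds per clock tick for a given HZ (the BSD `tick` value). */
static int
tick_usec(int hz)
{
    return (1000000 / hz);
}

/* The divisor the log entry complains about: whole milliseconds per tick. */
static int
ms_per_tick(int hz)
{
    return (tick_usec(hz) / 1000);  /* 0 when hz > 1000 */
}

/*
 * Hypothetical safe conversion: work in microseconds and round up,
 * so the divisor is never zero for any hz up to 1000000.
 */
static long
ms_to_ticks(long ms, int hz)
{
    long usec = ms * 1000L;
    int t = tick_usec(hz);

    return ((usec + t - 1) / t);
}
```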
|
#
4116 |
|
03-Nov-1994 |
jkh |
Unconditionalize USERCONFIG. Uh, thanks, David.
|
#
4038 |
|
01-Nov-1994 |
ache |
Implement CPU_ADJKERNTZ in a different way: call resettodr() on writing this variable. adjkerntz pgm changes will follow.
|
#
3962 |
|
28-Oct-1994 |
jkh |
From: fredriks@mcs.com (Lars Fredriksen) ... It turns out that these files do not include <sys/dkbad.h> before <sys/disklabel.h>. Submitted by: fredriks
|
#
3940 |
|
27-Oct-1994 |
jkh |
Julian Elischer's disklabel fixes.
|
#
3907 |
|
26-Oct-1994 |
jkh |
Invoke userconfig() if kernel compiled with options USERCONFIG and -c flag used.
|
#
3846 |
|
25-Oct-1994 |
dg |
Allow MAXMEM kernel option to indicate more memory than is detected; it previously could only be used to limit the amount of memory.
|
#
3844 |
|
25-Oct-1994 |
dg |
Restricted maximum bufpages to 1500; this is required for machines with >64MB of memory to work without running out of kernel VM (and increasing it to even more than it is now (96MB) is out of the question). Changed bufpages calculation to allocate a little less buffer cache (16% of mem-2MB instead of 20%); this is simply a better figure for most systems.
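A rough sketch of the sizing rule described above. It assumes that "16% of mem-2MB" means 16% of the memory left after setting 2MB aside, and that the result is clamped to 1500 pages; the function and macro names are made up for illustration, and the real machdep.c calculation may differ in detail.

```c
#define PAGE_SIZE        4096UL               /* i386 page size */
#define RESERVED_BYTES   (2UL * 1024 * 1024)  /* the "2MB" in the log entry */
#define BUFCACHE_PERCENT 16UL
#define BUFPAGES_MAX     1500UL

/* Hypothetical helper: buffer cache pages for a given amount of RAM. */
static unsigned long
bufpages_for(unsigned long physmem_bytes)
{
    unsigned long pages, n;

    if (physmem_bytes <= RESERVED_BYTES)
        return (0);
    pages = (physmem_bytes - RESERVED_BYTES) / PAGE_SIZE;
    n = pages * BUFCACHE_PERCENT / 100;
    return (n > BUFPAGES_MAX ? BUFPAGES_MAX : n);
}
```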
|
#
3728 |
|
19-Oct-1994 |
phk |
Peter Dufault's comconsole changes.
Submitted by: Peter Dufault
|
#
3703 |
|
18-Oct-1994 |
wollman |
Implement disk_externalize().
|
#
3682 |
|
18-Oct-1994 |
ache |
Remove CPU_COLORDISP, GIO_COLOR now exists
|
#
3661 |
|
17-Oct-1994 |
ache |
Ifdef color_display by NSC, pointed out by Rod
|
#
3627 |
|
15-Oct-1994 |
ache |
Add CPU_COLORDISP sysctl to handle console display type
|
#
3502 |
|
10-Oct-1994 |
phk |
minaddr #ifdef lost in previous commit. Sorry.
|
#
3489 |
|
09-Oct-1994 |
phk |
locore.s: Made the APM stuff depend on NAPM > 0 rather than a separate "APM" macro. machdep.c: Made the APM-descriptors unconditional. Bruce: if these still conflict with your debugger, please put in a reservation for your debugger. These three desc. can be anywhere, as long as they are contiguous, so just move them as needed.
|
#
3451 |
|
09-Oct-1994 |
dg |
Got rid of map.h. It's a leftover from the rmap code, and we use rlists. Changed swapmap into swaplist.
|
#
3367 |
|
04-Oct-1994 |
ache |
Add code to handle CPU_DISRTCSET
|
#
3306 |
|
02-Oct-1994 |
phk |
Unused variables, except one with an ominous comment.
|
#
3284 |
|
01-Oct-1994 |
rgrimes |
1. Remove all references to cyloffset, it has been unused for some time.
2. New detection code so we know what boot code called us.
3. Remove old DISKLESS support code and halt if we are called by that boot code as it will NOT work with the new nfs_diskless structure.
This is really in preparation for new boot code and new diskless support.
Reviewed by: davidg
|
#
3258 |
|
01-Oct-1994 |
dg |
Laptop Advanced Power Management support by HOSOKAWA Tatsumi.
Submitted by: HOSOKAWA Tatsumi
|
#
3047 |
|
24-Sep-1994 |
dg |
Nuked splnet before sync. Not only is this unnecessary, but it appears to cause problems by making it impossible to sync NFS related buffers when rebooting.
|
#
2822 |
|
16-Sep-1994 |
phk |
Made the kernel compile even without "ether".
|
#
2818 |
|
15-Sep-1994 |
ache |
CPU_ADJKERNTZ added to cpu_sysctl
|
#
2789 |
|
15-Sep-1994 |
dg |
Brought over from 1.1.5:
Fix from Bruce Evans. There were missing sets of parentheses:
1. The checks for the standard data selectors were botched, so %ss == 0 and probably %cs == 0 were allowed. A fix is enclosed. The checks for the standard selectors could be omitted without losing anything since the standard selectors pass the valid_ldt_sel() tests.
|
#
2772 |
|
14-Sep-1994 |
wollman |
Beginnings of support for loadable protocol domains. In particular, don't hard-code netisr values in icu.s, but rather, use an array of function pointers and set them all up in machdep.c for statically-linked protocol families. (This will eventually be done differently.)
|
#
2495 |
|
04-Sep-1994 |
pst |
Detect if we're running on a Cyrix 486DLC and enable automatic cache negation whenever we access memory between 640k and 1M.
Original code from NetBSD 1.0-BETA. The exact origins are unclear but Theo de Raadt, Charles, and Michael V. may have contributed to it.
Submitted by: pst
|
#
2455 |
|
02-Sep-1994 |
dg |
Removed all vestiges of tlbflush(). Replaced them with calls to pmap_update(). Made pmap_update an inline assembly function.
|
#
2426 |
|
31-Aug-1994 |
dg |
Fixed bug that surfaced with last commit for NOBOUNCE -> BOUNCE_BUFFERS by adding appropriate #ifdefs and changing some variables to externs (as they should have always been).
|
#
2422 |
|
31-Aug-1994 |
dg |
Rather than exclude bounce buffers support with NOBOUNCE, include it with BOUNCE_BUFFERS. This is more intuitive, and is better for future multiplatform support. Added BOUNCE_BUFFERS option to the GENERIC and LINT kernel config files.
|
#
2320 |
|
27-Aug-1994 |
dg |
1) Changed ddb into an option rather than a pseudo-device (use options DDB in your kernel config now). 2) Added ps ddb function from 1.1.5. Cleaned it up a bit and moved into its own file. 3) Added \r handling in db_printf. 4) Added missing memory usage stats to statclock(). 5) Added dummy function to pseudo_set so it will be emitted if there are no other pseudo declarations.
|
#
2254 |
|
24-Aug-1994 |
sos |
Changes preparing for iBCS2 support
|
#
2152 |
|
20-Aug-1994 |
dg |
Implemented filesystem clean bit via:
machdep.c: Changed printf's a little and call vfs_unmountall() if the sync was successful.
cd9660_vfsops.c, ffs_vfsops.c, nfs_vfsops.c, lfs_vfsops.c: Allow dismount of root FS. It is now disallowed at a higher level.
vfs_conf.c: Removed unused rootfs global.
vfs_subr.c: Added new routines vfs_unmountall and vfs_unmountroot. Filesystems are now dismounted if the machine is properly rebooted.
ffs_vfsops.c: Toggle clean bit at the appropriate places. Print warning if an unclean FS is mounted.
ffs_vfsops.c, lfs_vfsops.c: Fix bug in selecting proper flags for VOP_CLOSE().
vfs_syscalls.c: Disallow dismounting root FS via umount syscall.
|
#
2124 |
|
19-Aug-1994 |
dg |
Terry Lambert's loadable kernel module support w/improvements from the NetBSD group.
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
2059 |
|
13-Aug-1994 |
dg |
Made the kernel compile cleanly with gcc 2.6.0. Thanks go to Bruce Evans for suggesting a method to detect various versions of gcc.
|
#
2056 |
|
13-Aug-1994 |
wollman |
Change all #includes to follow the current Berkeley style. Some of these ``changes'' are actually not changes at all, but CVS sometimes has trouble telling the difference.
This also includes support for second-directory compiles. This is not quite complete yet, as `config' doesn't yet do the right thing. You can still make it work trivially, however, by doing the following:
rm /sys/compile
mkdir /usr/obj/sys/compile
ln -s M-. /sys/compile
cd /sys/i386/conf
config MYKERNEL
cd ../../compile/MYKERNEL
ln -s /sys @
rm machine
ln -s @/i386/include machine
make depend
make
|
#
2014 |
|
10-Aug-1994 |
wollman |
Tell Pentium users their CPU speed. (More changes to make use of this to come later.)
|
#
1999 |
|
10-Aug-1994 |
wollman |
Some programs (like GNU configure programs) depend on the output of `uname -s' to be something reasonable (traditionally, `i386') rather than `PC-Class'. Make it so.
|
#
1998 |
|
10-Aug-1994 |
wollman |
Add back in CPU detection code from 1.1.5. As an added bonus, the hw.model MIB variable is now declared correctly.
|
#
1887 |
|
06-Aug-1994 |
dg |
Incorporated post 1.1.5 work from John Dyson. This includes performance improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/ pmap_kremove. These routines allow fast mapping of pages for those architectures that have "normal" MMUs. Also included is a fix to the pageout daemon to properly check a queue end condition.
Submitted by: John Dyson
|
#
1829 |
|
04-Aug-1994 |
dg |
Nuked #if 0'd _insque and _remque routines - they are now inlined in cpufunc.h.
|
#
1825 |
|
03-Aug-1994 |
dg |
Merged in post-1.1.5 work done by myself and John Dyson. This includes:
me: 1) TLB flush optimization that effectively eliminates half of all of the TLB flushes. This works by only flushing the TLB when a page is "present" in memory (i.e. the valid bit is set in the page table entry). See section 5.3.5 of the Intel 386 Programmer's Reference Manual. 2) The handling of "CMAP" has been improved to catch attempts at multiple simultaneous use.
John: 1) Added pmap_qenter/pmap_qremove functions for fast mapping of pages into the kernel. This is for future optimizations and support for the upcoming merged VM/buffer cache.
Reviewed by: John Dyson
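The TLB flush optimization in item 1 can be illustrated with a short sketch: since the i386 TLB only caches translations for present pages, replacing an entry whose valid bit was clear cannot leave a stale translation behind, so the flush can be skipped. The code below is a user-space model under that assumption; `pte_store` and the flush counter are hypothetical stand-ins for the real pmap code.

```c
#include <stdint.h>

#define PG_V 0x001u  /* x86 page table entry "present" (valid) bit */

/* Stand-in for the real flush; counts calls so the effect is visible. */
static unsigned tlb_flushes;

static void
tlb_flush(void)
{
    tlb_flushes++;
}

/*
 * Hypothetical helper: install a new page table entry and flush the
 * TLB only if the entry being replaced was valid.
 */
static void
pte_store(uint32_t *pte, uint32_t newpte)
{
    uint32_t old = *pte;

    *pte = newpte;
    if (old & PG_V)
        tlb_flush();
}
```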
|
#
1678 |
|
04-Jun-1994 |
dg |
Removed extra (bogus) declaration of Xrsvd14 that was confusing me.
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1321 |
|
02-Apr-1994 |
dg |
New interrupt code from Bruce Evans. In addition to Bruce's attached list of changes, I've made the following additional changes:
1) i386/include/ipl.h renamed to spl.h as the name conflicts with the file of the same name in i386/isa/ipl.h. 2) changed all use of *mask (i.e. netmask, biomask, ttymask, etc) to *_imask (net_imask, etc). 3) changed vestige of splnet use in if_is to splimp. 4) got rid of "impmask" completely (Bruce had gotten rid of netmask), and are now using net_imask instead. 5) dozens of minor cruft to glue in Bruce's changes.
These require changes I made to config(8) as well, and thus it must be rebuilt.
-DG
from Bruce Evans:
sio: o No diff is supplied. Remove the define of setsofttty(). I hope that is enough.
*.s: o i386/isa/debug.h no longer exists. The event counters became too much trouble to maintain. All function call entry and exception entry counters can be recovered by using profiling kernel (the new profiling supports all entry points; however, it is too slow to leave enabled all the time; it also). Only BDBTRAP() from debug.h is now used. That is moved to exception.s. It might be worth preserving SHOW_BITS() and calling it from _mcount() (if enabled). o T_ASTFLT is now only set just before calling trap(). o All exception handlers set SWI_AST_MASK in cpl as soon as possible after entry and arrange for _doreti to restore it atomically with exiting. It is not possible to set it atomically with entering the kernel, so it must be checked against the user mode bits in the trap frame before committing to using it. There is no place to store the old value of cpl for syscalls or traps, so there are some complications restoring it.
Profiling stuff (mostly in *.s): o Changes to kern/subr_mcount.c, gcc and gprof are not supplied yet. o All interesting labels `foo' are renamed `_foo' and all uninteresting labels `_bar' are renamed `bar'. A small change to gprof allows ignoring labels not starting with underscores. o MCOUNT_LABEL() is to provide names for counters for times spent in exception handlers. o FAKE_MCOUNT() is a version of MCOUNT() suitable for exception handlers. Its arg is the pc where the exception occurred. The new mcount() pretends that this was a call from that pc to a suitable MCOUNT_LABEL(). o MEXITCOUNT is to turn off any timer started by MCOUNT().
/usr/src/sys/i386/i386/exception.s: o The non-BDB BPTTRAP() macros were doing a sti even when interrupts were disabled when the trap occurred. The sti (fixed) sti is actually a no-op unless you have my changes to machdep.c that make the debugger trap gates interrupt gates, but fixing that would make the ifdefs messier. ddb seems to be unharmed by both interrupts always disabled and always enabled (I had the branch in the fix back to front for some time :-(). o There is no known pushal bug. o tf_err can be left as garbage for syscalls.
/usr/src/sys/i386/i386/locore.s: o Fix and update BDE_DEBUGGER support. o ENTRY(btext) before initialization was dangerous. o Warm boot shot was longer than intended.
/usr/src/sys/i386/i386/machdep.c: o DON'T APPLY ALL OF THIS DIFF. It's what I'm using, but may require other changes. Use the following: o Remove aston() and setsoftclock(). Maybe use the following: o No netisr.h. o Spelling fix. o Delay to read the Rebooting message. o Fix for vm system unmapping a reduced area of memory after bounds_check_with_label() reduces the size of a physical i/o for a partition boundary. A similar fix is required in kern_physio.c. o Correct use of __CONCAT. It never worked here for non- ANSI cpp's. Is it time to drop support for non-ANSI? o gdt_segs init. 0xffffffffUL is bogus because ssd_limit is not 32 bits. The replacement may have the same value :-), but is more natural. o physmem was one page too low. Confusing variable names. Don't use the following: o Better numbers of buffers. Each 8K page requires up to 16 buffer headers. On my system, this results in 5576 buffers containing [up to] 2854912 bytes of memory. The usual allocation of about 384 buffers only holds 192K of disk if you use it on an fs with a block size of 512. o gdt changes for bdb. o *TGT -> *IDT changes for bdb. o #ifdefed changes for bdb.
/usr/src/sys/i386/i386/microtime.s: o Use the correct asm macros. I think asm.h was copied from Mach just for microtime and isn't used now. It certainly doesn't belong in <sys>. Various macros are also duplicated in sys/i386/boot.h and libc/i386/*.h. o Don't switch to and from the IRR; it is guaranteed to be selected (default after ICU init and explicitly selected in isa.c too, and never changed until the old microtime clobbered it).
/usr/src/sys/i386/i386/support.s: o Non-essential changes (none related to spls or profiling). o Removed slow loads of %gs again. The LDT support may require not relying on %gs, but loading it is not the way to fix it! Some places (copyin ...) forgot to load it. Loading it clobbers the user %gs. trap() still loads it after certain types of faults so that fuword() etc can rely on it without loading it explicitly. Exception handlers don't restore it. If we want to preserve the user %gs, then the fastest method is to not touch it except for context switches. Comparing with VM_MAXUSER_ADDRESS and branching takes only 2 or 4 cycles on a 486, while loading %gs takes 9 cycles and using it takes another. o Fixed a signed branch to unsigned.
/usr/src/sys/i386/i386/swtch.s: o Move spl0() outside of idle loop. o Remove cli/sti from idle loop. sw1 does a cli, and in the unlikely event of an interrupt occurring and whichqs becoming zero, sw1 will just jump back to _idle. o There's no spl0() function in asm any more, so use splz(). o swtch() doesn't need to be superaligned, at least with the new mcounting. o Fixed a signed branch to unsigned. o Removed astoff().
/usr/src/sys/i386/i386/trap.c: o The decentralized extern decls were inconsistent, of course. o Fixed typo MATH_EMULTATE in comments. */ o Removed unused variables. o Old netmask is now impmask; print it instead. Perhaps we should print some of the new masks. o BTW, trap() should not print anything for normal debugger traps.
/usr/src/sys/i386/include/asmacros.h: o DON'T APPLY ALL OF THIS DIFF. Just use some of the null macros as necessary.
/usr/src/sys/i386/include/cpu.h: o CLKF_BASEPRI() changes since cpl == SWI_AST_MASK is now normal while the kernel is running. o Don't use var++ to set boolean variables. It fails after a mere 4G times :-) and is slower than storing a constant on [3-4]86s.
/usr/src/sys/i386/include/cpufunc.h: o DON'T APPLY ALL OF THIS DIFF. You need mainly the include of <machine/ipl.h>. Unfortunately, <machine/ipl.h> is needed by almost everything for the inlines.
/usr/src/sys/i386/include/ipl.h: o New file. Defines spl inlines and SWI macros and declares most variables related to hard and soft interrupt masks.
/usr/src/sys/i386/isa/icu.h: o Moved definitions to <machine/ipl.h>
/usr/src/sys/i386/isa/icu.s: o Software interrupts (SWIs) and delayed hardware interrupts (HWIs) are now handled uniformly, and dispatching them from splx() is more like dispatching them from _doreti. The dispatcher is essentially *(handler[ffs(ipending & ~cpl)])(). o More care (not quite enough) is taken to avoid unbounded nesting of interrupts. o The interface to softclock() is changed so that a trap frame is not required. o Fast interrupt handlers are now handled more uniformly. Configuration is still too early (new handlers would require bits in <machine/ipl.h> and functions to vector.s). o splnnn() and splx() are no longer here; they are inline functions (could be macros for other compilers). splz() is the nontrivial part of the old splx().
/usr/src/sys/i386/isa/ipl.h o New file. Supposed to have only bus-dependent stuff. Perhaps the h/w masks should be declared here.
/usr/src/sys/i386/isa/isa.c: o DON'T APPLY ALL OF THIS DIFF. You need only things involving *mask and *MASK and comments about them. netmask is now a pure software mask. It works like the softclock mask.
/usr/src/sys/i386/isa/vector.s: o Reorganize AUTO_EOI* macros. o Option FAST_INTR_HANDLER_USERS_ES for people who don't trust fastintr handlers. o fastintr handlers need to metamorphose into ordinary interrupt handlers if their SWI bit has become set. Previously, sio had unintended latency for handling output completions and input of SLIP framing characters because this was not done.
/usr/src/sys/net/netisr.h: o The machine-dependent stuff is now imported from <machine/ipl.h>.
/usr/src/sys/sys/systm.h o DON'T APPLY ALL OF THIS DIFF. You need mainly the different splx() prototype. The spl*() prototypes are duplicated as inlines in <machine/ipl.h> but they need to be duplicated here in case there are no inlines. I sent systm.h and cpufunc.h to Garrett. We agree that spl0 should be replaced by splnone and not the other way around like I've done.
/usr/src/sys/kern/kern_clock.c o splsoftclock() now lowers cpl so the direct call to softclock() works as intended. o softclock() interface changed to avoid passing the whole frame (some machines may need another change for profile_tick()). o profiling renamed _profiling to avoid ANSI namespace pollution. (I had to improve the mcount() interface and may as well fix it.) The GUPROF variant doesn't actually reference profiling here, but the 'U' in GUPROF should mean to select the microtimer mcount() and not change the interface.
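The dispatch rule quoted in the icu.s notes above (pick the lowest pending interrupt that is not masked by `cpl` and call its handler) can be modelled in C roughly as follows. This is only a sketch: the real code is assembly, and the table size, the handling of the pending bit, and the demo handlers here are assumptions made for illustration.

```c
#include <strings.h>  /* ffs() */

#define NHANDLERS 32

typedef void (*handler_t)(void);

static unsigned  cpl;                 /* currently masked interrupts */
static unsigned  ipending;            /* interrupts waiting to run */
static handler_t handler[NHANDLERS];  /* one slot per SWI/HWI bit */

/* Run every pending interrupt that the current cpl does not mask. */
static void
run_pending(void)
{
    unsigned ready;

    while ((ready = ipending & ~cpl) != 0) {
        int bit = ffs((int)ready) - 1;  /* ffs() is 1-based */

        ipending &= ~(1u << bit);       /* acknowledge before calling */
        if (handler[bit] != 0)
            handler[bit]();
    }
}

/* Demo handlers that record the order in which they were called. */
static int calls[8];
static int ncalls;

static void demo_bit3(void) { calls[ncalls++] = 3; }
static void demo_bit5(void) { calls[ncalls++] = 5; }
```

With bits 3 and 5 pending and bit 5 masked, only the bit 3 handler runs; lowering `cpl` afterwards lets the delayed one through, which mirrors the description of splz() as the nontrivial part of the old splx().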
|
#
1313 |
|
30-Mar-1994 |
dg |
Eliminated the "physstrat" wart and merged it into kern_physio.c. This patch also fixes a bug which causes a kernel VM leak.
|
#
1298 |
|
23-Mar-1994 |
dg |
Bounce buffers. From John Dyson with help from me.
|
#
1281 |
|
19-Mar-1994 |
wollman |
Added cpu_model and machine variables.
|
#
1247 |
|
07-Mar-1994 |
dg |
1) enhanced in_cksum from Bruce Evans. 2) minor comment change in machdep.c 3) enhanced bzero from John Dyson (twice as fast on a 486DX/33)
|
#
1208 |
|
23-Feb-1994 |
hsu |
validate sigcontext before restoring it
|
#
1129 |
|
08-Feb-1994 |
dg |
From: Dave Matthews <dave@prlng.co.uk>
Description: The integer overflow instruction (into) and the interrupt instruction with value 4 (int #4) both give rise to SIGBUS signals rather than SIGFPE. The problem is that overflow is a trap not a fault (unlike the BOUND instruction).
|
#
1127 |
|
08-Feb-1994 |
dg |
Fixed bugs in stack grow code, and moved it back into a separate function like it was originally. Also added back call to "grow" in sendsig now that this routine actually works.
|
#
1116 |
|
07-Feb-1994 |
dg |
Fixed calculation of physmem when the special MAXMEM kernel config override is used. This bug caused the buffer cache to be WAY too big when memory was being restricted - resulting in hangs and other out of memory problems.
|
#
1066 |
|
01-Feb-1994 |
dg |
Bug fix from previous WINE commit. From Jeffrey Hsu.
|
#
1055 |
|
31-Jan-1994 |
dg |
Added four pattern memory test routine that is done at startup.
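The log entry does not say which four patterns are used, so the sketch below picks a common set (alternating bits, their complement, all ones, all zeros) purely as an assumption. It shows the general shape of such a test: write a pattern across the region, read it back, and reject the region on the first mismatch. The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed patterns; the committed routine may use different ones. */
static const uint32_t test_patterns[4] = {
    0xaaaaaaaau, 0x55555555u, 0xffffffffu, 0x00000000u
};

/*
 * Hypothetical helper: return 1 if every word in the region holds each
 * pattern after it is written, 0 on the first mismatch. The region is
 * left zero-filled on success because the last pattern is zero.
 */
static int
memtest_region(volatile uint32_t *base, size_t nwords)
{
    size_t p, i;

    for (p = 0; p < 4; p++) {
        for (i = 0; i < nwords; i++)
            base[i] = test_patterns[p];
        for (i = 0; i < nwords; i++)
            if (base[i] != test_patterns[p])
                return (0);
    }
    return (1);
}
```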
|
#
1051 |
|
31-Jan-1994 |
dg |
WINE/user LDT support from John Brezak, ported to FreeBSD by Jeffrey Hsu <hsu@soda.berkeley.edu>.
|
#
1045 |
|
31-Jan-1994 |
dg |
VM system performance improvements from John Dyson and myself. The following is a summary:
1) increased object cache back up to a more reasonable value. 2) removed old & bogus cruft from machdep.c (clearseg, copyseg, physcopyseg, etc). 3) inlined many functions in pmap.c 4) changed "load_cr3(rcr3())" into tlbflush() and made tlbflush inline assembly. 5) changed the way that modified pages are tracked - now vm_page struct is kept updated directly - no more scanning page tables. 6) removed lots of unnecessary spl's 7) removed old unused functions from pmap.c 8) removed all use of page_size, page_shift, page_mask variables - replaced with PAGE_ constants. 9) moved trunc/round_page, atop, ptoa, out of vm_param.h and into i386/ include/param.h, and optimized them. 10) numerous changes to sys/vm/ swap_pager, vnode_pager, pageout, fault code to improve performance. LRU algorithm modified to be more effective, read ahead/behind values tuned for better performance, etc, etc...
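Items 8 and 9 replace run-time `page_size`/`page_shift`/`page_mask` variables with compile-time constants, which lets the page arithmetic reduce to shifts and masks. A sketch of what such macros conventionally look like for 4 KB i386 pages is given below; the exact definitions in i386/include/param.h may differ.

```c
/* i386 page geometry: 4 KB pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1)

/* Round an address down or up to a page boundary. */
#define trunc_page(x) ((unsigned long)(x) & ~PAGE_MASK)
#define round_page(x) (((unsigned long)(x) + PAGE_MASK) & ~PAGE_MASK)

/* Convert between byte addresses and page numbers. */
#define atop(x) ((unsigned long)(x) >> PAGE_SHIFT)
#define ptoa(x) ((unsigned long)(x) << PAGE_SHIFT)
```

Because every operand is a constant, the compiler can fold these into a single AND or shift instruction instead of loading a variable first.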
|
#
990 |
|
21-Jan-1994 |
dg |
System V IPC code from Danny Boulet, chewed on a bit by the NetBSD group and then some more by Jeffrey Hsu (who provided this port for FreeBSD).
|
#
989 |
|
20-Jan-1994 |
dg |
Pointed out by Wolfgang Solfrank: Correct parameters of sync
|
#
988 |
|
20-Jan-1994 |
dg |
Removed some more old unused code/comments. Added hack to "fix" the problem with some chipsets (UMC) remapping the 'hole' memory even when you've got 16MB. People were led to believe that since there was only 16MB of memory in the machine, that they were okay wrt the ISA DMA limit. This hack simply causes the extra memory to be ignored if it appears around the 16MB limit.
|
#
987 |
|
20-Jan-1994 |
dg |
Improved algorithm that calculates the pages in the base memory - If the BIOS says that the amount is *between* 0-640K, believe it. Cleaned up the comments a bit, removed some old cruft, etc.
|
#
974 |
|
14-Jan-1994 |
dg |
"New" VM system from John Dyson & myself. For a run-down of the major changes, see the log of any effected file in the sys/vm directory (swap_pager.c for instance).
|
#
924 |
|
03-Jan-1994 |
dg |
Convert syscall to trapframe. Based on work done by John Brezak.
|
#
911 |
|
22-Dec-1993 |
dg |
Raised minimum buffer cache from 128k to 256k.
|
#
879 |
|
18-Dec-1993 |
wollman |
Make everything compile with -Wtraditional. Make it easier to distribute a binary link-kit. Make all non-optional options (pagers, procfs) standard, and update LINT to reflect new symtab requirements.
NB: -Wtraditional will henceforth be forgotten. This editing pass was primarily intended to detect any constructions where the old code might have been relying on traditional C semantics or syntax. These were all fixed, and the result of fixing some of them means that -Wall is now a realistic possibility within a few weeks.
|
#
849 |
|
12-Dec-1993 |
dg |
1) Added proc file system from Paul Kranenburg with changes from John Dyson to make it reliably work under FreeBSD. 2) Added and enabled PROCFS in the GENERICxx and LINT kernels. 3) New execve() from me. Still work to be done here, but this version works well and is needed before other changes can be made. For a description of the design behind this, see freebsd-arch or ask me. 4) Rewrote stack fault code; made user stack VM grow as needed rather than all up front; improves performance a little and reduces process memory requirements. 5) Incorporated fix from Gene Stark to fault/wire a user page table page to fix a problem in copyout. This is a temporary fix and is not appropriate for pageable page tables. For a description of the problem, see Gene's post to the freebsd-hackers mailing list. 6) Tighten up vm_page struct to reduce memory requirements for it. ifdef pager page lock code as it's not being used currently. 7) Introduced new element to vmspace struct - vm_minsaddr; initial (minimum) stack address. Complement to vm_maxsaddr. 8) Added a panic if the allocation for process u-pages fails. 9) Improve performance and accuracy of kernel profiling by putting in a little inline assembly instead of spl(). 10) Made serial console with sio driver work. Still has problems with serial input, but is almost usable. 11) Added -Bstatic to SYSTEM_LD in Makefile.i386 so that kernels will build properly with the new ld.
|
#
827 |
|
03-Dec-1993 |
alm |
From: Jeffrey Hsu <hsu@soda.berkeley.edu>
The following patch adds the addr argument to signal handlers.
The kernel with the patch is no more and no less in compliance or in violation of POSIX and ANSI C than the kernel before the patch.
The added functionality this addr argument provides is quite useful. It enables an entire class of algorithms which use mprotect to trace memory references. Beside garbage collectors, I have heard of this technique being applied to debuggers and profilers. The only benchmarking I've performed is using akcl to compile maxima: without the kernel patch, it takes 7 hours to compile maxima, while with stratified garbage collection, it only takes 50 minutes.
Basically, I can't think of a reason not to add the addr argument and there is a compelling need for it.
If you find the patch acceptable, please let me know so I can send my FreeBSD akcl config files to wfs for inclusion in the core akcl release. The old 386BSD config files there won't work on either NetBSD or FreeBSD.
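The mprotect-based tracing technique described in this message can be sketched as follows. This is a minimal illustration, not the 1993 patch: it assumes the modern POSIX interface (sigaction with SA_SIGINFO, where the faulting address arrives in si_addr) rather than the extra addr handler argument the patch introduced, and the names on_fault and trace_first_write are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static void *volatile fault_addr;   /* last faulting address seen by the handler */
static size_t page_size;

/* Record the faulting address, then unprotect its page so the store can be retried. */
static void on_fault(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    fault_addr = info->si_addr;
    uintptr_t base = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    /* mprotect() is not formally async-signal-safe, but this use is the
     * classic idiom; bail out rather than loop forever if it fails. */
    if (mprotect((void *)base, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Returns 1 if the first write to a read-only page was traced, 0 if not, -1 on setup error. */
int trace_first_write(void)
{
    struct sigaction sa, old_segv, old_bus;
    volatile char *page;
    void *map;
    int ok;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    map = mmap(NULL, page_size, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (map == MAP_FAILED)
        return -1;
    page = map;

    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Some systems report protection faults as SIGBUS, so catch both. */
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    fault_addr = NULL;
    page[123] = 1;   /* faults once; the handler unprotects and the store restarts */

    ok = (fault_addr == (void *)((char *)map + 123)) && page[123] == 1;

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(map, page_size);
    return ok;
}
```

A garbage collector using this idea would keep a per-page dirty flag in the handler instead of a single address; the sketch only shows that the handler learns which address was touched.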
|
#
798 |
|
24-Nov-1993 |
wollman |
Make the LINT kernel compile with -W -Wreturn-type -Wcomment -Werror, and add same (sans -Werror) to Makefile for future compilations.
|
#
778 |
|
17-Nov-1993 |
wollman |
Fixed comments that start within a comment, so code compiles cleanly with -Wcomment.
|
#
774 |
|
16-Nov-1993 |
dg |
New process tracing code from Sean Eric Fagen (sef@kithrup.com). Also fixed up the syscall args to make GCC happy.
|
#
757 |
|
13-Nov-1993 |
dg |
First steps in rewriting locore.s, and making info useful when the machine panics.
i386/i386/locore.s:
1) got rid of most .set directives that were being used like #define's, and replaced them with appropriate #define's in the appropriate header files (accessed via genassym).
2) added comments to header inclusions, global definitions, and global variables
3) replaced some hardcoded constants with cpp defines (such as PDESIZE and others)
4) aligned all comments to the same column to make them easier to read
5) moved macro definitions for ENTRY, ALIGN, NOP, etc. to /sys/i386/include/asmacros.h
6) added #ifdef BDE_DEBUGGER around all of Bruce's debugger code
7) added new global '_KERNend' to store last location+1 of kernel
8) cleaned up zeroing of bss so that only bss is zeroed
9) fix zeroing of page tables so that it really does zero them all - not just if they follow the bss.
10) rewrote page table initialization code so that it (a) works correctly and (b) write-protects the kernel text by default
11) properly initialize the kernel page directory, upages, p0stack PT, and page tables. The previous scheme was more than a bit screwy.
12) change allocation of virtual area of IO hole so that it is fixed at KERNBASE + 0xa0000. The previous scheme put it right after the kernel page tables and then later expected it to be at KERNBASE + 0xa0000
13) change multiple bogus settings of user read/write of various areas of kernel VM - including the IO hole; we should never be accessing the IO hole in user mode through the kernel page tables
14) split kernel support routines such as bcopy, bzero, copyin, copyout, etc. into a separate file 'support.s'
15) split swtch and related routines into a separate 'swtch.s'
16) split routines related to traps, syscalls, and interrupts into a separate file 'exception.s'
17) remove some unused global variables from locore that got inserted by Garrett when he pulled them out of some .h files.
i386/isa/icu.s: 1) clean up global variable declarations 2) move in declaration of astpending and netisr
i386/i386/pmap.c: 1) fix calculation of virtual_avail. It previously was calculated to be right in the middle of the kernel page tables - not a good place to start allocating kernel VM. 2) properly allocate kernel page dir/tables etc out of kernel map - previously only took out 2 pages.
i386/i386/machdep.c: 1) modify boot() to print a warning that the system will reboot in PANIC_REBOOT_WAIT_TIME seconds, and let the user abort with a key on the console. The machine will wait forever if a key is typed before the reboot. The default is 15 seconds, but can be set to 0 to mean don't wait at all, -1 to mean wait forever, or any positive value to wait for that many seconds. 2) print "Rebooting..." just before doing it.
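The PANIC_REBOOT_WAIT_TIME rules above can be summarised as a small decision function. This is a hedged sketch of the described behaviour only, not the code in boot(): the enum, the struct, and plan_panic_reboot are hypothetical names, and treating every negative value like -1 is an assumption the message does not confirm.

```c
#ifndef PANIC_REBOOT_WAIT_TIME
#define PANIC_REBOOT_WAIT_TIME 15   /* default stated in the commit message */
#endif

enum reboot_action { REBOOT_NOW, REBOOT_AFTER_DELAY, REBOOT_WAIT_FOREVER };

struct reboot_plan {
    enum reboot_action action;
    int seconds;                    /* meaningful only for REBOOT_AFTER_DELAY */
};

/* Decide what to do after a panic, given the configured wait time and
 * whether a console key was typed during the countdown. */
static struct reboot_plan plan_panic_reboot(int wait_time, int key_pressed)
{
    struct reboot_plan plan = { REBOOT_NOW, 0 };

    if (wait_time == 0)
        return plan;                        /* 0: don't wait at all */
    if (wait_time < 0 || key_pressed) {     /* -1, or user aborted: wait forever */
        plan.action = REBOOT_WAIT_FOREVER;
        return plan;
    }
    plan.action = REBOOT_AFTER_DELAY;       /* positive: wait that many seconds */
    plan.seconds = wait_time;
    return plan;
}
```

For example, plan_panic_reboot(PANIC_REBOOT_WAIT_TIME, 0) yields a 15-second delayed reboot unless the option was overridden at build time.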
kern/subr_prf.c: 1) remove PANICWAIT as it is deprecated by the change to machdep.c
i386/i386/trap.c: 1) add table of trap type strings and use it to print a real trap/panic message rather than just a number. Lots of work to be done here, but this is the first step. Symbolic traceback is in the TODO.
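A table of trap type strings of the kind described here might look like the sketch below. The trap numbers, the strings, and the names trap_msg, trap_name, and format_trap are illustrative assumptions, not the kernel's actual table.

```c
#include <stddef.h>
#include <stdio.h>

/* Illustrative trap numbers and strings; index == trap type. */
static const char *const trap_msg[] = {
    "",                                 /* 0: unused in this sketch */
    "privileged instruction fault",     /* 1 */
    "breakpoint instruction fault",     /* 2 */
    "arithmetic trap",                  /* 3 */
    "page fault",                       /* 4 */
};

#define TRAP_MSG_COUNT (sizeof(trap_msg) / sizeof(trap_msg[0]))

/* Map a trap type to its description, falling back for unknown types. */
static const char *trap_name(unsigned int type)
{
    if (type < TRAP_MSG_COUNT && trap_msg[type][0] != '\0')
        return trap_msg[type];
    return "unknown trap";
}

/* Build a message that carries both the number and the description. */
static int format_trap(char *buf, size_t len, unsigned int type)
{
    return snprintf(buf, len, "trap %u: %s", type, trap_name(type));
}
```

Keeping the number in the message preserves the information the old numeric-only output gave, while the string saves a trip to the header file.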
i386/i386/Makefile.i386: 1) add support to build support.s, exception.s and swtch.s
...and various changes to various header files to make all of the above happen.
|
#
724 |
|
07-Nov-1993 |
wollman |
Get rid of WFJ's use of sleep() in favor of the more user-friendly tsleep().
|
#
683 |
|
29-Oct-1993 |
dg |
Whoops, the algorithm I last used was messed up - I left off parens, and should have used PGSHIFT instead of PAGE_SHIFT.
|
#
682 |
|
29-Oct-1993 |
dg |
Change filesystem buffer cache size calculation to be less for 4MB machines (now 20% of all memory after the first 3MB). This is necessary for a 4MB machine to be able to rebuild the entire source tree without running out of physical memory because of the fixed memory requirements of processes and kernel VM.
|
#
608 |
|
15-Oct-1993 |
rgrimes |
genassym.c: Remove NKMEMCLUSTERS, it is no longer defined or used.
locore.s:
    Fix comment on PTDpde and APTDpde to be pde instead of pte.
    Add new equation for calculating location of Sysmap.
    Remove Bill's old #ifdef garbage for counting up memory, that stuff will never be made to work and was just cluttering up the file.
Add code that places the PTD, page table pages, and kernel stack below the 640k ISA hole if there is room for it, otherwise put this stuff all at 1MB. This fixes the 28K bogusity in the boot blocks, that can now go away!
Fix the calculation of where first is to be dependent on NKPDE so that we can skip over the above mentioned areas. The 28K thing is now 44K in size due to the increase in kernel virtual memory space, but since we no longer have to worry about that this is no big deal.
Use if NNPX > 0 instead of ifdef NPX for floating point code.
machdep.c: Change the calculation of the buffer cache size to be 20% of all memory above 2MB, and add back the upper limit of 2/5 of VM_KMEM_SIZE so that we do not eat ALL of the kernel memory space on large memory machines. Note that this will not even come into effect unless you have more than 32MB. The current buffer cache limit is 6.7MB due to this calculation.
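The sizing rule described above (20% of memory above 2MB, capped at 2/5 of VM_KMEM_SIZE) can be written out as a short function. This is a sketch under stated assumptions: bufcache_bytes is a hypothetical name, sizes are in bytes rather than pages, and the kernel memory size is passed as a parameter because the message does not state the value of VM_KMEM_SIZE.

```c
#define MB (1024UL * 1024UL)

/* Buffer cache size in bytes: one fifth of physical memory above the
 * first 2MB, limited to two fifths of the kernel memory arena. */
static unsigned long bufcache_bytes(unsigned long physmem_bytes,
                                    unsigned long vm_kmem_size)
{
    unsigned long above = physmem_bytes > 2 * MB ? physmem_bytes - 2 * MB : 0;
    unsigned long size = above / 5;             /* 20% of memory above 2MB */
    unsigned long cap = vm_kmem_size / 5 * 2;   /* upper limit: 2/5 of kmem */

    return size < cap ? size : cap;
}
```

With a hypothetical 20MB kernel memory arena the cap is 8MB, so a 12MB machine gets 2MB of buffer cache and only machines well above 40MB reach the limit.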
It seems that we were erroneously allocating bufpages pages for buffer_map. buffer_map is UNUSED in this implementation of the buffer cache, but since the map is referenced in several if statements a quick fix was to simply allocate 1 vm page (but no real memory) to it.
pmap.h: Remove rcsid, don't want them in the kernel files!
Removed some cruft inside an #ifdef DEBUGx that caused compiler errors if you were compiling this for debug.
Use the #defines for PD_SHIFT and PG_SHIFT in place of constants.
trap.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Remove a now completely invalid check for a maximum virtual address, the virtual address now ends at 0xFFFFFFFF so there is no more MAX!! (Thanks David, I completely missed that one!)
vm_machdep.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Replace several 0xFE00000 constants with KERNBASE
|
#
604 |
|
14-Oct-1993 |
rgrimes |
From David Greenman
Bruce Evans had limited the kernel virtual address space to not include the last 4MB since it was not being used. Other changes are being made that will relocate the Alternate Page Directory Table (APDT) into this area, so the limit is being fixed to be the last virtual address. (In fact, with this patch you can now do that relocation.)
|
#
569 |
|
10-Oct-1993 |
rgrimes |
Added a compile time #error so that if the user does not specify one of the proper I_X86CPU options in the config file, the following error will occur while building the kernel: (had to line wrap the error for this message)
../../i386/i386/machdep.c:343: #error This kernel is not configured for one \ of the supported CPUs
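A compile-time guard of this kind can be sketched as below. The option names I386_CPU, I486_CPU, and I586_CPU are assumptions standing in for the "I_X86CPU" pattern in the message, and configured_cpu_count is a hypothetical helper added only so the sketch has something to call.

```c
/* Normally supplied by the kernel config file; defined here so the sketch compiles. */
#define I486_CPU 1

/* Refuse to build if no supported CPU class was selected. */
#if !defined(I386_CPU) && !defined(I486_CPU) && !defined(I586_CPU)
#error This kernel is not configured for one of the supported CPUs
#endif

/* Count how many CPU classes were compiled in. */
static int configured_cpu_count(void)
{
    int n = 0;
#ifdef I386_CPU
    n++;
#endif
#ifdef I486_CPU
    n++;
#endif
#ifdef I586_CPU
    n++;
#endif
    return n;
}
```

Removing the #define at the top makes the preprocessor stop with the #error text, which is the failure mode the commit message describes.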
|
#
556 |
|
08-Oct-1993 |
rgrimes |
All: removed patch kit headers and sccsids, add $Id$. This is a general clean up and realignment with NetBSD-current where possible.
genassym.c:
    removed extraneous include of reg.h
    removed old FP_* defines that have been ifdefed out since the patch kit
    removed PCB_SIGC that is not referenced anywhere
    add trapframe and sigframe defines
    add KERNBASE define for use in locore.s
locore.s:
    include npx.h and use NNPX for turning on and off FPU
    include machine/cputypes.h for the types of cpu (used in cpu_identify)
    change SYSPDREND to be one higher, this is really the base of the next area, and will be changing again next time I revise the file
    Reverse the NOP defines, you now get slow NOP's by default, this may be what is causing us trouble with some systems. If you want the NOPS to be null you now need to have options DUMMY_NOPS.
    Now get esym from the boot blocks which don't pass it yet, and it is not used, but this will be changing.
    Move the bit_colors stuff to be in with the rest of Bruce's SHOW_A_LOT things for debugging.
    Added NetBSD's CPU type probe code, we now know what type of CPU we are running on.
    Adjust kernel pde calculation to correct for change in SYSPDREND, no longer need the +1.
machdep.c:
    include npx.h and use NNPX for turning on and off FPU
    include isa.h, map.h (new file), exec.h in preparation for changes that are still in process.
    Add some of the code for MACHINE_NONCONTIG that will allow us to better map around the BIOS memory area.
    Now print the version, cpu id, real memory and available memory during boot.
    Correct the calculation of bufpages, the code was mixing pages and bytes, it now does the right things.
    Removed Bill's hack for limiting the erroneous calculation.
    add the identifycpu print out code from NetBSD.
    remove the definition of the sigframe struct, it belongs in frame.h
    put in printf's about syncing disks on a halt/reboot.
    Change the halted message to be a little easier reading.
    Clean up of the dump messages, makes the source and the output much more readable.
    Change 0,0 in several places to have spaces after the commas.
|
#
549 |
|
08-Oct-1993 |
rgrimes |
Removed patch kit headers and rcsid, add $Id$, relocate Terry Lambert's copyright to match its location in NetBSD.
Remove the __main() {} dummy function, it belongs in kern/init_main.c
|
#
504 |
|
24-Sep-1993 |
rgrimes |
From: rich@id.slip.bcm.tmc.edu.cdrom.com (Rich Murphey)
Date: Sun, 12 Sep 1993 18:19:05 -0500
This will allow you to compile and run a freebsd kernel with shared memory support. I haven't tested the shm*() calls yet.
You run out of page table descriptors if you specify 4Mb of sharable memory (SHMMAXPGS=1024). I don't know what the limit is, but SHMMAXPGS=64 works.
Rich
|
#
259 |
|
09-Aug-1993 |
rgrimes |
From guido@gvr.win.tue.nl Sat Aug 7 06:58:04 1993
I posted some patches on the 386bsd_patchkit list to prohibit io access. This access was possible because of an uninitialised field in the tss. The fix is included below as the patch to machdep.c. However, when you do this *necessary* fix (security), it will be impossible from within user space to do io.
Therefore, I included another fix: when you open /dev/io, you get the access. Of course you can rewrite it to use another minor and thus give access to the iospace when, e.g., /dev/mem is opened.
NOTE: The /dev/io entry has not been added to /dev/MAKEDEV yet. The patch is in NetBSD.
|
#
200 |
|
27-Jul-1993 |
dg |
* Applied fixes from Bruce Evans to fix COW bugs, >1MB kernel loading, profiling, and various protection checks that cause security holes and system crashes.
* Changed min/max/bcmp/ffs/strlen to be static inline functions - included from cpufunc.h via systm.h. This change improves performance in many parts of the kernel - up to 5% in the networking layer alone. Note that this requires systm.h to be included in any file that uses these functions, otherwise it won't be able to find them during the load.
* Fixed incorrect call to splx() in if_is.c
* Fixed bogus variable assignment to splx() in if_ed.c
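The static inline approach mentioned above can be illustrated with a short header-style sketch. These are not the definitions from cpufunc.h: the unsigned int signatures are an assumption, and k_ffs is a hypothetical name chosen to avoid clashing with the C library's ffs().

```c
/* Header-style helpers: each includer gets its own inlinable copy,
 * so no out-of-line call is made and no symbol is needed at link time. */
static inline unsigned int min(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

static inline unsigned int max(unsigned int a, unsigned int b)
{
    return a > b ? a : b;
}

/* Find first set bit: 1-based index of the lowest set bit, 0 if none. */
static inline int k_ffs(unsigned int mask)
{
    int bit;

    if (mask == 0)
        return 0;
    for (bit = 1; (mask & 1U) == 0; bit++)
        mask >>= 1;
    return bit;
}
```

Because the functions are static, a file that uses them without including the header has no definition to link against, which is the load-time failure the commit message warns about.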
|
#
134 |
|
16-Jul-1993 |
dg |
New locore from Christoph Rubitschko.
|
#
132 |
|
16-Jul-1993 |
dg |
Updated kernel files to move occurrences of "struct args" syscall argument definitions outside of the function parameter list. This is to reduce the copious warning messages that (non-Jolitz) gcc produces. Also fixed some bogus variable declarations and casts to make the compiler happy.
|
#
8 |
|
18-Jun-1993 |
paul |
Upgrade to GCC 2.X
|
#
5 |
|
12-Jun-1993 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r4, which included commits to RCS files with non-trunk default branches.
|
#
4 |
|
12-Jun-1993 |
rgrimes |
Initial import, 0.1 + pk 0.2.4-B1
|