267654 |
20-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
263688 |
24-Mar-2014 |
emaste |
MFC r263289: Update NetBSD Foundation copyrights to 2-clause BSD
The NetBSD Foundation states "Third parties are encouraged to change the license on any files which have a 4-clause license contributed to the NetBSD Foundation to a 2-clause license."
This change removes clauses 3 and 4 from copyright / license blocks that list The NetBSD Foundation as the only copyright holder.
Sponsored by: The FreeBSD Foundation
|
232387 |
02-Mar-2012 |
kib |
MFC r231885: Fix misuse of the kernel map in miscellaneous image activators. Vnode-backed mappings cannot be put into the kernel map, since it is a system map.
|
225736 |
23-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
225618 |
16-Sep-2011 |
kmacy |
Auto-generated code from sys_ prefixing makesyscalls.sh change
Approved by: re(bz)
|
225617 |
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
224778 |
11-Aug-2011 |
rwatson |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
220515 |
10-Apr-2011 |
trasz |
Remove stray semicolon.
|
220373 |
05-Apr-2011 |
trasz |
Add accounting for most of the memory-related resources.
Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
219405 |
08-Mar-2011 |
dchagin |
Extend struct sysvec with new method sv_schedtail, which is used for an explicit process at fork trampoline path instead of eventhadler(schedtail) invocation for each child process.
Remove eventhandler(schedtail) code and change linux ABI to use newly added sysvec method.
While here replace explicit comparing of module sysentvec structure with the newly created process sysentvec to detect the linux ABI.
Discussed with: kib
MFC after: 2 Week
|
213716 |
12-Oct-2010 |
kib |
Add macro DECLARE_MODULE_TIED to denote a module as requiring the kernel of exactly the same __FreeBSD_version as the headers module was compiled against.
Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules use kernel interfaces that the Release Engineering Team feel are not stable enough to guarantee they will not change during the life cycle of a STABLE branch. In particular, the layout of struct sysentvec is declared to be not part of the STABLE KBI.
Discussed with: bz, rwatson Approved by: re (bz, kensmith) MFC after: 2 weeks
|
210197 |
17-Jul-2010 |
trasz |
Remove proc locking, it's not needed after r210132.
|
210132 |
15-Jul-2010 |
trasz |
Make svr4(4) version of poll(2) use the same limit of file descriptors as the usual poll(2) does, instead of checking resource limits.
|
209581 |
28-Jun-2010 |
kib |
Regenerate
|
208453 |
23-May-2010 |
kib |
Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements: sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-depended (this might be reconsidered after all architectures are converted). sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value. sv_syscallnames - the table of syscall names.
Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval().
The new functions syscallenter(9) and syscallret(9) are provided that use sv_*syscall* pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers.
Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and set up return frame. The end of syscall bookkeeping is done by syscallret().
Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls.
The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation.
Reviewed by: jhb, marcel, marius, nwhitehorn, stas Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips) MFC after: 1 month
|
205792 |
28-Mar-2010 |
ed |
Rename st_*timespec fields to st_*tim for POSIX 2008 compliance.
A nice thing about POSIX 2008 is that it finally standardizes a way to obtain file access/modification/change times in sub-second precision, namely using struct timespec, which we already have for a very long time. Unfortunately POSIX uses different names.
This commit adds compatibility macros, so existing code should still build properly. Also change all source code in the kernel to work without any of the compatibility macros. This makes it all a less ambiguous.
I am also renaming st_birthtime to st_birthtim, even though it was a local extension anyway. It seems Cygwin also has a st_birthtim.
|
203660 |
08-Feb-2010 |
ed |
Remove unused LIBCOMPAT keyword from syscalls.master.
|
202143 |
12-Jan-2010 |
brooks |
Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to INT_MAX-1. Given that the Windows group limit is 1024, this range should be sufficient for most applications.
MFC after: 1 month
|
199882 |
28-Nov-2009 |
ed |
Include <sys/tty.h> instead of <sys/termios.h>.
Right now <sys/termios.h> includes <sys/ttycom.h>, which provides the TTY ioctls to the svr4 code. We need both struct termios and the ioctls, so include <sys/tty.h> for now.
|
197064 |
10-Sep-2009 |
des |
As jhb@ pointed out to me, r197057 was incorrect, not least because these are generated files.
|
196019 |
01-Aug-2009 |
rwatson |
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes.
Reviewed by: bz Approved by: re (vimage blanket)
|
195699 |
14-Jul-2009 |
rwatson |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
195458 |
08-Jul-2009 |
trasz |
There is an optimization in chmod(1), that makes it not to call chmod(2) if the new file mode is the same as it was before; however, this optimization must be disabled for filesystems that support NFSv4 ACLs. Chmod uses pathconf(2) to determine whether this is the case - however, pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't.
This change adds lpathconf(3) to make it possible to solve that problem in a clean way.
Reviewed by: rwatson (earlier version) Approved by: re (kib)
|
194910 |
24-Jun-2009 |
jhb |
Change the ABI of some of the structures used by the SYSV IPC API: - The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned short. - The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned short. - The mode member of struct ipc_perm is now mode_t instead of unsigned short (this is merely a style bug). - The rather dubious padding fields for ABI compat with SV/I386 have been removed from struct msqid_ds and struct semid_ds. - The shm_segsz member of struct shmid_ds is now a size_t instead of an int. This removes the need for the shm_bsegsz member in struct shmid_kernel and should allow for complete support of SYSV SHM regions >= 2GB. - The shm_nattch member of struct shmid_ds is now an int instead of a short. - The shm_internal member of struct shmid_ds is now gone. The internal VM object pointer for SHM regions has been moved into struct shmid_kernel. - The existing __semctl(), msgctl(), and shmctl() system call entries are now marked COMPAT7 and new versions of those system calls which support the new ABI are now present. - The new system calls are assigned to the FBSD-1.1 version in libc. The FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls. - A simplistic framework for tagging system calls with compatibility symbol versions has been added to libc. Version tags are added to system calls by adding an appropriate __sym_compat() entry to src/lib/libc/incldue/compat.h. [1]
PR: kern/16195 kern/113218 bin/129855 Reviewed by: arch@, rwatson Discussed with: kan, kib [1]
|
194739 |
23-Jun-2009 |
bz |
After cleaning up rt_tables from vnet.h and cleaning up opt_route.h a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.
|
194090 |
13-Jun-2009 |
jamie |
Add counterparts to getcredhostname: getcreddomainname, getcredhostuuid, getcredhostid
Suggested by: rmacklem Approved by: bz
|
193744 |
08-Jun-2009 |
bz |
After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds.
Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
|
193511 |
05-Jun-2009 |
rwatson |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
|
193235 |
01-Jun-2009 |
rwatson |
Regenerate generated syscall files following changes to struct sysent in r193234.
|
193084 |
30-May-2009 |
delphij |
Attempt to fix build by updating hostid to follow the new world order.
|
193066 |
29-May-2009 |
jamie |
Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible.
The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed.
Approved by: bz (mentor)
|
193017 |
29-May-2009 |
delphij |
Implement SI_ISALIST.
PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD
|
193016 |
29-May-2009 |
delphij |
Fix the sysinfo(SI_HW_SERIAL, emulation so that we actually get the hostid of the machine rather than always getting "0".
PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD
|
193015 |
29-May-2009 |
delphij |
copyinstr(9) takes parameter 'len' as a size_t *, not int *.
PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD
|
193014 |
29-May-2009 |
delphij |
de-register.
Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD PR: kern/91293
|
193013 |
29-May-2009 |
delphij |
svr4_sys_getdents64() should not assume that the cookie would exist everywhere.
PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD
|
193012 |
29-May-2009 |
delphij |
Add new sysconfig bits, Fix the bogus numbering of the old bits.
Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD PR: kern/91293
|
192994 |
28-May-2009 |
delphij |
Use strlcpy().
|
192459 |
20-May-2009 |
jhb |
Comment nits.
|
192455 |
20-May-2009 |
jhb |
Put the vnode returned from namei() immediately after namei() returns in svr4_sys_resolvepath().
|
191921 |
08-May-2009 |
ed |
Regenerate system call tables to use SVN ids.
|
191919 |
08-May-2009 |
ed |
Burn TTY ioctl bridges in compat layers.
I really don't want any pieces of code to include ioctl_compat.h, so let the ibcs2 and svr4 compat leave sgtty alone. If they want to support sgtty, they should emulate it on top of termios, not sgtty.
The code has been marked with BURN_BRIDGES for a long time. ibcs2 and svr4 are not really popular pieces of code anyway.
|
191915 |
08-May-2009 |
zec |
Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import.
As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet.
Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch.
Permit kldload / kldunload operations to be executed only from the default vimage context.
This change should have no functional impact on nooptions VIMAGE kernel builds.
Reviewed by: bz Approved by: julian (mentor)
|
189771 |
13-Mar-2009 |
dchagin |
Implement new way of branding ELF binaries by looking to a ".note.ABI-tag" section.
The search order of a brand is changed, now first of all the ".note.ABI-tag" is looked through.
Move code which fetch osreldate for ELF binary to check_note() handler.
PR: 118473 Approved by: kib (mentor)
|
189106 |
27-Feb-2009 |
bz |
For all files including net/vnet.h directly include opt_route.h and net/route.h.
Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.
We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong.
This does not change the list of files which depend on opt_route.h but we can identify them now more easily.
|
188588 |
13-Feb-2009 |
jhb |
Use shared vnode locks when invoking VOP_READDIR().
MFC after: 1 month
|
187830 |
28-Jan-2009 |
ed |
Last step of splitting up minor and unit numbers: remove minor().
Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev().
We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check.
Ths commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now.
I suspect umajor() and uminor() isn't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.
|
186225 |
17-Dec-2008 |
kib |
Remove two remnant uses of AT_DEBUG.
|
185571 |
02-Dec-2008 |
bz |
Rather than using hidden includes (with cicular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files.
For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h.
Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
|
185169 |
22-Nov-2008 |
kib |
Add sv_flags field to struct sysentvec with intention to provide description of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features.
Discussed with: dchagin, imp, jhb, peter
|
183550 |
02-Oct-2008 |
zec |
Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit
Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs.
Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().
Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.).
All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*).
(*) netipsec/keysock.c did not validate depending on compile time options.
Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
183322 |
24-Sep-2008 |
kib |
Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc.
Explicitely initialize sysentvec.sv_maxssiz that was missed in most sysvecs.
No objection from: jhb MFC after: 1 month
|
183040 |
15-Sep-2008 |
ed |
Allow COMPAT_SVR4 to be built without COMPAT_43.
It seems we only depend on COMPAT_43 to implement the send() and recv() routines. We can easily implement them using sendto() and recvfrom(), just like we do inside our very own C library.
I wasn't able to really test it, apart from simple compilation testing. I've heard rumours that COMPAT_SVR4 is broken inside execve() anyway. It's still worth to fix this, because I suspect we'll get rid of COMPAT_43 somewhere in the future...
Reviewed by: rdivacky Discussed with: jhb
|
182371 |
28-Aug-2008 |
attilio |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
182145 |
25-Aug-2008 |
julian |
We left out V_static_len from ip_fw2.c (also a whitespace diff that i'd rahter fix her ethan break in the vimage branch.)
|
181803 |
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
180291 |
05-Jul-2008 |
rwatson |
Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates.
Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted.
MFC after: 3 weeks
|
178391 |
21-Apr-2008 |
rdivacky |
The vmspace->vm_daddr is constant until freed, there is no need to hold lock while accessing it.
Approved by: kib (mentor)
|
177997 |
08-Apr-2008 |
kib |
Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat.
Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho
|
177785 |
31-Mar-2008 |
kib |
Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9).
Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
|
177633 |
26-Mar-2008 |
dfr |
Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf.
Highlights include:
* Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts.
* Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation.
* Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux.
* Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket.
* Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock.
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers.
Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks
|
177314 |
17-Mar-2008 |
antoine |
Simplify fcntl(SVR4_F_DUP2FD) code now that FreeBSD has F_DUP2FD.
Approved by: rwatson (mentor)
|
177127 |
12-Mar-2008 |
jeff |
- The P_SA flag has been removed. Don't reference it in a KASSERT.
|
175294 |
13-Jan-2008 |
attilio |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
175202 |
10-Jan-2008 |
attilio |
vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
174988 |
30-Dec-2007 |
jeff |
Remove explicit locking of struct file. - Introduce a finit() which is used to initailize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them.
Tested by: kris, pho
|
173361 |
05-Nov-2007 |
kib |
Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicing or dereferencing NULL.
As consequence, vmspace_exec() and vmspace_unshare() returns the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper).
In collaboration with: Peter Holm Reviewed by: jhb
|
172930 |
24-Oct-2007 |
rwatson |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action> mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.
All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
170587 |
12-Jun-2007 |
rwatson |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp Obtained from: TrustedBSD Project
|
170472 |
09-Jun-2007 |
attilio |
rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way.
Reviewed by: jeff Approved by: jeff (mentor)
|
170466 |
09-Jun-2007 |
attilio |
The current rusage code show peculiar problems: - Unsafeness on ruadd() in thread_exit() - Unatomicity of thread_exiit() in the exit1() operations
This patch addresses these problems allocating p_fd as part of the process and modifying the way it is accessed.
A small chunk of this patch, resolves a race about p_state in kern_wait(), since we have to be sure about the zombif-ing process.
Submitted by: jeff Approved by: jeff (mentor)
|
170307 |
05-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170170 |
31-May-2007 |
attilio |
Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
169667 |
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
168355 |
04-Apr-2007 |
rwatson |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
- Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
- Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
- Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
- Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).
Tested by: kris Discussed with: jhb, kris, attilio, jeff
|
164199 |
11-Nov-2006 |
ru |
Regen.
Forgotten by: trhodes
|
164033 |
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
163606 |
22-Oct-2006 |
rwatson |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
161860 |
02-Sep-2006 |
rwatson |
Remove two hypothetical calls to suser() in ifdef'd (and uncompilable) svr4 code: this code would call centralized sysctl code that does these checks also.
MFC after: 1 week Obtained from: TrustedBSD Project Sponsored by: nCircle Network Security, Inc.
|
161330 |
15-Aug-2006 |
jhb |
Regen to propogate <prefix>_AUE_<mumble> changes as well as the earlier systrace changes.
|
161328 |
15-Aug-2006 |
jhb |
- Remove unused sysvec variables from various syscalls.conf. - Send the systrace_args files for all the compat ABIs to /dev/null for now. Right now makesyscalls.sh generates a file with a hardcoded function name, so it wouldn't work for any of the ABIs anyway. Probably the function name should be configurable via a 'systracename' variable and the functions should be stored in a function pointer in the sysvec structure.
|
161012 |
05-Aug-2006 |
rwatson |
With socket code no longer in svr4_stream.c, MAC includes are no longer required, so GC.
|
160980 |
04-Aug-2006 |
brooks |
Use TAILQ_EMPTY instead of checking if TAILQ_FIRST is NULL.
|
160799 |
28-Jul-2006 |
jhb |
Regen for MPSAFE flag removal.
|
160798 |
28-Jul-2006 |
jhb |
Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.
|
160795 |
28-Jul-2006 |
jhb |
Regen.
|
160794 |
28-Jul-2006 |
jhb |
- Explicitly lock Giant to protect the fields in the svr4_strm structure except for s_family (which is read-only once after it is set when the structure is created). - Mark svr4_sys_ioctl(), svr4_sys_getmsg(), and svr4_sys_putmsg() MPSAFE.
|
160765 |
27-Jul-2006 |
jhb |
Fix a file descriptor race I reintroduced when I split accept1() up into kern_accept() and accept1(). If another thread closed the new file descriptor and the first thread later got an error trying to copyout the socket address, then it would attempt to close the wrong file object. To fix, add a struct file ** argument to kern_accept(). If it is non-NULL, then on success kern_accept() will store a pointer to the new file object there and not release any of the references. It is up to the calling code to drop the references appropriately (including a call to fdclose() in case of error to safely handle the aforementioned race). While I'm at it, go ahead and fix the svr4 streams code to not leak the accept fd if it gets an error trying to copyout the streams structures.
|
160559 |
21-Jul-2006 |
jhb |
Regen.
|
160558 |
21-Jul-2006 |
jhb |
Clean up the svr4 socket cache and streams code some to make it more easily locked. - Move all the svr4 socket cache code into svr4_socket.c, specifically move svr4_delete_socket() over from streams.c. Make the socket cache entry structure and svr4_head private to svr4_socket.c as a result. - Add a mutex to protect the svr4 socket cache. - Change svr4_find_socket() to copy the sockaddr_un struct into a caller-supplied sockaddr_un rather than giving the caller a pointer to our internal one. This removes the one case where code outside of svr4_socket.c could access data in the cache. - Add an eventhandler for process_exit and process_exec to purge the cache of any entries for the exiting or execing process. - Add methods to init and destroy the socket cache and call them from the svr4 ABI module's event handler. - Conditionally grab Giant around socreate() in streamsopen(). - Use fdclose() instead of inlining it in streamsopen() when handling socreate() failure. - Only allocate a stream structure and attach it to a socket in streamsopen(). Previously, if a svr4 program performed a stream operation on an arbitrary socket not opened via the streams device, we would attach streams state data to it and change f_ops of the associated struct file while it was in use. The latter was especially not safe, and if a program wants a stream object it should open it via the streams device anyway. - Don't bother locking so_emuldata in the streams code now that we only touch it right after creating a socket (in streamsopen()) or when tearing it down when the file is closed. - Remove D_NEEDGIANT from the streams device as it is no longer needed.
|
160557 |
21-Jul-2006 |
jhb |
Add conditional VFS Giant locking to svr4_sys_fchroot() and mark it MPSAFE. Also, call change_dir() instead of doing part of it inline (this now adds a mac_check_vnode_chdir() call) to match fchdir() and call mac_check_vnode_chroot() to match chroot(). Also, use the change_root() function to do the actual change root to match chroot().
Reviewed by: rwatson
|
160512 |
19-Jul-2006 |
jhb |
Regen.
|
160511 |
19-Jul-2006 |
jhb |
Add conditional VFS Giant locking to svr4_sys_resolvepath() and mark it MPSAFE.
|
160510 |
19-Jul-2006 |
jhb |
Make svr4_sys_waitsys() a lot less ugly and mark it MPSAFE. - If the WNOWAIT flag isn't specified and either of WEXITED or WTRAPPED is set, then just call kern_wait() and let it do all the work. This means that this function no longer has to duplicate the work to teardown zombies that is done in kern_wait(). Instead, if the above conditions aren't true, then it uses a simpler loop to implement WNOWAIT and/or tracing for only stopped or continued processes. This function still has to duplicate code from kern_wait() for the latter two cases, but those are much simpler. - Sync the code to handle the WCONTINUED and WSTOPPED cases with the equivalent code in kern_wait(). - Fix several places that would return with the proctree lock still held. - Lock the current process to prevent lost wakeup races when blocking.
|
160504 |
19-Jul-2006 |
jhb |
Initialize svr4_head during MOD_LOAD rather than on demand.
|
160277 |
11-Jul-2006 |
jhb |
Regen.
|
160276 |
11-Jul-2006 |
jhb |
- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.
|
160249 |
10-Jul-2006 |
jhb |
- Split out kern_accept(), kern_getpeername(), and kern_getsockname() for use by ABI emulators. - Alter the interface of kern_recvit() somewhat. Specifically, go ahead and hard code UIO_USERSPACE in the uio as that's what all the callers specify. In place, add a new uioseg to indicate what type of pointer is in mp->msg_name. Previously it was always a userland address, but ABI emulators may pass in kernel-side sockaddrs. Also, remove the namelenp field and instead require the two places that used it to explicitly copy mp->msg_namelen out to userland. - Use the patched kern_recvit() to replace svr4_recvit() and the stock kern_sendit() to replace svr4_sendit(). - Use kern_bind() instead of stackgap use in ti_bind(). - Use kern_getpeername() and kern_getsockname() instead of stackgap in svr4_stream_ti_ioctl(). - Use kern_connect() instead of stackgap in svr4_do_putmsg(). - Use kern_getpeername() and kern_accept() instead of stackgap in svr4_do_getmsg(). - Retire the stackgap from SVR4 compat as it is no longer used.
|
160187 |
08-Jul-2006 |
jhb |
Rework kern_semctl a bit to always assume the UIO_SYSSPACE case. This mostly consists of pushing a few copyin's and copyout's up into __semctl() as all the other callers were already doing the UIO_SYSSPACE case. This also changes kern_semctl() to set the return value in a passed in pointer to a register_t rather than td->td_retval[0] directly so that callers can only set td->td_retval[0] if all the various copyout's succeed.
As a result of these changes, kern_semctl() no longer does copyin/copyout (except for GETALL/SETALL) so simplify the locking to acquire the semakptr mutex before the MAC check and hold it all the way until the end of the big switch statement. The GETALL/SETALL cases have to temporarily drop it while they do copyin/malloc and copyout. Also, simplify the SETALL case to remove handling for a non-existent race condition.
|
160141 |
06-Jul-2006 |
jhb |
Don't try to copyin extra data for IPC_RMID requests to msgctl() or shmctl(). None of the other ABI's do this (including the native FreeBSD ABI), and uselessly trying to do a copyin() can actually result in a bogus EFAULT if the a process specifies NULL for the optional argument (which is what they should do in this case).
|
160063 |
01-Jul-2006 |
markm |
Housekeeping. Update for maintainers who have handed in their commit bits or (in my case) no longer feel that oversight is necessary.
|
159994 |
27-Jun-2006 |
jhb |
Regen.
|
159993 |
27-Jun-2006 |
jhb |
Use kern_shmctl() in svr4_sys_shmctl() and drop use of the stackgap. Mark svr4_sys_shmctl() MPSAFE.
|
159991 |
27-Jun-2006 |
jhb |
- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.
|
159961 |
26-Jun-2006 |
jhb |
Regen.
|
159960 |
26-Jun-2006 |
jhb |
Change svr4_sys_break() to just call obreak() and mark it MPSAFE.
Not objected to by: alc
|
157369 |
01-Apr-2006 |
rwatson |
Annotate uses of fgetsock() with indications that they should rely on their existing file descriptor references to sockets, rather than use fgetsock() to retrieve a direct socket reference.
MFC after: 3 months
|
155402 |
06-Feb-2006 |
jhb |
- Always call exec_free_args() in kern_execve() instead of doing it in all the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.
|
151463 |
19-Oct-2005 |
davidxu |
Fix compiling problem by adding prefix name svr4 to si_xxx macro, the si_xxx macro should not be used in compat headers, as these are standard member names or only can be used in our native header file signal.h.
|
151316 |
14-Oct-2005 |
davidxu |
1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed.
6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64
|
150663 |
28-Sep-2005 |
rwatson |
Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:
Now that Giant is acquired in uprintf() and tprintf(), the caller no longer leads to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant.
This aligns this code more with the eventual locking of ttys.
Suggested by: bde
|
150335 |
19-Sep-2005 |
rwatson |
Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment).
Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout.
With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant.
NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable.
NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change.
NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having non-MPSAFE tty code.
MFC after: 1 week
|
148887 |
09-Aug-2005 |
rwatson |
Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field.
Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so.
Reviewed by: pjd, bz MFC after: 7 days
|
148541 |
29-Jul-2005 |
jhb |
Add missing dependencies on the SYSVIPC modules.
|
147975 |
13-Jul-2005 |
jhb |
Regen.
|
147974 |
13-Jul-2005 |
jhb |
Make a pass through all the compat ABIs sychronizing the MP safe flags with the master syscall table as well as marking several ABI wrapper functions safe.
MFC after: 1 week
|
147966 |
13-Jul-2005 |
jhb |
Regen.
|
147965 |
13-Jul-2005 |
jhb |
- Stop hardcoding #define's for options and use the appropriate opt_foo.h headers instead. - Hook up the IPC SVR4 syscalls.
MFC after: 3 days
|
147819 |
07-Jul-2005 |
jhb |
Lock Giant in svr4_add_socket() so that the various svr4_*stat() calls can be marked MP safe as this is the only part of them that is not already MP safe.
Approved by: re (scottl)
|
147818 |
07-Jul-2005 |
jhb |
Remove an unused syscallarg() macro leftover from this code's origins in NetBSD.
Approved by: re (scottl)
|
147817 |
07-Jul-2005 |
jhb |
Rototill this file so that it actually compiles. It doesn't do anything in the build still due to some #undef's in svr4.h, but if you hack around that and add some missing entries to syscalls.master, then this file will now compile. The changes involved proc -> thread, using FreeBSD syscall names instead of NetBSD, and axeing syscallarg() and retval arguments.
Approved by: re (scottl)
|
146807 |
30-May-2005 |
rwatson |
Rebuild generated system call definition files following the addition of the audit event field to the syscalls.master file format.
Submitted by: wsalamon Obtained from: TrustedBSD Project
|
146806 |
30-May-2005 |
rwatson |
Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc.
Submitted by: wsalamon Obtained from: TrustedBSD Project
|
144501 |
01-Apr-2005 |
jhb |
- Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument. - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does. - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from.
Reviewed by: alc, arch@
|
144014 |
23-Mar-2005 |
das |
Bounds check the user-supplied length used in a copyout() in svr4_do_getmsg(). In principle this bug could disclose data from kernel memory, but in practice, the SVR4 emulation layer is probably not functional enough to cause the relevant code path to be executed. In any case, the emulator has been disconnected from the build since 5.0-RELEASE.
Found by: Coverity Prevent analysis tool
|
142500 |
25-Feb-2005 |
sam |
fixup signal mapping: o change the mapping arrays to have a zero offset rather than base 1; this eliminates lots of signo adjustments and brings the code back inline with the original netbsd code o purge use of SVR4_SIGTBLZ; SVR4_NSIG is the only definition for how big a mapping array is o change the mapping loops to explicitly ignore signal 0 o purge some bogus code from bsd_to_svr4_sigset o adjust svr4_sysentvec to deal with the mapping table change
Enticed into fixing by: Coverity Prevent analysis tool Glanced at by: marcel, jhb
|
141815 |
13-Feb-2005 |
sobomax |
Backout previous change (disabling of security checks for signals delivered in emulation layers), since it appears to be too broad.
Requested by: rwatson
|
141812 |
13-Feb-2005 |
sobomax |
Split out kill(2) syscall service routine into user-level and kernel part, the former is callable from user space and the latter from the kernel one. Make kernel version take additional argument which tells if the respective call should check for additional restrictions for sending signals to suid/sugid applications or not.
Make all emulation layers using non-checked version, since signal numbers in emulation layers can have different meaning that in native mode and such protection can cause misbehaviour.
As a result remove LIBTHR from the signals allowed to be delivered to a suid/sugid application.
Requested (sorta) by: rwatson MFC after: 2 weeks
|
141486 |
07-Feb-2005 |
jhb |
- Implement svr4_emul_find() using kern_alternate_path(). This changes the semantics in that the returned filename to use is now a kernel pointer rather than a user space pointer. This required changing the arguments to the CHECKALT*() macros some and changing the various system calls that used pathnames to use the kern_foo() functions that can accept kernel space filename pointers instead of calling the system call directly. - Use kern_open(), kern_access(), kern_msgctl(), kern_execve(), kern_mkfifo(), kern_mknod(), kern_statfs(), kern_fstatfs(), kern_setitimer(), kern_stat(), kern_lstat(), kern_fstat(), kern_utimes(), kern_pathconf(), and kern_unlink().
|
140992 |
29-Jan-2005 |
sobomax |
o Split out kernel part of execve(2) syscall into two parts: one that copies arguments into the kernel space and one that operates completely in the kernel space;
o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386.
Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks
|
139743 |
05-Jan-2005 |
imp |
Start each of the license/copyright comments with /*-
|
139739 |
05-Jan-2005 |
jhb |
- Move the function prototypes for kern_setrlimit() and kern_wait() to sys/syscallsubr.h where all the other kern_foo() prototypes live. - Resort kern_execve() while I'm there.
|
138129 |
27-Nov-2004 |
das |
Don't include sys/user.h merely for its side-effect of recursively including other headers.
|
137647 |
13-Nov-2004 |
phk |
Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.
Use this in all the places where sleeping with the lock held is not an issue.
The distinction will become significant once we finalize the exact lock-type to use for this kind of case.
|
137339 |
07-Nov-2004 |
phk |
More sensible FILEDESC_ locking.
|
136152 |
05-Oct-2004 |
jhb |
Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand.
- Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. - calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console.
Submitted by: bde (mostly) MFC after: 1 month
|
135762 |
24-Sep-2004 |
jhb |
Add a proc *p pointer for td->td_proc to make this code easier to read.
|
135573 |
22-Sep-2004 |
jhb |
Various small style fixes.
|
134267 |
24-Aug-2004 |
jhb |
Regenerate after fcntl() wrappers were marked MP safe.
|
134266 |
24-Aug-2004 |
jhb |
Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl() directly. This removes a few more users of the stackgap and also marks the syscalls using these wrappers MP safe where appropriate.
Tested on: i386 with linux acroread5 Compiled on: i386, alpha LINT
|
132199 |
15-Jul-2004 |
phk |
Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events.
A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
|
131897 |
10-Jul-2004 |
phk |
Clean up and wash struct iovec and struct uio handling.
Add copyiniov() which copies a struct iovec array in from userland into a malloc'ed struct iovec. Caller frees.
Change uiofromiov() to malloc the uio (caller frees) and name it copyinuio() which is more appropriate.
Add cloneuio() which returns a malloc'ed copy. Caller frees.
Use them throughout.
|
131013 |
24-Jun-2004 |
obrien |
Cast variable-sized (based on platform) quantities before printing out.
|
130892 |
21-Jun-2004 |
phk |
Put the pre FreeBSD-2.x tty compat code under BURN_BRIDGES.
|
130640 |
17-Jun-2004 |
phk |
Second half of the dev_t cleanup.
The big lines are: NODEV -> NULL NOUDEV -> NODEV udev_t -> dev_t udev2dev() -> findcdev()
Various minor adjustments including handling of userland access to kernel space struct cdev etc.
|
130398 |
13-Jun-2004 |
rwatson |
Socket MAC labels so_label and so_peerlabel are now protected by SOCK_LOCK(so):
- Hold socket lock over calls to MAC entry points reading or manipulating socket labels.
- Assert socket lock in MAC entry point implementations.
- When externalizing the socket label, first make a thread-local copy while holding the socket lock, then release the socket lock to externalize to userspace.
|
127140 |
17-Mar-2004 |
jhb |
- Replace wait1() with a kern_wait() function that accepts the pid, options, status pointer and rusage pointer as arguments. It is up to the caller to copyout the status and rusage to userland if needed. This lets us axe the 'compat' argument and hide all that functionality in owait(), by the way. This also cleans up some locking in kern_wait() since it no longer has to drop locks around copyout() since all the copyout()'s are deferred. - Convert owait(), wait4(), and the various ABI compat wait() syscalls to use kern_wait() rather than wait1() or wait4(). This removes a bit more stackgap usage.
Tested on: i386 Compiled on: i386, alpha, amd64
|
125575 |
07-Feb-2004 |
phk |
I guess nobody has needed to use the SVR4olator to create device nodes, or if they did, they're now locked away on the Kurt Gdel memorial home for the numerically confused:
Don't cast a kernel pointer (from makedev(9)) to an integer (maj+minor combo).
|
125529 |
06-Feb-2004 |
jhb |
Regen.
|
125527 |
06-Feb-2004 |
jhb |
Sync up MP safe flags with global syscalls.master. This includes read(), write(), close(), getpid(), setuid(), getuid(), svr4_sys_pause(), svr4_sys_nice(), svr4_sys_kill(), svr4_sys_pgrpsys(), dup(), pipe(), setgid(), getgid(), svr4_sys_signal(), umask(), getgroups(), setgroups(), svr4_sys_sigprocmask(), svr4_sys_sigsuspend(), svr4_sys_sigaltstack(), svr4_sys_sigaction(), svr4_sys_sigpending(), mprotect(), munmap(), setegid(), seteuid(), setreuid(), setregid().
|
125457 |
04-Feb-2004 |
jhb |
Regen.
|
125455 |
04-Feb-2004 |
jhb |
The following compat syscalls are now mpsafe: linux_getrlimit(), linux_setrlimit(), linux_old_getrlimit(), osf1_getrlimit(), osf1_setrlimit(), svr4_sys_ulimit(), svr4_sys_setrlimit(), svr4_sys_getrlimit(), svr4_sys_setrlimit64(), svr4_sys_getrlimit64(), ibcs2_sysconf(), and ibcs2_ulimit().
|
125454 |
04-Feb-2004 |
jhb |
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
|
124802 |
21-Jan-2004 |
rwatson |
Reduce gratuitous includes: don't include jail.h if it's not needed. Presumably, at some point, you had to include jail.h if you included proc.h, but that is no longer required.
Result of: self injury involving adding something to struct prison
|
123790 |
24-Dec-2003 |
peter |
GC unused 'syshide' override to /dev/null. This was here to disable the output of the namespc column. Its functionality was removed some time ago, but the overrides and the namespc column remained.
|
123784 |
24-Dec-2003 |
peter |
Regen. This should have been a NOP, but its not been regenerated for ages and is missing the changes from the last few makesyscalls.sh revisions.
|
123783 |
24-Dec-2003 |
peter |
GC OBE namespc column and un-wrap longer lines that now fit
|
123742 |
23-Dec-2003 |
peter |
Add an additional field to the elf brandinfo structure to support quicker exec-time replacement of the elf interpreter on an emulation environment where an entire /compat/* tree isn't really warranted.
|
122892 |
19-Nov-2003 |
kan |
Do not call VOP_GETATTR in getdents function. It does not serve any purpose and the resulting vattr structure was ignored. In addition, the VOP_GETATTR call was made with no vnode lock held, resulting in vnode locking violation panic with debug kernels.
Reported by: truckman
Approved by: re@ (rwatson)
|
121275 |
20-Oct-2003 |
tjr |
Fix some security bugs in the SVR4 emulator: - Return NULL instead of returning memory outside of the stackgap in stackgap_alloc() (FreeBSD-SA-00:42.linux) - Check for stackgap_alloc() returning NULL in svr4_emul_find(), and clean_pipe(). - Avoid integer overflow on large nfds argument in svr4_sys_poll() - Reject negative nbytes argument in svr4_sys_getdents() - Don't copy out past the end of the struct componentname pathname buffer in svr4_sys_resolvepath() - Reject out-of-range signal numbers in svr4_sys_sigaction(), svr4_sys_signal(), and svr4_sys_kill(). - Don't malloc() user-specified lengths in show_ioc() and show_strbuf(), place arbitrary limits instead. - Range-check lengths in si_listen(), ti_getinfo(), ti_bind(), svr4_do_putmsg(), svr4_do_getmsg(), svr4_stream_ti_ioctl().
Some fixes obtain from OpenBSD.
|
120422 |
25-Sep-2003 |
peter |
Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process.
Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.
Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 heirarchy.
Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced.
Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does.
Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
|
118560 |
06-Aug-2003 |
phk |
Remove dangling extern reference to swap_pager_full
|
116678 |
22-Jun-2003 |
phk |
Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for stuff like the f*() functions.
By giving the vnode a speparate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use.
At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.
|
116361 |
15-Jun-2003 |
davidxu |
Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
|
116174 |
10-Jun-2003 |
obrien |
Use __FBSDID().
|
115550 |
31-May-2003 |
phk |
Put definition of struct svr4_sockcache_entry in a .h file rather than having two independent definitions in two .c files. Fiddle surrounding details to match.
Found by: FlexeLint
|
114983 |
13-May-2003 |
jhb |
- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe.
Reviewed by: arch@ Approved by: re (rwatson)
|
114934 |
12-May-2003 |
phk |
Don't #define memset() to bzero(), it is far too prone to bite somebody.
Approved by: re/scottl
|
113859 |
22-Apr-2003 |
jhb |
- Replace inline implementations of sigprocmask() with calls to kern_sigprocmask() in the various binary compatibility emulators. - Replace calls to sigsuspend(), sigaltstack(), sigaction(), and sigprocmask() that used the stackgap with calls to the corresponding kern_sig*() functions instead without using the stackgap.
|
113616 |
17-Apr-2003 |
jhb |
The proc lock is sufficient to test p_state against PRS_ZOMBIE, so don't needlessly lock sched_lock.
|
113613 |
17-Apr-2003 |
jhb |
Use local struct proc variables to reduce repeated td->td_proc dereferences and improve readability.
|
112888 |
31-Mar-2003 |
jeff |
- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c - signotify() now operates on a thread since unmasked pending signals are stored in the thread. - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
|
112470 |
21-Mar-2003 |
jhb |
Sync up linux and svr compat elf fixup functions for exec(). These functions are now all basically identical except that alpha linux uses Elf64 arguments and svr4 and i386 linux use Elf32. The fixups include changing the first argument to be a register_t ** to match the prototype for fixup functions, asserting that the process in the image_params struct is always curproc and removing unnecessary locking to read credentials as a result, and a few style fixes.
|
111748 |
02-Mar-2003 |
des |
More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9).
|
111119 |
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
110232 |
02-Feb-2003 |
alfred |
Consolidate MIN/MAX macros into one place (param.h).
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
|
109623 |
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
109254 |
14-Jan-2003 |
dillon |
Add missing #include
Submitted by: "Sam Leffler" <sam@errno.com>
|
109203 |
13-Jan-2003 |
dillon |
Apply bandaid to bring svr4_sys_waitsys() in line with exit1(). This routine really need to be gutted and merged with exit1().
Reviewed by: jhb
|
109153 |
13-Jan-2003 |
dillon |
Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.
|
109123 |
12-Jan-2003 |
dillon |
Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it.
Change struct xfile xf_data to xun_data (ABI is still compatible).
If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.
|
108172 |
22-Dec-2002 |
hsu |
SMP locking for ifnet list.
|
107849 |
14-Dec-2002 |
alfred |
SCARGS removal take II.
|
107839 |
13-Dec-2002 |
alfred |
Backout removal SCARGS, the code freeze is only "selectively" over.
|
107838 |
13-Dec-2002 |
alfred |
Remove SCARGS.
Reviewed by: md5
|
105360 |
17-Oct-2002 |
robert |
Replace the conventional usage of strncpy() by using strlcpy().
|
104571 |
06-Oct-2002 |
rwatson |
Integrate mac_check_socket_send() and mac_check_socket_receive() checks from the MAC tree: allow policies to perform access control for the ability of a process to send and receive data via a socket. At some point, we might also pass in additional address information if an explicit address is requested on send.
Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
104306 |
01-Oct-2002 |
jmallett |
Back our kernel support for reliable signal queues.
Requested by: rwatson, phk, and many others
|
104233 |
30-Sep-2002 |
jmallett |
First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs.
After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such.
CODAFS is unfinished in this regard because the logic is unclear in some places.
Sponsored by: New Gold Technology Reviewed by: bde, tjr, jake [an older version, logic similar]
|
103886 |
24-Sep-2002 |
mini |
Back out last commit. Linux uses the old 4.3BSD sockaddr format.
|
103873 |
23-Sep-2002 |
jhb |
Ok, make this compile for real this time. recvfrom_args doesn't have a fromlen member, instead it has a fromlenaddr pointer member. Set it to NULL.
|
103872 |
23-Sep-2002 |
jhb |
Use correct variable name so that previous commit actually compiles.
|
103839 |
23-Sep-2002 |
mini |
Don't use compatability syscall wrappers in emulation code. This is needed for the COMPAT_FREEBSD3 option split.
Reviewed by: alfred, jake
|
103767 |
21-Sep-2002 |
jake |
Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
|
102808 |
01-Sep-2002 |
jake |
Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections to sysentvec. Initialized all fields of all sysentvecs, which will allow them to be used instead of constants in more places. Provided stack fixup routines for emulations that previously used the default.
|
102727 |
31-Aug-2002 |
jake |
Make this compile.
|
102630 |
30-Aug-2002 |
dillon |
Implement data, text, and vmem limit checking in the elf loader and svr4 compat code. Clean up accounting for multiple segments. Part 1/2.
Submitted by: Andrey Alekseyev <uitm@zenon.net> (with some modifications) MFC after: 3 days
|
102003 |
17-Aug-2002 |
rwatson |
In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation.
- Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred. - In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's. - Update kqueue_ioctl() for its new argument. - Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization. - Update soo_ioctl() for its new argument. - Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR().
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101924 |
15-Aug-2002 |
rwatson |
On MAC check failure for readdir, use 'goto out' to use the common exit handling, rather than returning directly to prevent leaking of vnode reference/lock.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101846 |
13-Aug-2002 |
jeff |
- Add the missing td argument to vn_lock that I missed in my last commit.
|
101771 |
13-Aug-2002 |
jeff |
- Hold the vnode lock throughout execve. - Set VV_TEXT in the top level execve code. - Fixup the image activators to deal with the newly locked vnode.
|
101709 |
12-Aug-2002 |
rwatson |
Enforce MAC policies for the locally implemented vnode services in SVR4 emulation relating to readdir() and fd_revoke(). All other services appear to be implemented by simply wrapping existing FreeBSD native system call implementations, so don't require local instrumentation in the emulator module.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101308 |
04-Aug-2002 |
jeff |
- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking.
Idea stolen from: BSD/OS
|
100384 |
20-Jul-2002 |
peter |
Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64. The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages.
Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable.
Obtained from: dfr (mostly, many tweaks from me).
|
99669 |
09-Jul-2002 |
robert |
The comment marked with XXX was right: emulate SVR4 for ELF binaries branded with ELFOSABI_SYSV, this is reported to work and brandelf(1) puts this type into files if "SVR4" was specified.
|
99072 |
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
98124 |
11-Jun-2002 |
alfred |
catch up with ktrace changes, KTRPOINT takes a 'struct thread' not 'struct proc' now.
|
97994 |
07-Jun-2002 |
jhb |
Catch up to changes in ktrace API.
|
97748 |
02-Jun-2002 |
schweikh |
Fix typo in the BSD copyright: s/withough/without/
Spotted and suggested by: des MFC after: 3 weeks
|
97658 |
31-May-2002 |
tanimura |
Back out my lats commit of locking down a socket, it conflicts with hsu's work.
Requested by: hsu
|
97555 |
30-May-2002 |
alfred |
correct commented out preprocessor test for i386 to __i386__
|
97271 |
25-May-2002 |
bde |
Fixed a printf format error. It was old and should have been detected by gcc-2.9x, but somehow wasn't fixed already.
|
96972 |
20-May-2002 |
tanimura |
Lock down a socket, milestone 1.
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket.
o Determine the lock strategy for each members in struct socket.
o Lock down the following members:
- so_count - so_options - so_linger - so_state
o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket:
- sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup()
Reviewed by: alfred
|
94858 |
16-Apr-2002 |
jhb |
- Lock proctree_lock instead of pgrpsess_lock. - Exclusively lock proctree_lock while calling leavepgrp().
|
94455 |
11-Apr-2002 |
jhb |
Use proc lock to protect p_ucred pointer while we deference it to read a few values.
|
93793 |
04-Apr-2002 |
bde |
Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask.
Avoid locking in userret() in most of the remaining cases.
Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon
|
93593 |
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
93295 |
27-Mar-2002 |
alfred |
Make the reference counting of 'struct pargs' SMP safe.
There is still some locations where the PROC lock should be held in order to prevent inconsistent views from outside (like the proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be fixed later.
Submitted by: Jonathan Mini <mini@haikugeek.com>
|
92787 |
20-Mar-2002 |
jeff |
Remove references to vm_zone.h and switch over to the new uma API.
|
92761 |
20-Mar-2002 |
alfred |
Remove __P.
|
91406 |
27-Feb-2002 |
jhb |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
91392 |
27-Feb-2002 |
robert |
Use the updated getcredhostname() function.
|
91386 |
27-Feb-2002 |
robert |
- Use the new getcredhostname function in the SVR4 uname system call. - Remove spurious empty line.
Reviewed by: phk
|
91140 |
23-Feb-2002 |
tanimura |
Lock struct pgrp, session and sigio.
New locks are:
- pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members.
Please refer to sys/proc.h for the coverage of these locks.
Changes on the pgrp/session interface:
- pgfind() needs the pgrpsess_lock held.
- The caller of enterpgrp() is responsible to allocate a new pgrp and session.
- Call enterthispgrp() in order to enter an existing pgrp.
- pgsignal() requires a pgrp lock held.
Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)
|
90002 |
30-Jan-2002 |
alfred |
include sys/lock.h and sys/mutex.h to make compile.
Noticed by: Vincent Poy <vince@oahu.WURLDLINK.NET>
|
89545 |
19-Jan-2002 |
tanimura |
Lock the caller process if the pid passed to getsid() or getpgid() equals to zero.
|
89541 |
19-Jan-2002 |
tanimura |
For getsid(), return the sid stored in struct session. This prevents panic in case where a session has no session leader.
Inspired by: Solaris 8
|
89536 |
19-Jan-2002 |
alfred |
Make compile, remove extra fdrop() calls. Change name of function to what it's supposed to be (s/sys/do)
|
89535 |
19-Jan-2002 |
alfred |
make compile, add missing { and variable declaration.
|
89534 |
19-Jan-2002 |
alfred |
Semi-backout previous fgetvp change, we need the struct file pointer to perform relative offset calculations, so use fget instead.
|
89411 |
16-Jan-2002 |
alfred |
fix typo, there's uap, just fd
|
89319 |
14-Jan-2002 |
alfred |
Replace ffind_* with fget calls.
Make fget MPsafe.
Make fgetvp and fgetsock use the fget subsystem to reduce code bloat.
Push giant down in fpathconf().
|
89308 |
13-Jan-2002 |
alfred |
Some of the KSE stuff was accidentally reverted by file locking, fix it.
Pointed out by: jhb
|
89306 |
13-Jan-2002 |
alfred |
SMP Lock struct file, filedesc and the global file list.
Seigo Tanimura (tanimura) posted the initial delta.
I've polished it quite a bit reducing the need for locking and adapting it for KSE.
Locks:
1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked.
1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex.
1 sx lock for the global filelist.
struct file * fhold(struct file *fp); /* increments reference count on a file */
struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */
struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */
struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */
I still have to smp-safe the fget cruft, I'll get to that asap.
|
86487 |
17-Nov-2001 |
dillon |
Give struct socket structures a ref counting interface similar to vnodes. This will hopefully serve as a base from which we can expand the MP code. We currently do not attempt to obtain any mutex or SX locks, but the door is open to add them when we nail down exactly how that part of it is going to work.
|
84811 |
11-Oct-2001 |
jhb |
Add missing includes of sys/lock.h.
|
84783 |
10-Oct-2001 |
ps |
Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable.
Reviewed by: peter MFC after: 2 weeks
|
83418 |
13-Sep-2001 |
julian |
Fix typo. noticed by: jhb
|
83366 |
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
82753 |
01-Sep-2001 |
dillon |
Synchronize syscalls.master(s) with recent Giant pushdown work
|
80114 |
22-Jul-2001 |
assar |
get rid of some printf and pointer type warnings
|
77183 |
25-May-2001 |
rwatson |
o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account.
Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
|
76166 |
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
75893 |
24-Apr-2001 |
jhb |
Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning.
Reviewed by: -smp, dfr, jake
|
74940 |
28-Mar-2001 |
jhb |
Add missing includes of <sys/sx.h>
Reported by: peter
|
74927 |
28-Mar-2001 |
jhb |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
73929 |
07-Mar-2001 |
jhb |
Grab the process lock while calling psignal and before calling psignal.
|
73907 |
07-Mar-2001 |
jhb |
- Hold both an exclusive proctree lock and the proc lock when reparenting a traced process during exit. - Lock the parent process while sending it SIGCHLD.
|
72999 |
24-Feb-2001 |
obrien |
MFS: bring the consistent `compat_3_brand' support into -CURRENT (the work was first done in the RELENG_4 branch near a release during a MFC to make the code cleaner and more consistent)
|
72786 |
21-Feb-2001 |
rwatson |
o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritence to be a function of credential inheritence. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use.
Notes:
o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure.
Reviewed by: freebsd-arch Obtained from: TrustedBSD Project
|
72200 |
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
72091 |
06-Feb-2001 |
asmodai |
Fix typo: seperate -> separate.
Seperate does not exist in the english language.
|
72082 |
06-Feb-2001 |
asmodai |
Fix typo: wierd -> weird.
There is no such thing as wierd in the english language.
|
71699 |
27-Jan-2001 |
jhb |
Back out proc locking to protect p_ucred for obtaining additional references along with the actual obtaining of additional references.
|
71697 |
26-Jan-2001 |
jhb |
- Back out over-aggressive locking of p->p_cred. - Back out locking ucred's and bumping refcounts for vnode operations.
|
71456 |
23-Jan-2001 |
jhb |
Argh, atomic_store_rel -> atomic_store_rel_int.
|
71455 |
23-Jan-2001 |
jhb |
Woops, add in missing headers.
|
71454 |
23-Jan-2001 |
jhb |
Proc locking.
|
71453 |
23-Jan-2001 |
jhb |
Use queue macros.
|
71452 |
23-Jan-2001 |
jhb |
- Add proc locking. - Fix several bugs in the wait syscall, including freeing the actual proc start, freeing the args, freeing the prison, and other minor nits. - Use appropriate queue(3) macros. - Use zpfind() instead of walking zombproc ourselves.
|
71449 |
23-Jan-2001 |
jhb |
- Use proper atomic operations to make the run time initialization controlled by svr_str_initialized be MP safe.
|
71447 |
23-Jan-2001 |
jhb |
FreeBSD doesn't have p_emuldata, and our stackgap_init() doesn't take an argument.
|
71286 |
20-Jan-2001 |
wollman |
Finish deprecating <sys/select.h> in favor of <sys/selinfo.h> in kernel code.
|
70835 |
09-Jan-2001 |
green |
Take 10 seconds to actually fix the chgproccnt rather than just make it explicitly error. If the module is horribly broken, it should be temporarily removed from src/sys/modules.
|
70830 |
09-Jan-2001 |
wollman |
With some trepidation, add a `#error' directive to this module. It was broken and not fixed by whoever changed the interface of chgproccnt(); in the state it is in it could not possibly work (dereferencing an integer).
|
70317 |
23-Dec-2000 |
jake |
Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock.
linprocfs not locked pending response from informal maintainer.
Reviewed by: jhb, -smp@
|
69947 |
13-Dec-2000 |
jake |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
|
69541 |
03-Dec-2000 |
marcel |
Include machine/cpu.h for cpu_getstack().
Spotted by: jake
|
69379 |
30-Nov-2000 |
marcel |
Don't use p->p_sigstk.ss_flags to keep state of whether the process is on the alternate stack or not. For compatibility with sigstack(2) state is being updated if such is needed.
We now determine whether the process is on the alternate stack by looking at its stack pointer. This allows a process to siglongjmp from a signal handler on the alternate stack to the place of the sigsetjmp on the normal stack. When maintaining state, this would have invalidated the state information and causing a subsequent signal to be delivered on the normal stack instead of the alternate stack.
PR: 22286
|
69084 |
23-Nov-2000 |
dillon |
Forgot to patch this file in file descriptor race fix commit
Submitted-by: "Danny J. Zerkel" <dzerkel@columbus.rr.com>
|
68520 |
09-Nov-2000 |
marcel |
Make MINSIGSTKSZ machine dependent, and have the sigaltstack syscall compare against a variable sv_minsigstksz in struct sysentvec as to properly take the size of the machine- and ABI dependent struct sigframe into account.
The SVR4 and iBCS2 modules continue to have a minsigstksz of 8192 to preserve behavior. The real values (if different) are not known at this time. Other ABI modules use the real values.
The native MINSIGSTKSZ is now defined as follows:
Arch MINSIGSTKSZ ---- ----------- alpha 4096 i386 2048 ia64 12288
Reviewed by: mjacob Suggested by: bde
|
68157 |
01-Nov-2000 |
obrien |
Make the target a little bit more generic.
|
65302 |
31-Aug-2000 |
obrien |
Cleanup after repo copy of sys/svr4 to sys/compat/svr4.
|
64002 |
29-Jul-2000 |
peter |
Regen. (Fix SYS_exit)
|
64001 |
29-Jul-2000 |
peter |
Sigh. Fix SYS_exit problems. I misunderstood the significance of these trailing options.
|
63987 |
29-Jul-2000 |
peter |
Regenerate with makesyscalls.sh
|
63986 |
29-Jul-2000 |
peter |
Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping gcc's internal exit() prototypes and the (futile) hackery that we did to try and avoid warnings. main() was renamed for similar reasons. Remove an exit related hack from makesyscalls.sh.
|
62976 |
11-Jul-2000 |
mckusick |
Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.
Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
|
62378 |
02-Jul-2000 |
green |
Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio * instead of a struct iovec * array and int len. Get rid of stupidly trying to allocate all of the memory and copyin()ing the entire iovec[], and instead just do the proper VOP_WRITE() in ktrwrite() using a copy of the struct uio that the syscall originally used.
This solves the DoS which could easily be performed; to work around the DoS, one could also remove "options KTRACE" from the kernel. This is a very strong MFC candidate for 4.1.
Found by: art@OpenBSD.org
|
61976 |
22-Jun-2000 |
alfred |
fix races in the uidinfo subsystem, several problems existed:
1) while allocating a uidinfo struct malloc is called with M_WAITOK, it's possible that while asleep another process by the same user could have woken up earlier and inserted an entry into the uid hash table. Having redundant entries causes inconsistancies that we can't handle.
fix: do a non-waiting malloc, and if that fails then do a blocking malloc, after waking up check that no one else has inserted an entry for us already.
2) Because many checks for sbsize were done as "test then set" in a non atomic manner it was possible to exceed the limits put up via races.
fix: instead of querying the count then setting, we just attempt to set the count and leave it up to the function to return success or failure.
3) The uidinfo code was inlining and repeating, lookups and insertions and deletions needed to be in their own functions for clarity.
Reviewed by: green
|
60938 |
26-May-2000 |
jake |
Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen.
Requested by: msmith and others
|
60833 |
23-May-2000 |
jake |
Change the way that the queue(3) structures are declared; don't assume that the type argument to *_HEAD and *_ENTRY is a struct.
Suggested by: phk Reviewed by: phk Approved by: mdodd
|
60325 |
10-May-2000 |
bde |
Regenerated (to fix "created from" lines, and to fix the previous regeneration which somehow used the wrong syscalls.master file, resulting in unbuildable svr4_sysent.c).
|
60324 |
10-May-2000 |
bde |
Fixed the "created from" lines generated from this file. makesyscalls.sh expects the active id to be on the first line of the specification file.
Fixed some nearby gratuitous differences with kern/syscalls.master.
|
60290 |
09-May-2000 |
bde |
Regenerated (fixed the calculation of sy_nargs in sysent tables).
|
60289 |
09-May-2000 |
bde |
Don't forget to back up svr4_syscallnames.c. Don't depend on side effects to generate it.
|
60271 |
09-May-2000 |
bde |
Fixed the return type and args struct tag for exit(). They were wrong in all emulators. These entries were unused, so the bug had no effect, but the the args struct tag will be used to calculate sy_nargs correctly.
|
60060 |
06-May-2000 |
green |
Give the "streams" modulea version (1) and depend on it from the "svr4elf" module. This unbreaks the SVR4 KLD (which had an undefined function because of thenewly-committed KLD enhancements).
|
59794 |
30-Apr-2000 |
phk |
Remove unneeded #include <vm/vm_zone.h>
Generated by: src/tools/tools/kerninclude
|
59760 |
29-Apr-2000 |
phk |
Remove unneeded #include <sys/kernel.h>
|
59391 |
19-Apr-2000 |
phk |
Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>
|
59368 |
18-Apr-2000 |
phk |
Remove unneeded <sys/buf.h> includes.
Due to some interesting cpp tricks in lockmgr, the LINT kernel shrinks by 924 bytes.
|
59342 |
18-Apr-2000 |
obrien |
Change our ELF binary branding to something more acceptable to the Binutils maintainers.
After we established our branding method of writing upto 8 characters of the OS name into the ELF header in the padding; the Binutils maintainers and/or SCO (as USL) decided that instead the ELF header should grow two new fields -- EI_OSABI and EI_ABIVERSION. Each of these are an 8-bit unsigned integer. SCO has assigned official values for the EI_OSABI field. In addition to this, the Binutils maintainers and NetBSD decided that a better ELF branding method was to include ABI information in a ".note" ELF section.
With this set of changes, we will now create ELF binaries branded using both "official" methods. Due to the complexity of adding a section to a binary, binaries branded with ``brandelf'' will only brand using the EI_OSABI method. Also due to the complexity of pulling a section out of an ELF file vs. poking around in the ELF header, our image activator only looks at the EI_OSABI header field.
Note that a new kernel can still properly load old binaries except for Linux static binaries branded in our old method.
* * For a short period of time, ``ld'' will also brand ELF binaries * using our old method. This is so people can still use kernel.old * with a new world. This support will be removed before 5.0-RELEASE, * and may not last anywhere upto the actual release. My expiration * time for this is about 6mo. *
|
56046 |
15-Jan-2000 |
newton |
Fix handling of svr4_sigsets, which are implemented in SysVR4 as a sequence of 4 longs used as a bitmask. sv4r4_sigfillset has been broken for a while, probably since rev 1.5.
This patch fixes SVR4_NSIG (i.e.: sets it to the actual number of signals, instead of the number of bits in the mask) because some SysVR4 clients honestly seem to care about whether bits in the signal mask are set for non-existant signals.
Additionally, the svr4_sigfillset macro has been replaced by a fully fledged function, because the macro didn't actually work (it returned an all-ones mask, but we don't want that: we want 0's set where FreeBSD doesn't actually have a signal which is the same as an SysVR4 signal, for example).
SysVR4 clients can now successfully ignore signals, although catching them remains problematic (see commit log message for rev1.13 of sys/i386/svr4/svr4_machdep.c for more info).
|
56045 |
15-Jan-2000 |
newton |
Remove some all-too-wordy debugging prints
|
55657 |
09-Jan-2000 |
bde |
Removed bogus include of opt_global.h. opt_global.h is automatically included in all C files if it makes sense (i.e., for compiling kernels but not for compiling modules), so including it explicitly just complicates module makefiles.
|
55355 |
03-Jan-2000 |
newton |
Need to #include vm_zone.h to pick up inline definition of zfree() so that NDFREE() macro from namei.h will be happy.
|
54655 |
15-Dec-1999 |
eivind |
Introduce NDFREE (and remove VOP_ABORTOP)
|
54494 |
12-Dec-1999 |
newton |
Replace the svr4_sys_getdents64() routine with a port of linux_getdents() -- differences between the VFS interface between FreeBSD and NetBSD make it easier to pick up the Linux one than to continue development with the NetBSD port.
This patch fixes a bug which caused duplicate filenames to be seen by callers to svr4_sys_getdents64(), leading to malformed directory listings from Solaris client programs.
Obtained from: The Linuxulator, with a pointer from marcel
|
54493 |
12-Dec-1999 |
newton |
Avoid excessive redundancy in svr4_sys_getmsg() and svr4_sys_putmsg(): Only look up the provided descriptor in fd_ofiles[] once.
Submitted by: Ville-Pertti Keinone <will@iki.fi>
|
54492 |
12-Dec-1999 |
newton |
fd_revoke() shouldn't panic if the descriptor provided is not a file or socket. Return EINVAL instead.
Submitted by: Ville-Pertti Keinone <will@iki.fi>
|
54305 |
08-Dec-1999 |
newton |
Remove unnecessary includes
Prodded by: phk
|
54300 |
08-Dec-1999 |
newton |
SVR4 emulator source files now take their compilation options from opt_global.h and opt_svr4.h, instead of from the command line. This brings them in-line with most of the rest of the kernel.
svr4_ioctl.c has also failed to compile with debugging for a while now; fixed by adding systm.h and socketvar.
Some svr4 source files are automatically generated from syscalls.master; these have been committed as consequential changes, otherwise everyone will have to "make svr4_sysent.c".
Changes:
sys/svr4/svr4.h include opt_global.h and opt_svr4.h sys/svr4/svr4_ioctl.c include svr4.h, sys/systm.h and sys/socketvar.h sys/svr4/svr4_ipc.c include svr4.h sys/svr4/svr4_resource.c include svr4.h sys/svr4/svr4_socket.c include svr4.h sys/svr4/svr4_ttold.c include svr4.h sys/svr4/syscalls.master include svr4.h sys/svr4/svr4_syscallnames.c dependent on syscalls.master sys/svr4/svr4_sysent.c dependent on syscalls.master sys/svr4/svr4_syscall.h dependent on syscalls.master sys/svr4/svr4_proto.h dependent on syscalls.master sys/modules/svr4/Makefile create opt_global.h and opt_svr4.h
|
53678 |
24-Nov-1999 |
phk |
General clean-up of socket.h and associated sources to synchronise up with NetBSD and the Single Unix Specification v2.
This updates some structures with other, almost equivalent types and effort is under way to get the whole more consistent.
Also removes a double definition of INET6 and some other clean-ups.
Reviewed by: green, bde, phk Some part obtained from: NetBSD, SUSv2 specification
|
53503 |
21-Nov-1999 |
phk |
s/p_cred->pc_ucred/p_ucred/g
|
52635 |
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
52334 |
17-Oct-1999 |
newton |
Remove unnecessary includes.
phk's script walked through .c and .h files, but some of the ones on the list are actually derived from sys/svr4/syscalls.master. Make the necessary changes here and the others will implicitly follow...
Submitted by: phk
|
52333 |
17-Oct-1999 |
newton |
Remove unnecessary includes.
Submitted by: phk
|
52140 |
11-Oct-1999 |
luoqi |
Add a per-signal flag to mark handlers registered with osigaction, so we can provide the correct context to each signal handler.
Fix broken sigsuspend(): don't use p_oldsigmask as a flag, use SAS_OLDMASK as we did before the linuxthreads support merge (submitted by bde).
Move ps_sigstk from to p_sigacts to the main proc structure since signal stack should not be shared among threads.
Move SAS_OLDMASK and SAS_ALTSTACK flags from sigacts::ps_flags to proc::p_flag. Move PS_NOCLDSTOP and PS_NOCLDWAIT flags from proc::p_flag to procsig::ps_flag.
Reviewed by: marcel, jdp, bde
|
51957 |
05-Oct-1999 |
n_hibma |
Removal of sys/device.h
- Move intrhook stuff into kernel.h - Remove all occurrences of #device <device.h> - Add kernel.h were necessary (nowhere) - delete device.h
This file contained the structures for cfdata (old style config) and is no longer used. It was included by most drivers.
It confuses the remote debugger as the definition of 'struct device' in device.h is found before the one in bus_private.h.
|
51844 |
01-Oct-1999 |
peter |
Oops. That'll teach me to commit without testing. I either replaced one trigraph with another, or completely missed the point. Kill it for real this time.
|
51843 |
01-Oct-1999 |
peter |
Zap a trigraph (???)
|
51793 |
29-Sep-1999 |
marcel |
sigset_t change (part 4 of 5) -----------------------------
The compatibility code and/or emulators have been updated:
iBCS2 now mostly uses the older syscalls. SVR4 now properly handles all signals. This has been achieved by using the new sigset_t throughout the emulator. The Linuxulator has been severely updated. Internally the new Linux sigset_t is made the default. These are then mapped to and from the new FreeBSD sigset_t.
Also, rt_sigsuspend has been implemented in the Linuxulator. Implementing this syscall basicly caused all this sigset_t changing in the first place and the syscall has been used throughout the change as a means for testing. It basicly is too much work to undo the implementation so that it can later be added again.
A special note on the use of sv_sigtbl and sv_sigsize in struct sysentvec: Every signal larger than sv_sigsize is not translated and is passed on to the signal handler unmodified. Signals in the range 1 upto and including sv_sigsize are translated. The rationale is that only the system defined signals need to be translated.
The emulators also have been updated so that the translation tables are only indexed for valid (system defined) signals. This change also fixes the translation bug already in the SVR4 emulator.
|
51418 |
19-Sep-1999 |
green |
This is what was "fdfix2.patch," a fix for fd sharing. It's pretty far-reaching in fd-land, so you'll want to consult the code for changes. The biggest change is that now, you don't use fp->f_ops->fo_foo(fp, bar) but instead fo_foo(fp, bar), which increments and decrements the fp refcount upon entry and exit. Two new calls, fhold() and fdrop(), are provided. Each does what it seems like it should, and if fdrop() brings the refcount to zero, the fd is freed as well.
Thanks to peter ("to hell with it, it looks ok to me.") for his review. Thanks to msmith for keeping me from putting locks everywhere :)
Reviewed by: peter
|
50718 |
01-Sep-1999 |
newton |
Add MAINTAINER line
|
50477 |
28-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
50405 |
26-Aug-1999 |
phk |
Simplify the handling of VCHR and VBLK vnodes using the new dev_t:
Make the alias list a SLIST.
Drop the "fast recycling" optimization of vnodes (including the returning of a prexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore.
Rename checkalias2() and checkalias() to addalias() and addaliasu() - which takes dev_t and udev_t arg respectively.
Make the revoke syscalls use vcount() instead of VALIASED.
Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag.
vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one.
Print the devicename in specfs/vprint().
Remove a couple of stale LFS vnode flags.
Remove unimplemented/unused LK_DRAINED;
|
49770 |
14-Aug-1999 |
newton |
Avoid possible panic by checking for EFAULT from copyinstr() during pathname translation.
Submitted by: green
|
49280 |
30-Jul-1999 |
newton |
Previous commit also removed some 'const' qualifiers on args for svr4_sys_sendto() which probably shouldn't have been 'const'.
|
49279 |
30-Jul-1999 |
newton |
Previous commit also finished cleaning up some dev_t -> udev_t transformations related to the commit for rev 1.3 of svr4_stat.c. svr4_sysvec.c also received a copyright message (which is why it grew by 28 lines).
|
49278 |
30-Jul-1999 |
newton |
Fix svr4_sys_poll(); SysV STREAMS produce return values from poll() which BSD sockets don't. Guess at a correct emulation for those values (it seems to work for telnet, ftp and friends)
|
49267 |
30-Jul-1999 |
newton |
Add $Id$ tags
|
49264 |
30-Jul-1999 |
newton |
Fix panic caused when *stat64() family of syscalls try to fill-in their svr4_stat64 structures with old dev_t values instead of udev_t's. Panic was caused when major() and minor() were called with args which weren't pointers. The panic was probably introduced in rev 1.51 of kern_conf.c
|
48620 |
06-Jul-1999 |
cracauer |
Rename struct members sa_siginfo. POSIX reserves identifiers starting with sa_ when <signal.h> is included. They would conflict with the upcoming SA_SIGINFO implementation.
Reviewed by: BDE
|
48503 |
03-Jul-1999 |
green |
sys/buf.h needs to have included sys/systm.h for spl prototypes.
|
46889 |
10-May-1999 |
peter |
Ack! I deleted "struct", not "const".. Oh boy...
Submitted by: jkh
|
46803 |
09-May-1999 |
peter |
Fix a couple of warnings and some bitrot in comments.
|
46112 |
27-Apr-1999 |
phk |
Suser() simplification:
1: s/suser/suser_xxx/
2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with later.
There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
|
45739 |
17-Apr-1999 |
peter |
Well folks, this is it - The second stage of the removal for build support for LKM's..
|
45738 |
17-Apr-1999 |
peter |
Image activators use EXEC_SET(), not TEXT_SET(). (The first is a macro wrapper for DECLARE_MODULE(), the second is a linker set declaration)
|
43597 |
04-Feb-1999 |
newton |
svr4 emulator will refuse to unload itself if it is currently in use.
|
43499 |
01-Feb-1999 |
newton |
Acquiesce to proc.h for declarations of M_ZOMBIE, M_SUBPROC (and reorder includes so proc.h knows the right type for 'em).
Suggested by: bde
|
43424 |
30-Jan-1999 |
newton |
Oops - Ripped out a bit of debugging code which will stop certain bits of networking from working for people without DEC Tulip ethernet cards.
|
43412 |
30-Jan-1999 |
newton |
Emulator KLD for SysVR4 executables grabbed from NetBSD. See http://www.freebsd.org/~newton/freebsd-svr4 for limitations, capabilities, history and TO-DO list.
|