History log of /freebsd-11-stable/sys/kern/kern_procctl.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 352125 10-Sep-2019 kib

MFC r351773:
Add procctl(PROC_STACKGAP_CTL).

PR: 239894


# 333162 02-May-2018 kib

MFC r332740:
Add PROC_PDEATHSIG_SET to procctl interface.

MFC r332825:
Rename PROC_PDEATHSIG_SET -> PROC_PDEATHSIG_CTL.

MFC r333067:
Remove redundant pipe from pdeathsig.c test.


# 326396 30-Nov-2017 kib

MFC r326122:
Kill all descendants of the reaper, even if they are descendants of a
subordinate reaper. Also, mark reapers when listing pids.

PR: 223745


# 313302 05-Feb-2017 jilles

MFC r310096: reaper: Make REAPER_KILL_SUBTREE actually work.


# 306400 28-Sep-2016 kib

MFC r306260:
Add the foundation copyrights to procctl kernel sources.


# 306398 28-Sep-2016 kib

MFC r306081:
Add PROC_TRAPCAP procctl(2) controls and global sysctl kern.trap_enocap.


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 286975 20-Aug-2015 kib

If process becomes reaper (procctl(PROC_REAP_ACQUIRE)) while already
having some children, the children' reaper is not reset to the parent.
This allows for the situation where reaper has children but not
descendands and the too strict asserts in the reap_status() fire.

Remove the wrong asserts, add some clarification for the situation to
the procctl(2) REAP_STATUS.

Reported and tested by: feld
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 278795 15-Feb-2015 kib

Reparenting done by debugger attach can leave reaper without direct
children. Handle the situation instead asserting that it is
impossible.

Reported and tested by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 3 days


# 278794 15-Feb-2015 kib

Return with the process locked, caller expects p still locked after
the call.

Reported and tested by: bapt
Sponsored by: The FreeBSD Foundation
MFC after: 3 days


# 277322 18-Jan-2015 kib

Add procctl(2) PROC_TRACE_CTL command to enable or disable debugger
attachment to the process. Note that the command is not intended to
be a security measure, rather it is an obfuscation feature,
implemented for parity with other operating systems.

Discussed with: jilles, rwatson
Man page fixes by: rwatson
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 275821 16-Dec-2014 kib

Add missed break.

CID: 1258587
Sponsored by: The FreeBSD Foundation
MFC after: 20 days


# 275800 15-Dec-2014 kib

Add a facility for non-init process to declare itself the reaper of
the orphaned descendants. Base of the API is modelled after the same
feature from the DragonFlyBSD.

Requested by: bapt
Reviewed by: jilles (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks


# 273452 22-Oct-2014 mjg

Plug unnecessary PRS_NEW check in kern_procctl.

pfind does not return processes in such state.


# 272449 02-Oct-2014 jhb

Require p_cansched() for changing a process' protection status via
procctl() rather than p_cansee().

Submitted by: rwatson
MFC after: 3 days


# 269656 07-Aug-2014 kib

Correct the problems with the ptrace(2) making the debuggee an orphan.
One problem is inferior(9) looping due to the process tree becoming a
graph instead of tree if the parent is traced by child. Another issue
is due to the use of p_oppid to restore the original parent/child
relationship, because real parent could already exited and its pid
reused (noted by mjg).

Add the function proc_realparent(9), which calculates the parent for
given process. It uses the flag P_TREE_FIRST_ORPHAN to detect the head
element of the p_orphan list and than stepping back to its container
to find the parent process. If the parent has already exited, the
init(8) is returned.

Move the P_ORPHAN and the new helper flag from the p_flag* to new
p_treeflag field of struct proc, which is protected by proctree lock
instead of proc lock, since the orphans relationship is managed under
the proctree_lock already.

The remaining uses of p_oppid in ptrace(PT_DETACH) and process
reapping are replaced by proc_realparent(9).

Phabric: D417
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 255708 19-Sep-2013 jhb

Extend the support for exempting processes from being killed when swap is
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
from arbitrary processes. Similar to ktrace it can apply a change to all
existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
control operations on processes (as opposed to the debugger-specific
operations provided by ptrace(2)). procctl(2) uses a combination of
idtype_t and an id to identify the set of processes on which to operate
similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
of a set of processes. MADV_PROTECT still works for backwards
compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
the first bit of which is used to track if P_PROTECT should be inherited
by new child processes.

Reviewed by: kib, jilles (earlier version)
Approved by: re (delphij)
MFC after: 1 month


# 253953 05-Aug-2013 attilio

Revert r253939:
We cannot busy a page before doing pagefaults.
Infact, it can deadlock against vnode lock, as it tries to vget().
Other functions, right now, have an opposite lock ordering, like
vm_object_sync(), which acquires the vnode lock first and then
sleeps on the busy mechanism.

Before this patch is reinserted we need to break this ordering.

Sponsored by: EMC / Isilon storage division
Reported by: kib


# 253939 04-Aug-2013 attilio

The page hold mechanism is fast but it has couple of fallouts:
- It does not let pages respect the LRU policy
- It bloats the active/inactive queues of few pages

Try to avoid it as much as possible with the long-term target to
completely remove it.
Use the soft-busy mechanism to protect page content accesses during
short-term operations (like uiomove_fromphys()).

After this change only vm_fault_quick_hold_pages() is still using the
hold mechanism for page content access.
There is an additional complexity there as the quick path cannot
immediately access the page object to busy the page and the slow path
cannot however busy more than one page a time (to avoid deadlocks).

Fixing such primitive can bring to complete removal of the page hold
mechanism.

Sponsored by: EMC / Isilon storage division
Discussed with: alc
Reviewed by: jeff
Tested by: pho


# 249277 08-Apr-2013 attilio

Switch some "low-hanging fruit" to acquire read lock on vmobjects
rather than write locks.

Sponsored by: EMC / Isilon storage division
Reviewed by: alc
Tested by: pho


# 248084 09-Mar-2013 attilio

Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
At this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.

The KPI results heavilly broken by this commit. Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho


# 247297 25-Feb-2013 attilio

Merge from vmobj-rwlock branch:
Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h.

Sponsored by: EMC / Isilon storage division
Tested by: pho
Reviewed by: alc


# 246484 07-Feb-2013 kib

When vforked child is traced, the debugging events are not generated
until child performs exec(). The behaviour is reasonable when a
debugger is the real parent, because the parent is stopped until
exec(), and sending a debugging event to the debugger would deadlock
both parent and child.

On the other hand, when debugger is not the parent of the vforked
child, not sending debugging signals makes it impossible to debug
across vfork.

Fix the issue by declining generating debug signals only when vfork()
was done and child called ptrace(PT_TRACEME). Set a new process flag
P_PPTRACE from the attach code for PT_TRACEME, if P_PPWAIT flag is
set, which indicates that the process was created with vfork() and
still did not execed. Check P_PPTRACE from issignal(), instead of
refusing the trace outright for the P_PPWAIT case. The scope of
P_PPTRACE is exactly contained in the scope of P_PPWAIT.

Found and tested by: zont
Reviewed by: pluknet
MFC after: 2 weeks


# 241896 22-Oct-2012 kib

Remove the support for using non-mpsafe filesystem modules.

In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by: attilio
Tested by: pho


# 239135 07-Aug-2012 kib

Always initialize pl_event.

Submitted by: Andrey Zonov <andrey@zonov.org>
MFC after: 3 days


# 238287 09-Jul-2012 davidxu

If you have pressed CTRL+Z and a process is suspended, then you use gdb
to attach to the process, it is surprising that the process is resumed
without inputting any gdb commands, however ptrace manual said:
The tracing process will see the newly-traced process stop and may
then control it as if it had been traced all along.
But the current code does not work in this way, unless traced process
received a signal later, it will continue to run as a background task.
To fix this problem, just send signal SIGSTOP to the traced process after
we resumed it, this works like that you are attaching to a running process,
it is not perfect but better than nothing.


# 232048 23-Feb-2012 kib

Allow the parent to gather the exit status of the children reparented
to the debugger. When reparenting for debugging, keep the child in
the new orphan list of old parent. When looping over the children in
kern_wait(), iterate over both children list and orphan list to search
for the process by pid.

Submitted by: Dmitry Mikulin <dmitrym juniper.net>
MFC after: 2 weeks


# 231320 09-Feb-2012 kib

Mark the automatically attached child with PL_FLAG_CHILD in struct
lwpinfo flags, for PT_FOLLOWFORK auto-attachment.

In collaboration with: Dmitry Mikulin <dmitrym juniper net>
MFC after: 1 week


# 225617 16-Sep-2011 kmacy

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


# 223212 17-Jun-2011 obrien

Add comment from CSRG rev 7.27 (1992/06/23 19:56:55; author: mckusick)


# 223088 14-Jun-2011 obrien

We should not return ECHILD when debugging a child and the parent does a
"wait4(-1, ..., WNOHANG, ...)". Instead wait(2) should behave as if the
child does not wish to report status at this time.

Reviewed by: jhb


# 217896 26-Jan-2011 dchagin

Add macro to test the sv_flags of any process. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures.

MFC after: 1 month


# 217819 25-Jan-2011 kib

Allow debugger to specify that children of the traced process should be
automatically traced. Extend the ptrace(PL_LWPINFO) to report that child
just forked.

Reviewed by: davidxu, jhb
MFC after: 2 weeks


# 216604 20-Dec-2010 alc

Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race
condition in proc_rwmem() and to (2) simplify the implementation of the
cxgb driver's vm_fault_hold_user_pages(). Specifically, in proc_rwmem()
the requested read or write could fail because the targeted page could be
reclaimed between the calls to vm_fault() and vm_page_hold().

In collaboration with: kib@
MFC after: 6 weeks


# 215679 22-Nov-2010 attilio

Add the ability for GDB to printout the thread name along with other
thread specific informations.

In order to do that, and in order to avoid KBI breakage with existing
infrastructure the following semantic is implemented:
- For live programs, a new member to the PT_LWPINFO is added (pl_tdname)
- For cores, a new ELF note is added (NT_THRMISC) that can be used for
storing thread specific, miscellaneous, informations. Right now it is
just popluated with a thread name.

GDB, then, retrieves the correct informations from the corefile via the
BFD interface, as it groks the ELF notes and create appropriate
pseudo-sections.

Sponsored by: Sandvine Incorporated
Tested by: gianni
Discussed with: dim, kan, kib
MFC after: 2 weeks


# 213642 09-Oct-2010 davidxu

Create a global thread hash table to speed up thread lookup, use
rwlock to protect the table. In old code, thread lookup is done with
process lock held, to find a thread, kernel has to iterate through
process and thread list, this is quite inefficient.
With this change, test shows in extreme case performance is
dramatically improved.

Earlier patch was reviewed by: jhb, julian


# 209688 04-Jul-2010 kib

Extend ptrace(PT_LWPINFO) to report siginfo for the signal that caused
debugee stop. The change should keep the ABI. Take care of compat32.

Discussed with: davidxu, jhb
MFC after: 2 weeks


# 209390 21-Jun-2010 ed

Use ISO C99 integer types in sys/kern where possible.

There are only about 100 occurences of the BSD-specific u_int*_t
datatypes in sys/kern. The ISO C99 integer types are used here more
often.


# 208555 25-May-2010 jhb

Ignore the 'addr' argument passed to PT_STEP (it is required to be '1'
for PT_STEP which means "ignore") and PT_DETACH.

PR: kern/146167
MFC after: 1 week


# 208453 23-May-2010 kib

Reorganize syscall entry and leave handling.

Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
usermode into struct syscall_args. The structure is machine-depended
(this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
from the syscall. It is a generalization of
cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively. The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by: jhb, marcel, marius, nwhitehorn, stas
Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
stas (mips)
MFC after: 1 month


# 207410 29-Apr-2010 kmacy

On Alan's advice, rather than do a wholesale conversion on a single
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.

Supported by: Bitgravity Inc.

Discussed with: alc, jeffr, and kib


# 205014 11-Mar-2010 nwhitehorn

Provide groundwork for 32-bit binary compatibility on non-x86 platforms,
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.

Reviewed by: kib, jhb


# 203788 11-Feb-2010 marcel

Initialize pve_fsid and pve_fileid to VNOVAL.


# 203783 11-Feb-2010 marcel

o Add support for COMPAT_IA32.
o Incorporate review comments:
- Properly reference and lock the map
- Take into account that the VM map can change inbetween requests
- Add the fileid and fsid attributes

Credits: kib@
Reviewed by: kib@


# 203708 09-Feb-2010 marcel

Unbreak building kernels with COMPAT_32 enabled. The actual support
for the PT_VM_ENTRY request from 32-bit processes will follow.

Pointy hat: marcel


# 203696 09-Feb-2010 marcel

Add PT_VM_TIMESTAMP and PT_VM_ENTRY so that the tracing process can
obtain the memory map of the traced process. PT_VM_TIMESTAMP can be
used to check if the memory map changed since the last time to avoid
iterating over all the VM entries unnecesarily.

MFC after: 1 month


# 202882 23-Jan-2010 kib

For PT_TO_SCE stop that stops the ptraced process upon syscall entry,
syscall arguments are collected before ptracestop() is called. As a
consequence, debugger cannot modify syscall or its arguments.

For i386, amd64 and ia32 on amd64 MD syscall(), reread syscall number
and arguments after ptracestop(), if debugger modified anything in the
process environment. Since procfs stopeven requires number of syscall
arguments in p_xstat, this cannot be solved by moving stop/trace point
before argument fetching.

Move the code to read arguments into separate function
fetch_syscall_args() to avoid code duplication. Note that ktrace point
for modified syscall is intentionally recorded twice, once with original
arguments, and second time with the arguments set by debugger.

PT_TO_SCX stop is executed after cpu_syscall_set_retval() already.

Reported by: Ali Polatel <alip exherbo org>
Briefly discussed with: jhb
MFC after: 3 weeks


# 199819 26-Nov-2009 alc

Replace VM_PROT_OVERRIDE_WRITE by VM_PROT_COPY. VM_PROT_OVERRIDE_WRITE has
represented a write access that is allowed to override write protection.
Until now, VM_PROT_OVERRIDE_WRITE has been used to write breakpoints into
text pages. Text pages are not just write protected but they are also
copy-on-write. VM_PROT_OVERRIDE_WRITE overrides the write protection on the
text page and triggers the replication of the page so that the breakpoint
will be written to a private copy. However, here is where things become
confused. It is the debugger, not the process being debugged that requires
write access to the copied page. Nonetheless, the copied page is being
mapped into the process with write access enabled. In other words, once the
debugger sets a breakpoint within a text page, the program can write to its
private copy of that text page. Whereas prior to setting the breakpoint, a
SIGSEGV would have occurred upon a write access. VM_PROT_COPY addresses
this problem. The combination of VM_PROT_READ and VM_PROT_COPY forces the
replication of a copy-on-write page even though the access is only for read.
Moreover, the replicated page is only mapped into the process with read
access, and not write access.

Reviewed by: kib
MFC after: 4 weeks


# 198463 25-Oct-2009 alc

Update a comment to reflect the previous change.


# 198341 21-Oct-2009 marcel

o Introduce vm_sync_icache() for making the I-cache coherent with
the memory or D-cache, depending on the semantics of the platform.
vm_sync_icache() is basically a wrapper around pmap_sync_icache(),
that translates the vm_map_t argumument to pmap_t.
o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc
it replaces the pmap_page_executable() function, added to solve
the I-cache problem in uiomove_fromphys().
o In proc_rwmem() call vm_sync_icache() when writing to a page that
has execute permissions. This assures that when breakpoints are
written, the I-cache will be coherent and the process will actually
hit the breakpoint.
o This also fixes the Book-E PMAP implementation that was missing
necessary locking while trying to deal with the I-cache coherency
in pmap_enter() (read: mmu_booke_enter_locked).

The key property of this change is that the I-cache is made coherent
*after* writes have been done. Doing it in the PMAP layer when adding
or changing a mapping means that the I-cache is made coherent *before*
any writes happen. The difference is key when the I-cache prefetches.


# 195280 02-Jul-2009 rwatson

Clean up a number of aspects of token generation from audit arguments to
system calls:

- Centralize generation of argument tokens for VM addresses in a macro,
ADDR_TOKEN(), and properly encode 64-bit addresses in 64-bit arguments.
- Fix up argument numbers across a large number of syscalls so that they
match the numeric argument into the system call.
- Don't audit the address argument to ioctl(2) or ptrace(2), but do keep
generating tokens for mmap(2), minherit(2), since they relate to passing
object access across execve(2).

Approved by: re (audit argument blanket)
Obtained from: TrustedBSD Project
MFC after: 1 week


# 195104 27-Jun-2009 rwatson

Replace AUDIT_ARG() with variable argument macros with a set more more
specific macros for each audit argument type. This makes it easier to
follow call-graphs, especially for automated analysis tools (such as
fxr).

In MFC, we should leave the existing AUDIT_ARG() macros as they may be
used by third-party kernel modules.

Suggested by: brooks
Approved by: re (kib)
Obtained from: TrustedBSD Project
MFC after: 1 week


# 194766 23-Jun-2009 kib

Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation if performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with: pho
Reviewed by: alc
Approved by: re (kensmith)


# 189282 02-Mar-2009 kib

Use the p_sysent->sv_flags flag SV_ILP32 to detect 32bit process
executing on 64bit kernel. This eliminates the direct comparisions
of p_sysent with &ia32_freebsd_sysvec, that were left intact after
r185169.


# 184667 05-Nov-2008 davidxu

Revert rev 184216 and 184199, due to the way the thread_lock works,
it may cause a lockup.

Noticed by: peter, jhb


# 184199 23-Oct-2008 davidxu

Actually, for signal and thread suspension, extra process spin lock is
unnecessary, the normal process lock and thread lock are enough. The
spin lock is still needed for process and thread exiting to mimic
single sched_lock.


# 183911 15-Oct-2008 davidxu

Move per-thread userland debugging flags into seperated field,
this eliminates some problems of locking, e.g, a thread lock is needed
but can not be used at that time. Only the process lock is needed now
for new field.


# 177368 19-Mar-2008 jeff

- Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from
requiring the per-process spinlock to only requiring the process lock.
- Reflect these changes in the proc.h documentation and consumers throughout
the kernel. This is a substantial reduction in locking cost for these
fields and was made possible by recent changes to threading support.


# 177091 12-Mar-2008 jeff

Remove kernel support for M:N threading.

While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential. Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.


# 173468 08-Nov-2007 ups

Use VM_FAULT_DIRTY to fault in pages for write access in
proc_rwmen.
Otherwise copy on write may create an anonymous page that is
not marked as dirty. Since writing data to these pages
in this function also does not dirty these pages they may be
later discarded by the pagedaemon.


# 172485 08-Oct-2007 jeff

- Fix from pr kern/115469; Don't redeliver a signal once it has been
handled by the target process.

Contributed by: Tijl Coosemans <tijl@ulyssis.org>
Approved by: re


# 172207 17-Sep-2007 jeff

- Move all of the PS_ flags into either p_flag or td_flags.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
previously the sched_lock. These bugs have existed for some time.
- Allow swapout to try each thread in a process individually and then
swapin the whole process if any of these fail. This allows us to move
most scheduler related swap flags into td_flags.
- Keep ki_sflag for backwards compat but change all in source tools to
use the new and more correct location of P_INMEM.

Reported by: pho
Reviewed by: attilio, kib
Approved by: re (kensmith)


# 170307 04-Jun-2007 jeff

Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
sychronization.
- Use the per-process spinlock rather than the sched_lock for per-process
scheduling synchronization.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


# 167211 04-Mar-2007 rwatson

Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.


# 163709 26-Oct-2006 jb

Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by: davidxu@


# 163676 25-Oct-2006 davidxu

Move sigqueue_take() call into proc_reparent(), this fixed bugs where
proc_reparent() is called but sigqueue_take() is forgotten.


# 163345 14-Oct-2006 trhodes

Close a race condition where num can be larger than tmp, giving the user
too large of a boundary.

Reported by: Ilja Van Sprundel


# 161472 20-Aug-2006 cperciva

Fix a signedness bug.

MFC after: 3 days
Security: Local DoS


# 155922 22-Feb-2006 jhb

Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT
stop event earlier. After we have signalled that, we set P_WEXIT and
then wait for any processes with a hold on the vmspace via PHOLD to
release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is
invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops
to zero.
- Change proc_rwmem() to require that the processing read from has its
vmspace held via PHOLD by the caller and get rid of all the junk to
screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it
doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers
FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem()
to clear an earlier single-step simualted via a breakpoint). We only
do one to avoid races. Also, by making the EINVAL error for unknown
requests be part of the default: case in the switch, the various
switch cases can now just break out to return which removes a _lot_ of
duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug
where a LWP ptrace command could return EINVAL with the proc lock still
held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and
ptrace_clear_single_step() to always be called with the proc lock
held (it was a mixed bag previously). Alpha and arm have to drop
the lock while the mess around with breakpoints, but other archs
avoid extra lock release/acquires in ptrace(). I did have to fix a
couple of other consumers in kern_kse and a few other places to
hold the proc lock and PHOLD.

Tested by: ps (1 mostly, but some bits of 2-4 as well)
MFC after: 1 week


# 155634 13-Feb-2006 wsalamon

Audit the arguments to the ptrace(2) system call.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)


# 155381 06-Feb-2006 davidxu

Add members pl_sigmask and pl_siglist into ptrace_lwpinfo to get lwp's
signal mask and pending signals.


# 153697 24-Dec-2005 davidxu

Avoid kernel panic when attaching a process which may not be stopped
by debugger, e.g process is dumping core. Only access p_xthread if
P_STOPPED_TRACE is set, this means thread is ready to exchange signal
with debugger, print a warning if P_STOPPED_TRACE is not set due to
some bugs in other code, if there is.

The patch has been tested by Anish Mistry mistry.7 at osu dot edu, and
is slightly adjusted.


# 152214 08-Nov-2005 davidxu

Make sure pending SIGCHLD is removed from previous parent when process
is attached or detached.


# 149284 19-Aug-2005 davidxu

Fix a LOR between sched_lock and sleep queue lock.


# 147692 30-Jun-2005 peter

Jumbo-commit to enhance 32 bit application support on 64 bit kernels.
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.

ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
sscanf fails. They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note
that 64 bit consumers can still debug 32 bit targets.

IA64 has got stubs for ia32_reg.c.

Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet. We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.

Approved by: re


# 143820 18-Mar-2005 das

Add missing cases for PT_SYSCALL.

Found by: Coverity Prevent analysis tool


# 139804 06-Jan-2005 imp

/* -> /*- for copyright notices, minor format tweaks as necessary


# 138129 27-Nov-2004 das

Don't include sys/user.h merely for its side-effect of recursively
including other headers.


# 133337 08-Aug-2004 davidxu

Add pl_flags to ptrace_lwpinfo, two flags PL_FLAG_SA and PL_FLAG_BOUND
indicate that a thread is in UTS critical region.

Reviewed by: deischen
Approved by: marcel


# 132684 27-Jul-2004 alc

- Use atomic ops for updating the vmspace's refcnt and exitingcnt.
- Push down Giant into shmexit(). (Giant is acquired only if the vmspace
contains shm segments.)
- Eliminate the acquisition of Giant from proc_rwmem().
- Reduce the scope of Giant in exit1(), uncovering the destruction of the
address space.


# 132322 17-Jul-2004 davidxu

This is a forced commit.
Clear suspension flag for debugged process when detaching.


# 132317 17-Jul-2004 davidxu

Fix typo.


# 132089 13-Jul-2004 davidxu

Implement following commands: PT_CLEARSTEP, PT_SETSTEP, PT_SUSPEND
PT_RESUME, PT_GETNUMLWPS, PT_GETLWPLIST.


# 132016 12-Jul-2004 marcel

Implement the PT_LWPINFO request. This request can be used by the
tracing process to obtain information about the LWP that caused the
traced process to stop. Debuggers can use this information to select
the thread currently running on the LWP as the current thread.

The request has been made compatible with NetBSD for as much as
possible. This implementation differs from NetBSD in the following
ways:
1. The data argument is allowed to be smaller than the size of the
ptrace_lwpinfo structure known to the kernel, but not 0. This
is opposite to what NetBSD allows. The reason for this is that
we can extend the structure without affecting older binaries.
2. On NetBSD the tracing process is to set the pl_lwpid field to
the Id of the LWP it wants information of. We don't do that.
Our ptrace interface allows passing the LWP Id instead of the
PID. The tracing process is to set the PID to the LWP Id it
wants information of.
3. When the PID is actually the PID of the tracing process, this
request returns the information about the LWP that caused the
process to stop. This was the whole purpose of the request in
the first place.

When the traced process has exited, this request will return the
LWP Id 0, indicating that the process state is not the result of
an event specific to a LWP.


# 131450 02-Jul-2004 davidxu

Allow ptrace to deal with lwpid.

Reviewed by: marcel


# 127730 01-Apr-2004 jhb

Finish fixing up Alpha to work with an MP safe ptrace():
- ptrace_single_step() is no longer called with the proc lock held, so
don't try to unlock it and then relock it.
- Push Giant down into proc_rwmem() instead of forcing all the consumers
(including Alpha breakpoint support) to explicitly wrap calls to
proc_rwmem() with Giant.

Tested by: kensmith


# 127386 24-Mar-2004 alc

Use uiomove_fromphys() instead of pmap_qenter() and pmap_qremove() in
proc_rwmem().


# 127034 15-Mar-2004 jhb

Drop the proc lock around calls to the MD functions ptrace_single_step(),
ptrace_set_pc(), and cpu_ptrace() so that those functions are free to
acquire Giant, sleep, etc. We already do a PHOLD/PRELE around them so
that it is safe to sleep inside of these routines if necessary. This
allows ptrace() to be marked MP safe again as it no longer triggers lock
order reversals on Alpha.

Tested by: wilko


# 125993 19-Feb-2004 truckman

When reparenting a process in the PT_DETACH code, only set p_sigparent
to SIGCHLD if the new parent process is initproc.

MFC after: 2 weeks


# 125720 11-Feb-2004 truckman

When reparenting a process to init, make sure that p_sigparent is
set to SIGCHLD. This avoids the creation of orphaned Linux-threaded
zombies that init is unable to reap. This can occur when the parent
process sets its SIGCHLD to SIG_IGN. Fix a similar situation in the
PT_DETACH code.

Tested by: "Steven Hartland" <killing AT multiplay.co.uk>


# 120937 09-Oct-2003 robert

Implement preliminary support for the PT_SYSCALL command to ptrace(2).


# 118932 15-Aug-2003 marcel

Add or finish support for machine dependent ptrace requests. When we
check for permissions, do it for all requests, not the known requests.
Later when we actually service the request we deal with the invalid
requests we previously caught earlier.

This commit changes the behaviour of the ptrace(2) interface for
boundary cases such as an unknown request without proper permissions.
Previously we would return EINVAL. Now we return EBUSY or EPERM.

Platforms need to define __HAVE_PTRACE_MACHDEP when they have MD
requests. This makes the prototype of cpu_ptrace() visible and
introduces a call to this function for all requests greater or
equal to PT_FIRSTMACH.

Silence on: audit


# 118749 10-Aug-2003 nectar

Add or correct range checking of signal numbers in system calls and
ioctls.

In the particular case of ptrace(), this commit more-or-less reverts
revision 1.53 of sys_process.c, which appears to have been erroneous.

Reviewed by: iedowse, jhb


# 118697 09-Aug-2003 alc

Background: When proc_rwmem() wired and mapped a page, it also added
a reference to the containing object. The purpose of the reference
being to prevent the destruction of the object and an attempt to free
the wired page. (Wired pages can't be freed.) Unfortunately, this
approach does not work. Some operations, like fork(2) that call
vm_object_split(), can move the wired page to a difference object,
thereby making the reference pointless and opening the possibility
of the wired page being freed.

A solution is to use vm_page_hold() in place of vm_page_wire(). Held
pages can be freed. They are moved to a special hold queue until the
hold is released.

Submitted by: tegge


# 118361 02-Aug-2003 alc

Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in proc_rwmem().
See revision 1.140 of kern/sys_pipe.c for a detailed rationale.

Submitted by: tegge


# 116195 11-Jun-2003 alc

Add vm object locking.


# 116182 10-Jun-2003 obrien

Use __FBSDID().


# 114028 25-Apr-2003 jhb

Push down Giant around calls to proc_rwmem() in kern_ptrace. kern_ptrace()
should now be MP safe.


# 113868 22-Apr-2003 jhb

Prefer the proc lock to sched_lock when testing PS_INMEM now that it is
safe to do so.


# 113635 17-Apr-2003 jhb

The sched_lock is not needed while clearing two of the P_STOPPED bits in
p_flag. Also, the proc lock can't be recursed, so simplify an older proc
lock assertion.


# 112389 18-Mar-2003 des

Whitespace cleanup.


# 105271 16-Oct-2002 jhb

Add a missing PROC_UNLOCK in ptrace() for the PT_IO case.

PR: kern/44065
Submitted by: Mark Kettenis <kettenis@chello.nl>


# 103216 11-Sep-2002 julian

Completely redo thread states.

Reviewed by: davidxu@freebsd.org


# 103085 07-Sep-2002 peter

Remove bogus fill_kinfo_proc() before ptrace_set_pc(). There was no need
for this.

Submitted by: bde


# 102950 05-Sep-2002 davidxu

s/SGNL/SIG/
s/SNGL/SINGLE/
s/SNGLE/SINGLE/

Fix abbreviation for P_STOPPED_* etc flags, in original code they were
inconsistent and difficult to distinguish between them.

Approved by: julian (mentor)


# 102946 04-Sep-2002 iedowse

Split up ptrace() into a wrapper that does the copying to and from
user space and a kern_ptrace() implementation. Use the kern_*()
version in the Linux emulation code to remove more stack gap uses.

Approved by: des


# 102412 25-Aug-2002 charnier

Replace various spelling with FALLTHROUGH which is lint()able


# 100418 20-Jul-2002 rwatson

Do preserve the error result from calling p_cansee() and use that when
failing because of the error.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 99882 12-Jul-2002 alc

Lock accesses to the page queues.


# 99880 12-Jul-2002 tmm

Fix ptrace(PT_READ_*, ...) for non-little-endian architectures where
sizeof(register_t) != sizeof(int).


# 99072 29-Jun-2002 julian

Part 1 of KSE-III

The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by: Almost everyone who counts
(at various times, peter, jhb, matt, alfred, mini, bernd,
and a cast of thousands)

NOTE: this is still Beta code, and contains lots of debugging stuff.
expect slight instability in signals..


# 96891 18-May-2002 marcel

All signals can be sent to the inferior process when it's restarted,
not just the legacy ones.

PR: 33299
Submitted by: Alexander N. Kabaev <ak03@gte.com>


# 96886 18-May-2002 jhb

Change p_can{debug,see,sched,signal}()'s first argument to be a thread
pointer instead of a proc pointer and require the process pointed to
by the second argument to be locked. We now use the thread ucred reference
for the credential checks in p_can*() as a result. p_canfoo() should now
no longer need Giant.


# 96244 09-May-2002 mini

Remove trace_req().

Reviewed by: alfred, jhb, peter


# 95165 20-Apr-2002 marcel

GCC 3.x WARNS: Add a break to the default case.


# 94665 14-Apr-2002 alfred

Don't allow one to trace an ancestor when already traced.

PR: kern/29741
Submitted by: Dave Zarzycki <zarzycki@FreeBSD.org>
Fix from: Tim J. Robbins <tim@robbins.dropbear.id.au>
MFC After: 2 weeks


# 94556 12-Apr-2002 jhb

Rework ptrace(2) to be more locking friendly. We do any needed copyin()'s
and acquire the proctree_lock if needed first. Then we lock the process
if necessary and fiddle with it as appropriate. Finally we drop locks and
do any needed copyout's. This greatly simplifies the locking.


# 94307 09-Apr-2002 jhb

- Change fill_kinfo_proc() to require that the process is locked when it
is called.
- Change sysctl_out_proc() to require that the process is locked when it
is called and to drop the lock before it returns. If this proves too
complex we can change sysctl_out_proc() to simply acquire the lock at
the very end and have the calling code drop the lock right after it
returns.
- Lock the process we are going to export before the p_cansee() in the
loop in sysctl_kern_proc() and hold the lock until we call
sysctl_out_proc().
- Don't call p_cansee() on the process about to be exported twice in
the aforementioned loop.


# 92461 16-Mar-2002 jake

Convert all pmap_kenter/pmap_kremove pairs in MI code to use pmap_qenter/
pmap_qremove. pmap_kenter is not safe to use in MI code because it is not
guaranteed to flush the mapping from the tlb on all cpus. If the process
in question is preempted and migrates cpus between the call to pmap_kenter
and pmap_kremove, the original cpu will be left with stale mappings in its
tlb. This is currently not a problem for i386 because we do not use PG_G on
SMP, and thus all mappings are flushed from the tlb on context switches, not
just user mappings. This is not the case on all architectures, and if PG_G
is to be used with SMP on i386 it will be a problem. This was committed by
peter earlier as part of his fine grained tlb shootdown work for i386, which
was backed out for other reasons.

Reviewed by: peter


# 92395 16-Mar-2002 des

Implement PT_IO (read / write arbitrary amounts of data or text).

Submitted by: Artur Grabowski <art@{blahonga,openbsd}.org>
Obtained from: OpenBSD


# 92369 15-Mar-2002 des

PT_[GS]ET{,DB,FP}REGS isn't really optional any more, since we have dummy
backend functions for those archs that don't support them. I meant to do
this ages ago, but never got around to it.

Inspired by: OpenBSD


# 91367 27-Feb-2002 peter

Back out all the pmap related stuff I've touched over the last few days.
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels. Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely. Userland programs still crashed.


# 91344 27-Feb-2002 peter

Jake further reduced IPI shootdowns on sparc64 in loops by using ranged
shootdowns in a couple of key places. Do the same for i386. This also
hides some physical addresses from higher levels and has it use the
generic vm_page_t's instead. This will help for PAE down the road.

Obtained from: jake (MI code, suggestions for MD part)


# 91140 23-Feb-2002 tanimura

Lock struct pgrp, session and sigio.

New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by: jhb, alfred
Tested on: cvsup.jp.FreeBSD.org
(which is a quad-CPU machine running -current)


# 91007 21-Feb-2002 bde

Fixed some style bugs. Added a comment about a bug in PT_SSTEP.

Approved by: des


# 91006 21-Feb-2002 bde

Recover bits that were lost in transition in rev.1.76:
- P_INMEM checks in all the functions. P_INMEM must be checked because
PHOLD() is broken. The old bits had bogus locking (using sched_lock)
to lock P_INMEM. After removing the P_INMEM checks, we were left with
just the bogus locking.
- large comments. They were too large, but better than nothing.

Remove obfuscations that were gained in transition in rev.1.76:
- PROC_REG_ACTION() is even more of an obfuscation than PROC_ACTION().

The change copies procfs_machdep.c rev.1.22 of i386/procfs_machdep.c
verbatim except for "fixing" the old-style function headers and adjusting
function names and comments. It doesn't remove the bogus locking.

Approved by: des


# 90391 08-Feb-2002 peter

Bah, I managed to turn cosmetic things into real bugs. Fix shadowed
variable declarations. :-( Definately not my day today.


# 90374 07-Feb-2002 peter

Fix a whole bunch of long lines introduced by previous commit by using
td = FIRST_THREAD_IN_PROC(p) once, after we have identified the process
that we are operating on.


# 90361 07-Feb-2002 julian

Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,


# 85297 21-Oct-2001 des

Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to
proc_* in the process; procfs_machdep.c is no longer needed.

Run-tested on i386, build-tested on Alpha, untested on other platforms.


# 84637 07-Oct-2001 des

Dissociate ptrace from procfs.

Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested. Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.

This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().

It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".


# 84483 04-Oct-2001 des

Final style(9) commit: placement of opening brace; a continuation indent I
missed in the previous commit; a line that exceeded 80 characters. No
functional changes, but the object file's md5 checksum changes because some
lines have been displaced.


# 84482 04-Oct-2001 des

More style(9) fixes: no spaces between function name and parameter list;
some indentation fixes (particularly continuation lines).

Reviewed by: md5(1)


# 84480 04-Oct-2001 des

This file had a mixture of "return foo;" and "return (foo);"; standardize
on "return (foo);" as mandated by style(9).

Reviewed by: md5(1)


# 83633 18-Sep-2001 mp

Set debug information on the process being traced, not the current (debugger)
process. This should allow gdb to function correctly on post-KSE kernels.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 81265 08-Aug-2001 peter

Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.

gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get. It carries its own #defines so it doesn't break
compiles.


# 79335 05-Jul-2001 rwatson

o Replace calls to p_can(..., P_CAN_xxx) with calls to p_canxxx().
The p_can(...) construct was a premature (and, it turns out,
awkward) abstraction. The individual calls to p_canxxx() better
reflect differences between the inter-process authorization checks,
such as differing checks based on the type of signal. This has
a side effect of improving code readability.
o Replace direct credential authorization checks in ktrace() with
invocation of p_candebug(), while maintaining the special case
check of KTR_ROOT. This allows ktrace() to "play more nicely"
with new mandatory access control schemes, as well as making its
authorization checks consistent with other "debugging class"
checks.
o Eliminate "privused" construct for p_can*() calls which allowed the
caller to determine if privilege was required for successful
evaluation of the access control check. This primitive is currently
unused, and as such, serves only to complicate the API.

Approved by: ({procfs,linprocfs} changes) des
Obtained from: TrustedBSD Project


# 77031 23-May-2001 ru

- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file
systems were repo-copied from sys/miscfs to sys/fs.

- Renamed the following file systems and their modules:
fdesc -> fdescfs, portal -> portalfs, union -> unionfs.

- Renamed corresponding kernel options:
FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.

- Install header files for the above file systems.

- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
Makefiles.


# 76274 04-May-2001 jhb

Fix a bug in the pfind() changes due to confusing the process returned by
pfind() ('pp') with the process being detached from ptrace.

Reported by: bde


# 76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# 75893 23-Apr-2001 jhb

Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by: -smp, dfr, jake


# 74927 28-Mar-2001 jhb

Convert the allproc and proctree locks from lockmgr locks to sx locks.


# 73917 07-Mar-2001 jhb

- Proc locking.
- Remove some unneeded spl()'s.


# 72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 71567 24-Jan-2001 jhb

- Catch up to proc flag changes.
- Update stopevent() to assert that the proc lock is held when it is
held and is not recursed. Note that the STOPEVENT() macro obtains
the proc lock when calling this function.


# 70530 30-Dec-2000 ps

Backout rev 1.57 & 1.58. While the previous revisions fixed
attaching to running processes, it completely breaks normal debugging.
A better fix is in the works, but cannot be properly tested until
the problem with gdb hanging the system in -current is solved.


# 70503 29-Dec-2000 ps

Pass me the pointy hat. Do not hold sched_lock over psignal.

Submitted by: alfred


# 70418 28-Dec-2000 ps

Send a SIGCONT when detaching or continuing the excution of a traced
process. This fixes a problem when attaching to a process in gdb
and the process staying in the STOP'd state after quiting gdb.
This whole process seems a bit suspect, but this seems to work.

Reviewed by: peter


# 70317 23-Dec-2000 jake

Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by: jhb, -smp@


# 69896 12-Dec-2000 mckusick

Change the proc information returned from the kernel so that it
no longer contains kernel specific data structures, but rather
only scalar values and structures that are already part of the
kernel/user interface, specifically rusage and rtprio. It no
longer contains proc, session, pcred, ucred, procsig, vmspace,
pstats, mtx, sigiolst, klist, callout, pasleep, or mdproc. If
any of these changed in size, ps, w, fstat, gcore, systat, and
top would all stop working. The new structure has over 200 bytes
of unassigned space for future values to be added, yet is nearly
100 bytes smaller per entry than the structure that it replaced.


# 69506 01-Dec-2000 jhb

Protect p_stat with sched_lock.


# 67107 14-Oct-2000 jwd

Remove the signal value check from the PT_STEP codepath. It
can cause an bogus failure.

Reviewed by: Sean Eric Fagan <sef@kithrup.com>
and no other response to the review request.


# 65237 30-Aug-2000 rwatson

o Centralize inter-process access control, introducing:

int p_can(p1, p2, operation, privused)

which allows specification of subject process, object process,
inter-process operation, and an optional call-by-reference privused
flag, allowing the caller to determine if privilege was required
for the call to succeed. This allows jail, kern.ps_showallprocs and
regular credential-based interaction checks to occur in one block of
code. Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL,
and P_CAN_DEBUG. p_can currently breaks out as a wrapper to a
series of static function checks in kern_prot, which should not
be invoked directly.

o Commented out capabilities entries are included for some checks.

o Update most inter-process authorization to make use of p_can() instead
of manual checks, PRISON_CHECK(), P_TRESPASS(), and
kern.ps_showallprocs.

o Modify suser{,_xxx} to use const arguments, as it no longer modifies
process flags due to the disabling of ASU.

o Modify some checks/errors in procfs so that ENOENT is returned instead
of ESRCH, further improving concealment of processes that should not
be visible to other processes. Also introduce new access checks to
improve hiding of processes for procfs_lookup(), procfs_getattr(),
procfs_readdir(). Correct a bug reported by bp concerning not
handling the CREATE case in procfs_lookup(). Remove volatile flag in
procfs that caused apparently spurious qualifier warnigns (approved by
bde).

o Add comment noting that ktrace() has not been updated, as its access
control checks are different from ptrace(), whereas they should
probably be the same. Further discussion should happen on this topic.

Reviewed by: bde, green, phk, freebsd-security, others
Approved by: bde
Obtained from: TrustedBSD Project


# 53518 21-Nov-1999 phk

Introduce the new function
p_trespass(struct proc *p1, struct proc *p2)
which returns zero or an errno depending on the legality of p1 trespassing
on p2.

Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one
extra signal related check.

Replace procfs.h:CHECKIO() macros with calls to p_trespass().

Only show command lines to process which can trespass on the target
process.


# 52635 29-Oct-1999 phk

useracc() the prequel:

Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.


# 52128 11-Oct-1999 peter

Trim unused options (or #ifdef for undoc options).

Submitted by: phk


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 48691 09-Jul-1999 jlemon

Implement support for hardware debug registers on the i386.

Submitted by: Brian Dean <brdean@unx.sas.com>


# 48430 01-Jul-1999 peter

Moving the initialization for write sooner quiets a warning.


# 46155 28-Apr-1999 phk

This Implements the mumbled about "Jail" feature.

This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

I have no scripts for setting up a jail, don't ask me for them.

The IP number should be an alias on one of the interfaces.

mount a /proc in each jail, it will make ps more useable.

/proc/<pid>/status tells the hostname of the prison for
jailed processes.

Quotas are only sensible if you have a mountpoint per prison.

There are no privisions for stopping resource-hogging.

Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/


# 46112 27-Apr-1999 phk

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 45105 29-Mar-1999 dfr

Call ptrace_u_check with the right size.


# 43301 27-Jan-1999 dillon

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile


# 42067 26-Dec-1998 dfr

Tweak ptrace(PT_READ_U) so that the last alpha register can be read.


# 37958 29-Jul-1998 dfr

Only access an int for READU/WRITEU since that is what ptrace is declared to
return.


# 37655 15-Jul-1998 bde

Cast function pointers to uintfptr_t before casting them to u_long.
Hopefully caddr_t is large enough to hold function pointers.

Cast object pointers to uintptr_t before casting them to u_long.

Types are wronger than usual for the PT_READ_U case. ptrace() can
only return ints, but longs are accessed.


# 36735 07-Jun-1998 dfr

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# 36168 18-May-1998 tegge

Disallow reading the current kernel stack. Only the user structure and
the current registers should be accessible.
Reviewed by: David Greenman <dg@root.com>


# 33134 06-Feb-1998 eivind

Back out DIAGNOSTIC changes.


# 33108 04-Feb-1998 eivind

Turn DIAGNOSTIC into a new-style option.


# 32702 22-Jan-1998 dyson

VM level code cleanups.

1) Start using TSM.
Struct procs continue to point to upages structure, after being freed.
Struct vmspace continues to point to pte object and kva space for kstack.
u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either
in the kernel or in a vmspace. The vmspaces are managed by reference
counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added
struct proc, struct vmspace and struct vnode. This saves a significant
amount of kva space and physical memory. Additionally, this enables
TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where
it allows us to avoid unneeded restart overhead during traversals, where
blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able
to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
(experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c
code. Use generation counts, get rid of unneded collpase operations,
and clean up the cluster code.
14) Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step. Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)


# 31564 06-Dec-1997 sef

Changes to allow event-based process monitoring and control.


# 31138 12-Nov-1997 tegge

Set return value for the correct process in ptrace().


# 30994 06-Nov-1997 phk

Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.


# 29041 02-Sep-1997 bde

Removed unused #includes.


# 25206 27-Apr-1997 alex

Remove bogon from previous commit: doubly included sys/systm.h.


# 25200 27-Apr-1997 alex

Prevent debugger attachment to init when securelevel > 0.

Noticed by: Brian Buchanan <brian@wasteland.calbbs.com>


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 16066 02-Jun-1996 dyson

Remove the now-unnecessary and incorrect wiring of the "other" processes
page table pages. The pmap layer now handles that fully.


# 15543 02-May-1996 phk

removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
NPDEPG

Major macro cleanup.


# 14925 30-Mar-1996 peter

Because of the way that ptrace() now calls procfs routines to read/write
the process's memory, it was possible for the procfs_domem() call to
return a residual leftover, but with no errno. Since this is no good for
ptrace which ignored the the residual, remap a leftover amount into an
errno rather than fooling the caller into thinking it was successful when
in fact it was not.

Submitted by: bde (a very long time ago :-)


# 13607 24-Jan-1996 peter

Major fixes for ptrace()...

PT_ATTACH/PT_DETACH implemented now and fully operational.
PT_{GET|SET}{REGS|FPREFS} implemented now, using code shared with procfs
PT_{READ|WRITE}_{I|D} now uses code shared with procfs
ptrace opcodes now fully permission checked, including ownerships.
doing an operation to the u-area on a swapped process should no longer
panic.
running gdb as root works for me now, where it didn't before.
general cleanup..

Note, that this has some tightening of permissions/access checks etc.
Some of these may be going too far.. In particular, the "owner" of the
traced process is enforced. The process that created or attached to
the traced process is now the only one that can "do" things to it.


# 13490 19-Jan-1996 dyson

Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to do alot of redundant
calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally,
fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code
will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
that happens every 30seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
buffers.


# 12903 17-Dec-1995 bde

Updated to match 1TB filesize changes. Some pindexes were still offsets
and weren't converted. ptrace() was broken.


# 12899 16-Dec-1995 bde

Removed dead debugging code.


# 12662 07-Dec-1995 dg

Untangled the vm.h include file spaghetti.


# 12278 14-Nov-1995 phk

Move the process-table stuff to a more suitable file.
Remove filetable stuff from kern_sysctl.c


# 12221 12-Nov-1995 bde

Included <sys/sysproto.h> to get central declarations for syscall args
structs and prototypes for syscalls.

Ifdefed duplicated decentralized declarations of args structs. It's
convenient to have this visible but they are hard to maintain. Some
are already different from the central declarations. 4.4lite2 puts
them in comments in the function headers but I wanted to avoid the
large changes for that.


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 8485 12-May-1995 dg

pread/pwrite() should be static.

Submitted by: sef


# 7090 16-Mar-1995 bde

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 6552 19-Feb-1995 dg

Truncate the pte address to a page boundry. This probably won't fix the
panic, but at least it's more correct.


# 6474 15-Feb-1995 dg

Fixed botched previous change - use 'pageno' not initialized to NULL 'kva'.

Submitted by: Lars Fredriksen


# 6307 10-Feb-1995 dg

Wire the page table before doing the vm_fault(). Fixes a panic that
happens when using gdb.

Submitted by: John Dyson


# 5603 14-Jan-1995 bde

Fix security holes in sigreturn(), ptrace() and procfs. sigreturn()
attempted to check for insecure and fatal eflags and segment
selectors, but missed many cases and got the IOPL check back to
front. The other syscalls didn't check at all.

sys_process.c, machdep.c:
Only allow PT_WRITE_U to write to the registers (ordinary and FP).

psl.h, locore.s, machdep.c:
Eliminate PSL_MBZ, PSL_MBO and PSL_USERCLR. We are not supposed
to assume anything about the reserved bits. Use PSL_USERCHANGE
and PSL_KERNEL instead. Rename PSL_USERSET to PSL_USER.

exception.s:
Define a private label for use by doreti when returning to user
mode fails.

machdep.c:
In syscalls, allow changing only the eflags that can be changed on
486's in user mode (no longer attempt to allow benign IOPL changes;
allow changing the nasty PSL_NT; don't allow changing the i586
bits).

Don't attempt to check all the cases involving invalid selectors
and %eip's. Just check for privilege violations and let the invalid
things cause a trap.

procfs_machdep.c:
Call the ptrace register functions to do all the work for reading
and writing ordinary registers and for single stepping.

trap.c:
Ignore traps caused by PSL_NT being set. Previously, users could
cause a fatal trap in user mode by setting PSL_NT and executing an
iret, and a fatal trap in kernel mode by setting PSL_NT and making
a syscall. PSL_NT was cleared too late and not in enough modes to
fix the problem.

Make all traps in user mode (except T_NMI) nonfatal.

Recover from traps caused by attempting to load invalid user
registers in doreti by restarting the traps so that they appear to
occur in user mode.
---

Fix bogons that I noticed while fixing the above:

psl.h:
Fix some comments.

Uniformize idempotency ifdef.

exception.s, machdep.c:
Remove rsvd[0-14]. rsvd0 hasn't been reserved since the 486 came
out. Replace rsvd0 by `align'. rsvd[0-11] used wrong (magic
non-unique) trap numbers. Replace rsvd[1-14] by rsvd.

locore.s:
Enable alignment check flag on 486's and 586's.

machdep.c:
Use a better type for kstack[].

Use TFREGP() to find the registers.

Reformat ptrace functions from SEF to something closer to KNF.

procfs_machdep.c:
The wrong pointer to the registers got fixed as a side effect.

Implement reading and writing of FP registers.

/proc/*/*regs now work (only) for processes that are in memory.

Clean up comments.

trap.c, trap.h:
Remove unused trap types.


# 3098 25-Sep-1994 phk

While in the real world, I had a bad case of being swapped out for a lot of
cycles. While waiting there I added a lot of the extra ()'s I have, (I have
never used LISP to any extent). So I compiled the kernel with -Wall and
shut up a lot of "suggest you add ()'s", removed a bunch of unused var's
and added a couple of declarations here and there. Having a lap-top is
highly recommended. My kernel still runs, yell at me if you kernel breaks.


# 2112 18-Aug-1994 wollman

Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.


# 2056 13-Aug-1994 wollman

Change all #includes to follow the current Berkeley style. Some of these
``changes'' are actually not changes at all, but CVS sometimes has trouble
telling the difference.

This also includes support for second-directory compiles. This is not
quite complete yet, as `config' doesn't yet do the right thing. You can
still make it work trivially, however, by doing the following:

rm /sys/compile
mkdir /usr/obj/sys/compile
ln -s M-. /sys/compile
cd /sys/i386/conf
config MYKERNEL
cd ../../compile/MYKERNEL
ln -s /sys @
rm machine
ln -s @/i386/include machine
make depend
make


# 1945 08-Aug-1994 dg

Process tracing code. Written by Sean Eric Fagan.

Submitted by: Sean Eric Fagan


# 1817 02-Aug-1994 dg

Added $Id$


# 1549 25-May-1994 rgrimes

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# 1542 24-May-1994 rgrimes

This commit was generated by cvs2svn to compensate for changes in r1541,
which included commits to RCS files with non-trunk default branches.


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources