History log of /freebsd-11-stable/sys/kern/kern_linker.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 359652 06-Apr-2020 hselasky

MFC r333806:
Use NULL for SYSINIT's last arg, which is a pointer type

Sponsored by: The FreeBSD Foundation


# 355418 05-Dec-2019 hselasky

MFC r355108 and r355170:
Fix panic when loading kernel modules before root file system is mounted.
Make sure the rootvnode is always NULL checked.

Differential Revision: https://reviews.freebsd.org/D22545
PR: 241639
Sponsored by: Mellanox Technologies


# 340052 02-Nov-2018 bz

MFC r339407:

The countp argument passed to linker_file_lookup_set() in
linker_load_dependencies() is unused, so no need to ask for the
value in first place. Remove the unused "count" variable.


# 325866 15-Nov-2017 gordon

MFC r325865

Properly bzero kldstat structure to prevent kernel information leak.

Security: FreeBSD-SA-17:10.kldstat
Security: CVE-2017-1088


# 324748 19-Oct-2017 avg

MFC r324311: sysctl-s in a module should be accessible only when the module is initialized

Sponsored by: Panzura


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 301751 09-Jun-2016 cem

Add DDB command "kldstat"

It prints much the same information as kldstat(8) without any arguments.

Suggested by: jhibbits
Sponsored by: EMC / Isilon Storage Division


# 298819 29-Apr-2016 pfg

sys/kern: spelling fixes in comments.

No functional change.


# 298433 21-Apr-2016 pfg

sys: use our roundup2/rounddown2() macros when param.h is available.

rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.


# 297391 29-Mar-2016 trasz

Remove some NULL checks for M_WAITOK allocations.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 292077 11-Dec-2015 imp

Create the MDT_PNP_INFO metadata record to communicate PNP info about
modules. External agents may use this data to automatically load those
modules.

Differential Review: https://reviews.freebsd.org/D3461


# 290728 12-Nov-2015 jhb

Export various helper variables describing the layout and size of
certain kernel structures for use by debuggers. This mostly aids
in examining cores from a kernel without debug symbols as a debugger
can infer these values if debug symbols are available.

One set of variables describes the layout of 'struct linker_file' to
walk the list of loaded kernel modules.

A second set of variables describes the layout of 'struct proc' and
'struct thread' to walk the list of processes in the kernel and the
threads in each process.

The 'pcb_size' variable is used to index into the stoppcbs[] array.

The 'vm_maxuser_address' is used to distinguish kernel virtual addresses
from user addresses. This doesn't have to be perfect, and
'vm_maxuser_address' is a cheap and simple way to differentiate kernel
pointers from simple values like TIDs and PIDs.

While here, annotate the fields in struct pcb used by kgdb on amd64
and i386 to note that their ABI should be preserved. Annotations for
other platforms will be added in the future.

Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D3773


# 287835 15-Sep-2015 mjg

sysctl: switch sysctllock to a sleepable rmlock, take 2

This restores r285125. Previous attempt was reverted due to a bug in rmlocks,
which is fixed since r287833.


# 286094 30-Jul-2015 mjg

Revert r285125 until rmlocks get fixed.

Right now there is a chance that sysctl unregister will cause reader to
block on the sx lock associated with sysctl rmlock, in which case kernels
with debug enabled will panic.


# 285125 04-Jul-2015 mjg

sysctl: switch sysctllock to a sleepable rmlock

The lock is almost never taken for writing.


# 284160 08-Jun-2015 jhb

Revert r284153, as I believe it breaks the dtrace sdt module. I will
fix the original issue a different way.


# 284153 08-Jun-2015 jhb

Add an internal "locked" variant of linker_file_lookup_set() and change
the public function to acquire the global linker lock directly. This
permits linker_file_lookup_set() to be safely used from other modules.


# 275469 03-Dec-2014 imp

Const poison in a few places to ensure we don't modify things
through the module data pointer.


# 275261 29-Nov-2014 imp

The current limit of 100k for the linker hints file is getting a bit
crowded as we now are at about 70k. Bump the limit to 1MB instead
which is still quite a reasonable limit and allows for future growth
of this file and possible future expansion to additional data.

MFC After: 2 weeks


# 273431 21-Oct-2014 mjg

Take the lock shared in linker_search_symbol_name.

This helps sysctl kern.proc.stack.


# 273400 21-Oct-2014 mjg

Rename sysctl_lock and _unlock to sysctl_xlock and _xunlock.


# 273334 20-Oct-2014 marcel

Fully support constructors for the purpose of code coverage analysis.
This involves:
1. Have the loader pass the start and size of the .ctors section to the
kernel in 2 new metadata elements.
2. Have the linker backends look for and record the start and size of
the .ctors section in dynamically loaded modules.
3. Have the linker backends call the constructors as part of the final
work of initializing preloaded or dynamically loaded modules.

Note that LLVM appends the priority of the constructors to the name of
the .ctors section. Not so when compiling with GCC. The code currently
works for GCC and not for LLVM.

Submitted by: Dmitry Mikulin <dmitrym@juniper.net>
Obtained from: Juniper Networks, Inc.


# 267992 28-Jun-2014 hselasky

Pull in r267961 and r267973 again. Fix for issues reported will follow.


# 267985 27-Jun-2014 gjb

Revert r267961, r267973:

These changes prevent sysctl(8) from returning proper output,
such as:

1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory


# 267961 27-Jun-2014 hselasky

Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after: 2 weeks
Sponsored by: Mellanox Technologies


# 264173 05-Apr-2014 kib

Use realloc(9) instead of doing the reallocation inline.

Submitted by: bde
MFC after: 1 week


# 263080 12-Mar-2014 kib

Use correct types for sizeof() in the calculations for the malloc(9) sizes [1].
While there, remove unneeded checks for failed allocations with M_WAITOK flag.

Submitted by: Conrad Meyer <cemeyer@uw.edu> [1]
MFC after: 1 week


# 259587 19-Dec-2013 markj

Invoke the kld_* event handlers from linker_load_file() and
linker_unload_file() rather than kern_kldload() and kern_kldunload(). This
ensures that the handlers are invoked for files that are loaded/unloaded
automatically as dependencies. Previously, they were only invoked for files
loaded by a user.

As a side effect, the kld_load and kld_unload handlers are now invoked with
the kernel linker lock exclusively held.

Reported by: avg
Reviewed by: jhb
MFC after: 2 weeks


# 254813 24-Aug-2013 markj

Rename the kld_unload event handler to kld_unload_try, and add a new
kld_unload event handler which gets invoked after a linker file has been
successfully unloaded. The kld_unload and kld_load event handlers are now
invoked with the shared linker lock held, while kld_unload_try is invoked
with the lock exclusively held.

Convert hwpmc(4) to use these event handlers instead of having
kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are
loaded or unloaded. This has no functional effect, but simplifes the linker
code somewhat.

Reviewed by: jhb


# 254811 24-Aug-2013 markj

Set things up so that linker_file_lookup_set() is always called with the
linker lock held. This makes it possible to call it from a kld event handler
with the shared lock held.

Reviewed by: jhb


# 254810 24-Aug-2013 markj

Remove the kld lock macros and just use the sx(9) API. Add locking in
linker_init_kernel_modules() and linker_preload() in order to remove most
of the checks for !cold before asserting that the kld lock is held. These
routines are invoked by SYSINIT(9), so there's no harm in them taking the
kld lock.


# 254396 16-Aug-2013 markj

Use strdup(9) instead of reimplementing it.


# 254309 13-Aug-2013 markj

Use kld_{load,unload} instead of mod_{load,unload} for the linker file load
and unload event handlers added in r254266.

Reported by: jhb
X-MFC with: r254266


# 254268 13-Aug-2013 markj

FreeBSD's DTrace implementation has a few problems with respect to handling
probes declared in a kernel module when that module is unloaded. In
particular,

* Unloading a module with active SDT probes will cause a panic. [1]
* A module's (FBT/SDT) probes aren't destroyed when the module is unloaded;
trying to use them after the fact will generally cause a panic.

This change fixes both problems by porting the DTrace module load/unload
handlers from illumos and registering them with the corresponding
EVENTHANDLER(9) handlers. This allows the DTrace framework to destroy all
probes defined in a module when that module is unloaded, and to prevent a
module unload from proceeding if some of its probes are active. The latter
problem has already been fixed for FBT probes by checking lf->nenabled in
kern_kldunload(), but moving the check into the DTrace framework generalizes
it to all kernel providers and also fixes a race in the current
implementation (since a probe may be activated between the check and the
call to linker_file_unload()).

Additionally, the SDT implementation has been reworked to define SDT
providers/probes/argtypes in linker sets rather than using SYSINIT/SYSUNINIT
to create and destroy SDT probes when a module is loaded or unloaded. This
simplifies things quite a bit since it means that pretty much all of the SDT
code can live in sdt.ko, and since it becomes easier to integrate SDT with
the DTrace framework. Furthermore, this allows FreeBSD to be quite flexible
in that SDT providers spanning multiple modules can be created on the fly
when a module is loaded; at the moment it looks like illumos' SDT
implementation requires all SDT probes to be statically defined in a single
kernel table.

PR: 166927, 166926, 166928
Reported by: davide [1]
Reviewed by: avg, trociny (earlier version)
MFC after: 1 month


# 254267 13-Aug-2013 markj

Remove some unused fields from struct linker_file. They were added in
r172862 for use by the DTrace SDT framework but don't seem to have ever
been used.

MFC after: 2 weeks


# 254266 13-Aug-2013 markj

Add event handlers for module load and unload events. The load handlers are
called after the module has been loaded, and the unload handlers are called
before the module is unloaded. Moreover, the module unload handlers may
return an error to prevent the unload from proceeding.

Reviewed by: avg
MFC after: 2 weeks


# 241896 22-Oct-2012 kib

Remove the support for using non-mpsafe filesystem modules.

In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by: attilio
Tested by: pho


# 234186 12-Apr-2012 jhb

If a linker file contains at least one module, but all of the modules
fail to load (the MOD_LOAD event fails) during a kldload(2), unload the
linker file and fail the kldload(2) with ENOEXEC.

Reported by: gcooper
MFC after: 1 week


# 233295 22-Mar-2012 ae

Correct debug message.


# 233276 21-Mar-2012 ae

Acquire modules lock before call module_getname() in the KLD_DEBUG case.

MFC after: 1 week


# 232999 15-Mar-2012 ae

Add CTLFLAG_TUN to the sysctl definition and fix style.

Pointed by: Garrett Cooper
MFC after: 2 weeks


# 232997 15-Mar-2012 ae

Add debug.kld_debug loader tunable.

MFC after: 2 weeks


# 231949 20-Feb-2012 kib

Fix found places where uio_resid is truncated to int.

Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with: bde, das (previous versions)
MFC after: 1 month


# 231931 20-Feb-2012 delphij

Revert r231923 for now. Further work is needed to make sure that the
behavior is consistent.


# 231923 19-Feb-2012 delphij

Use uprintf instead of printf for the reason why a kernel module can not
be loaded. This way, the administrator can get response immediately from
the shell session rather than relying on dmesg.

MFC after: 1 month


# 229272 02-Jan-2012 ed

Use strchr() and strrchr().

It seems strchr() and strrchr() are used more often than index() and
rindex(). Therefore, simply migrate all kernel code to use it.

For the XFS code, remove an empty line to make the code identical to
the code in the Linux kernel.


# 227151 06-Nov-2011 fjoe

Add KLD_DEBUG option.


# 225617 16-Sep-2011 kmacy

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


# 224546 31-Jul-2011 glebius

Don't leak kld_sx lock in kldunloadf().

Approved by: re (kib)


# 224156 17-Jul-2011 rstone

Fix a LOR between hwpmc and the kernel linker. When a system-wide
sampling mode PMC is allocated, hwpmc calls linker_hwpmc_list_objects()
while already holding an exclusive lock on pmc-sx lock. list_objects()
tries to acquire an exclusive lock on the kld_sx lock. When a KLD module
is loaded or unloaded successfully, kern_kld(un)load calls into the pmc
hook while already holding an exclusive lock on the kld_sx lock. Calling
the pmc hook requires acquiring a shared lock on the pmc-sx lock.

Fix this by only acquiring a shared lock on the kld_sx lock in
linker_hwpmc_list_objects(), and also downgrading to a shared lock on the
kld_sx lock in kern_kld(un)load before calling into the pmc hook. In
kern_kldload this required moving some modifications of the linker_file_t
to happen before calling into the pmc hook.

This fixes the deadlock by ensuring that the hwpmc -> list_objects() case
is always able to proceed. Without this patch, I was able to deadlock a
multicore system within minutes by constantly loading and unloading an KLD
module while I simultaneously started a sampling mode PMC in a loop.

MFC after: 1 month


# 220158 30-Mar-2011 kib

Provide compat32 shims for kldstat(2).

Requested and tested by: jpaetzel
MFC after: 1 week


# 217555 18-Jan-2011 mdf

Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need
to rely on the format string.


# 216988 05-Jan-2011 trasz

Fix page fault that occurred when trying to initialize preloaded kernel module,
the dependency of which was preloaded, but failed to initialize. Previously,
kernel dereferenced NULL pointer returned by modlist_lookup2(); now, when this
happens, we unload the dependent module. Since the depended_files list is
sorted in dependency order, this properly propagates, unloading modules that
depend on failed ones.

From the user point of view, this prevents the kernel from panicing when
trying to boot kernel compiled without KDTRACE_HOOKS with dtraceall_load="YES"
in /boot/loader.conf.

Reviewed by: kib


# 212994 22-Sep-2010 avg

kdb_backtrace: use stack_print_ddb instead of stack_print

This is a followup to r212964.
stack_print call chain obtains linker sx lock and thus potentially may
lead to a deadlock depending on a kind of a panic.
stack_print_ddb doesn't acquire any locks and it doesn't use any
facilities of ddb backend.
Using stack_print_ddb outside of DDB ifdef required taking a number of
helper functions from under it as well.

It is a good idea to rename linker_ddb_* and stack_*_ddb functions to
have 'unlocked' component in their name instead of 'ddb', because those
functions do not use any DDB services, but instead they provide unlocked
access to linker symbol information. The latter was previously needed
only for DDB, hence the 'ddb' name component.

Alternative is to ditch unlocked versions altogether after implementing
proper panic handling:
1. stop other cpus upon a panic
2. make all non-spinlock lock operations (mutex, sx, rwlock) be a no-op
when panicstr != NULL

Suggested by: mdf
Discussed with: attilio
MFC after: 2 weeks


# 199457 17-Nov-2009 gonzo

- Unbreak build with KLD_DEBUG defined
- Add debug.kld_debug sysctl to control KLD debugging level
- Print information about KLD dependencies with debug enabled


# 196970 08-Sep-2009 phk

Revert previous commit and add myself to the list of people who should
know better than to commit with a cat in the area.


# 196969 08-Sep-2009 phk

Add necessary include.


# 196019 01-Aug-2009 rwatson

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# 195803 21-Jul-2009 rpaulo

Improve the printf message when a module failed to load. This gives the
user some clue about the possibility of a __FreeBSD_version mismatch.

Discussed with: rwatson, jhb
Approved by: re (kib)


# 195741 17-Jul-2009 jamie

Remove the interim vimage containers, struct vimage and struct procg,
and the ioctl-based interface that supported them.

Approved by: re (kib), bz (mentor)


# 195699 14-Jul-2009 rwatson

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 195159 29-Jun-2009 attilio

Don't assume a default (currently 15) value for preloaded klds when
loading hwpmc, but calculate at runtime and allocate the necessary space.
Also the current logic is wrong as it can lead to an endless loop.

Sponsored by: Sandvine Incorporated
Reported by: Ryan Stone <rstone at sandvine dot com>
Tested by: Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
Approved by: re (kib)


# 193511 05-Jun-2009 rwatson

Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with: pjd


# 192895 27-May-2009 jamie

Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails. Child jails may be restricted more than their parents,
but never less. Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system. Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings. The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by: bz (mentor)


# 191917 08-May-2009 zec

A NOP change: style / whitespace cleanup of the noise that slipped
into r191816.

Spotted by: bz
Approved by: julian (mentor) (an earlier version of the diff)


# 191915 08-May-2009 zec

Introduce a new virtualization container, provisionally named vprocg, to hold
virtualized instances of hostname and domainname, as well as a new top-level
virtualization struct vimage, which holds pointers to struct vnet and struct
vprocg. Struct vprocg is likely to become replaced in the near future with
a new jail management API import.

As a consequence of this change, change struct ucred to point to a struct
vimage, instead of directly pointing to a vnet.

Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage
branch.

Permit kldload / kldunload operations to be executed only from the default
vimage context.

This change should have no functional impact on nooptions VIMAGE kernel
builds.

Reviewed by: bz
Approved by: julian (mentor)


# 191816 05-May-2009 zec

Change the curvnet variable from a global const struct vnet *,
previously always pointing to the default vnet context, to a
dynamically changing thread-local one. The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE(). Recursions
on curvnet are permitted, though strongly discuouraged.

This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.

The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.

The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, whith an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typicall cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.

This change also introduces a DDB subcommand to show the list of all
vnet instances.

Approved by: julian (mentor)


# 188440 10-Feb-2009 attilio

Scanning all the formats for binary translation of modules loading can
result in errors for a format loading but subsequent correct recognizing
for another format.

File format loading functions should avoid printing any additional
informations but just returning appropriate (and different between each
other) error condition, characterizing different informations.
Additively, the linker should handle appropriately different format
loading errors.

While a general mechanism is desired, fix a simple and common case on
amd64: file type is not recognized for link elf and confuses the linker.
Printout an error if all the registered linker classes can't recognize
and load the module.

Reviewed by: jhb
Sponsored by: Sandvine Incorporated


# 188232 06-Feb-2009 jhb

Expand the scope of the sysctllock sx lock to protect the sysctl tree itself.
Back in 1.1 of kern_sysctl.c the sysctl() routine wired the "old" userland
buffer for most sysctls (everything except kern.vnode.*). I think to prevent
issues with wiring too much memory it used a 'memlock' to serialize all
sysctl(2) invocations, meaning that only one user buffer could be wired at
a time. In 5.0 the 'memlock' was converted to an sx lock and renamed to
'sysctl lock'. However, it still only served the purpose of serializing
sysctls to avoid wiring too much memory and didn't actually protect the
sysctl tree as its name suggested. These changes expand the lock to actually
protect the tree.

Later on in 5.0, sysctl was changed to not wire buffers for requests by
default (sysctl_handle_opaque() will still wire buffers larger than a single
page, however). As a result, user buffers are no longer wired as often.
However, many sysctl handlers still wire user buffers, so it is still
desirable to serialize userland sysctl requests. Kernel sysctl requests
are allowed to run in parallel, however.

- Expose sysctl_lock()/sysctl_unlock() routines to exclusively lock the
sysctl tree for a few places outside of kern_sysctl.c that manipulate
the sysctl tree directly including the kernel linker and vfs_register().
- sysctl_register() and sysctl_unregister() require the caller to lock
the sysctl lock using sysctl_lock() and sysctl_unlock(). The rest of
the public sysctl API manage the locking internally.
- Add a locked variant of sysctl_remove_oid() for internal use so that
external uses of the API do not need to be aware of locking requirements.
- The kernel linker no longer needs Giant when manipulating the sysctl
tree.
- Add a missing break to the loop in vfs_register() so that we stop looking
at the sysctl MIB once we have changed it.

MFC after: 1 month


# 188209 05-Feb-2009 jhb

Drop the kernel linker lock while running SYSUNINIT routines and removing
sysctls during a linker file unload. We drop the lock when doing similar
operations during a linker file load. To close races, clear the LINKED
flag before dropping the lock so that the linker file is no longer visible
to userland.

MFC after: 1 week


# 185895 10-Dec-2008 zec

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 185635 05-Dec-2008 jhb

- Invoke MOD_QUIESCE on all modules in a linker file (kld) before
unloading any modules. As a result, if any module veto's an unload
request via MOD_QUIESCE, the entire set of modules for that linker
file will remain loaded and active now rather than leaving the kld
in a weird state where some modules are loaded and some are unloaded.
- This also moves the logic for handling the "forced" unload flag out of
kern_module.c and into kern_linker.c which is a bit cleaner.
- Add a module_name() routine that returns the name of a module and use that
instead of printing pointer values in debug messages when a module fails
MOD_QUIESCE or MOD_UNLOAD.

MFC after: 1 month


# 184214 23-Oct-2008 des

Fix a number of style issues in the MALLOC / FREE commit. I've tried to
be careful not to fix anything that was already broken; the NFSv4 code is
particularly bad in this respect.


# 184205 23-Oct-2008 des

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# 182371 28-Aug-2008 attilio

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 179238 23-May-2008 jb

Add the ctf_get function and update the args to linker_file_function_listall.


# 178380 21-Apr-2008 pjd

Back-out previous revision. For now I can use _ddb() variants of stack(9) KPI,
as I use it for debugging only. Once someone will need it for more production
features, the change should be reconsider.

Requested by: rwatson


# 178284 17-Apr-2008 pjd

Allow linker_search_symbol_name() to be called with KLD lock held.
The linker_search_symbol_name() function is used by stack_print()
and stack_print() can be called from kernel module unload method.

MFC after: 1 week


# 177253 16-Mar-2008 rwatson

In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation. This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after: 1 month
Discussed with: imp, rink


# 175294 13-Jan-2008 attilio

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


# 174132 01-Dec-2007 rwatson

The kernel linker includes a number of utility functions to look up symbol
information in support of DDB(4); these functions bypass normal linker
locking as they may run in contexts where locking is unsafe (such as the
kernel debugger).

Add a new interface linker_ddb_search_symbol_name(), which looks up a
symbol name and offset given an address, and also
linker_search_symbol_name() which does the same but *does* follow the
locking conventions of the linker.

Unlike existing functions, these functions place the name in a
caller-provided buffer, which is stable even after linker locks have been
released. These functions will be used in upcoming revisions to stack(9)
to support kernel stack trace generation in contexts as part of a live,
rather than suspended, kernel.


# 173714 17-Nov-2007 jb

Add a function to list symbols in a file and their values at the
same time rather than having to list the symbols and then go back
and look each one up by name.


# 172930 24-Oct-2007 rwatson

Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

mac_<object>_<method/action>
mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer


# 172862 22-Oct-2007 jb

Add the full module path name to the kld_file_stat structure
for kldstat(2).

This allows libdtrace to determine the exact file from which
a kernel module was loaded without having to guess.

The kldstat(2) API is versioned with the size of the
kld_file_stat structure, so this change creates version 2.

Add the pathname to the verbose output of kldstat(8) too.

MFC: 3 days


# 170152 31-May-2007 kib

Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)


# 168951 22-Apr-2007 rwatson

Remove MAC Framework access control check entry points made redundant with
the introduction of priv(9) and MAC Framework entry points for privilege
checking/granting. These entry points exactly aligned with privileges and
provided no additional security context:

- mac_check_sysarch_ioperm()
- mac_check_kld_unload()
- mac_check_settime()
- mac_check_system_nfsd()

Add mpo_priv_check() implementations to Biba and LOMAC policies, which,
for each privilege, determine if they can be granted to processes
considered unprivileged by those two policies. These mostly, but not
entirely, align with the set of privileges granted in jails.

Obtained from: TrustedBSD Project


# 167211 04-Mar-2007 rwatson

Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.


# 167019 26-Feb-2007 jhb

Fix a comment.


# 166921 23-Feb-2007 jhb

Drop the global kernel linker lock while executing the sysinit's for a
freshly-loaded kernel module. To avoid various unload races, hide linker
files whose sysinit's are being run from userland so that they can't be
kldunloaded until after all the sysinit's have finished.

Tested by: gallatin


# 164033 06-Nov-2006 rwatson

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# 163606 22-Oct-2006 rwatson

Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from: TrustedBSD Project
Sponsored by: SPARTA


# 160245 10-Jul-2006 jhb

Explicitly use STAILQ_REMOVE_HEAD() when we know we are removing the head
element to avoid confusing Coverity. It's now also easier for humans to
parse as well.

Found by: Coverity Prevent(tm)
CID: 1201


# 160244 10-Jul-2006 jhb

Fix two more instances of using a linker_file_t object in TAILQ() macros
after free'ing it.

Found by: Coverity Prevent(tm)
CID: 1435


# 160242 10-Jul-2006 jhb

Don't try to reuse the linker_file structure after we've freed it when
throwing out the kld's loaded by the loader that didn't successfully link.

Found by: Coverity Prevent(tm)
CID: 1435


# 160142 06-Jul-2006 jhb

- Explicitly acquire Giant around SYSINIT's and SYSUNINIT's since they are
not all known to be MPSAFE yet.
- Actually remove Giant from the kernel linker by taking it out of the
KLD_LOCK() and KLD_UNLOCK() macros.

Pointy hat to: jhb (2)


# 159845 21-Jun-2006 jhb

Replace the kld_mtx mutex with a kld_sx sx lock and expand it's scope to
protect all linker-related data structures including the contents of
linker file objects and the any linker class data as well. Considering how
rarely the linker is used I just went with the simple solution of
single-threading the whole thing rather than expending a lot of effor on
something more fine-grained and complex. Giant is still explicitly
acquired while registering and deregistering sysctl's as well as in the
elf linker class while calling kmupetext(). The rest of the linker runs
without Giant unless it has to acquire Giant while loading files from a
non-MPSAFE filesystem.


# 159843 21-Jun-2006 jhb

- Push down Giant in kldfind() and kldsym().
- Remove several goto's by either using direct return's or else clauses.


# 159841 21-Jun-2006 jhb

Fix two comments and a style fix.


# 159840 21-Jun-2006 jhb

Various whitespace fixes.


# 159808 20-Jun-2006 jhb

Conditionally acquire Giant around VFS operations.


# 159804 20-Jun-2006 jhb

- Push Giant down into linker_reference_module().
- Add a new function linker_release_module() as a more intuitive complement
to linker_reference_module() that wraps linker_file_unload().
linker_release_module() can either take the module name and version info
passed to linker_reference_module() or it can accept the linker file
object returned by linker_reference_module().


# 159800 20-Jun-2006 jhb

Make linker_find_file_by_name() and linker_find_file_by_id() static to
simplify linker locking. The only external consumers now use
linker_file_foreach().


# 159797 20-Jun-2006 jhb

- Add a new linker_file_foreach() function that walks the list of linker
file objects calling a user-specified predicate function on each object.
The iteration terminates either when the entire list has been iterated
over or the predicate function returns a non-zero value.
linker_file_foreach() returns the value returned by the last invocation
of the predicate function. It also accepts a void * context pointer that
is passed to the predicate function as well. Using an iterator function
avoids exposing linker internals to the rest of the kernel making locking
simpler.
- Use linker_file_foreach() instead of walking the list of linker files
manually to lookup ndis files in ndis(4).
- Use linker_file_foreach() to implement linker_hwpmc_list_objects().


# 159796 20-Jun-2006 jhb

Make linker_file_add_dependency() and linker_load_module() static since
only the linker uses them.


# 159794 20-Jun-2006 jhb

Don't check if malloc(M_WAITOK) returns NULL.


# 159792 20-Jun-2006 jhb

Use 'else' to remove another goto.


# 159791 20-Jun-2006 jhb

- Remove some useless variable initializations.
- Make some conditional free()'s where the condition was always true
unconditional.


# 159596 14-Jun-2006 marcel

Unbreak 64-bit architectures. The 3rd argument to kern_kldload() is
a pointer to an integer and td->td_retval[0] is of type register_t.
On 64-bit architectures register_t is wider than an integer.


# 159588 13-Jun-2006 jhb

- Add a kern_kldload() that is most of the previous kldload() and push
Giant down in it.
- Push Giant down in kern_kldunload() and reorganize it slightly to avoid
using gotos. Also, expose this function to the rest of the kernel.


# 159587 13-Jun-2006 jhb

- Push down Giant some in kldstat().
- Use a 'struct kld_file_stat' on the stack to read data under the lock
and then do one copyout() w/o holding the lock at the end to push the
data out to userland.


# 159586 13-Jun-2006 jhb

Unexpand TAILQ_FOREACH() and TAILQ_FOREACH_SAFE().


# 159585 13-Jun-2006 jhb

Remove some more pointless goto's and don't check to see if
malloc(M_WAITOK) returns NULL.


# 159584 13-Jun-2006 jhb

Handle the simple case of just dropping a reference near the start of
linker_file_unload() instead of in the middle of a bunch of code for
the case of dropping the last reference to improve readability and sanity.
While I'm here, remove pointless goto's that were just jumping to a
return statement.


# 158972 27-May-2006 delphij

extlen and cpp is not used here in linker_search_kld(), so nuke them.

Reported by: Mingyan Guo <guomingyan at gmail dot com>
MFC After: 2 weeks


# 157144 26-Mar-2006 jkoshy

MFP4: Support for profiling dynamically loaded objects.

Kernel changes:

Inform hwpmc of executable objects brought into the system by
kldload() and mmap(), and of their removal by kldunload() and
munmap(). A helper function linker_hwpmc_list_objects() has been
added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve
the list of currently loaded kernel modules.

The unused `MAPPINGCHANGE' event has been deprecated in favour
of separate `MAP_IN' and `MAP_OUT' events; this change reduces
space wastage in the log.

Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to
handle the map change callbacks.

Change the default per-cpu sample buffer size to hold
32 samples (up from 16).

Increment __FreeBSD_version.

libpmc(3) changes:

Update libpmc(3) to deal with the new events in the log file; bring
the pmclog(3) manual page in sync with the code.

pmcstat(8) changes:

Introduce new options to pmcstat(8): "-r" (root fs path), "-M"
(mapfile name), "-q"/"-v" (verbosity control). Option "-k" now
takes a kernel directory as its argument but will also work with
the older invocation syntax.

Rework string handling in pmcstat(8) to use an opaque type for
interned strings. Clean up ELF parsing code and add support for
tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).

Report statistics at the end of a log conversion run depending
on the requested verbosity level.

Reviewed by: jhb, dds (kernel parts of an earlier patch)
Tested by: gallatin (earlier patch)


# 151484 19-Oct-2005 jdp

Fix a bug in the kernel module runtime linker that made it impossible
to unload the usb.ko module after boot if it was originally preloaded
from "/boot/loader.conf". When processing preloaded modules, the
linker erroneously added self-dependencies the each module's reference
count. That prevented usb.ko's reference count from ever going to 0,
so it could not be unloaded.

Sponsored by Isilon Systems.

Reviewed by: pjd, peter
MFC after: 1 week


# 146733 28-May-2005 pjd

Fix panic when module is compiled in and it is loaded from loader.conf.
Only panic is fixed, module will be still listed in kldstat(8) output.
Not sure what is correct fix, because adding unloading code in case of
failure to linker_init_kernel_modules() doesn't work.


# 146730 28-May-2005 pjd

Prevent loading modules with are compiled into the kernel.

PR: kern/48759
Submitted by: Pawe³ Ma³achowski <pawmal@unia.3lo.lublin.pl>
Patch from: demon
MFC after: 2 weeks


# 144443 31-Mar-2005 jhb

- Denote a few places where kobj class references are manipulated without
holding the appropriate lock.
- Add a comment explaining why we bump a driver's kobj class reference
when loading a module.


# 134364 26-Aug-2004 iedowse

When trying each linker class in turn with a preloaded module, exit
the loop if the preload was successful. Previously a successful
preload was ignored if the linker class was not the last in the
list.


# 132117 13-Jul-2004 phk

Give kldunload a -f(orce) argument.

Add a MOD_QUIESCE event for modules. This should return error (EBUSY)
of the module is in use.

MOD_UNLOAD should now only fail if it is impossible (as opposed to
inconvenient) to unload the module. Valid reasons are memory references
into the module which cannot be tracked down and eliminated.

When kldunloading, we abandon if MOD_UNLOAD fails, and if -force is
not given, MOD_QUIESCE failing will also prevent the unload.

For backwards compatibility, we treat EOPNOTSUPP from MOD_QUIESCE as
success.

Document that modules should return EOPNOTSUPP for unknown events.


# 131398 01-Jul-2004 jhb

Trim a few things from the dmesg output and stick them under bootverbose to
cut down on the clutter including PCI interrupt routing, MTRR, pcibios,
etc.

Discussed with: USENIX Cabal


# 129364 17-May-2004 peter

Since we go to the trouble of compiling the kobj ops table for each class,
and cannot handle it going away, add an explicit reference to the kobj
class inside each linker class. Without this, a class with no modules
loaded will sit with an idle refcount of 0. Loading and unloading
a module with it causes a 0->1->0 transition which frees the ops table
and causes subsequent loads using that class to explode. Normally, the
"kernel" module will remain forever loaded and prevent this happening, but
if you have more than one linker class active, only one owns the "kernel".

This finishes making modules work for kldload(8) on amd64.


# 128057 09-Apr-2004 peadar

Plug minor memory leak of module_t structures when unloading a file
from the kernel.

Reviewed By: Doug Rabson (dfr@)


# 126253 25-Feb-2004 truckman

Split the mlock() kernel code into two parts, mlock(), which unpacks
the syscall arguments and does the suser() permission check, and
kern_mlock(), which does the resource limit checking and calls
vm_map_wire(). Split munlock() in a similar way.

Enable the RLIMIT_MEMLOCK checking code in kern_mlock().

Replace calls to vslock() and vsunlock() in the sysctl code with
calls to kern_mlock() and kern_munlock() so that the sysctl code
will obey the wired memory limits.

Nuke the vslock() and vsunlock() implementations, which are no
longer used.

Add a member to struct sysctl_req to track the amount of memory
that is wired to handle the request.

Modify sysctl_wire_old_buffer() to return an error if its call to
kern_mlock() fails. Only wire the minimum of the length specified
in the sysctl request and the length specified in its argument list.
It is recommended that sysctl handlers that use sysctl_wire_old_buffer()
should specify reasonable estimates for the amount of data they
want to return so that only the minimum amount of memory is wired
no matter what length has been specified by the request.

Modify the callers of sysctl_wire_old_buffer() to look for the
error return.

Modify sysctl_old_user to obey the wired buffer length and clean up
its implementation.

Reviewed by: bms


# 120382 23-Sep-2003 fjoe

Avoid NULL pointer dereferencing in modlist_lookup2().

PR: 56570
Submitted by: Thomas Wintergerst <Thomas.Wintergerst@nord-com.net>


# 118094 27-Jul-2003 phk

Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.


# 116182 10-Jun-2003 obrien

Use __FBSDID().


# 111852 03-Mar-2003 ru

FreeBSD 5.0 has stopped shipping /modules 2.5 years ago. Catch
up with this further by excluding /modules from the (default)
kern.module_path.


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 109605 21-Jan-2003 jake

Resolve relative relocations in klds before trying to parse the module's
metadata. This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.

Submitted by: Hartmut Brandt <brandt@fokus.gmd.de>
PR: 46732
Tested on: alpha (obrien), i386, sparc64


# 107855 14-Dec-2002 alfred

unwrap lines made short enough by SCARGS removal


# 107849 13-Dec-2002 alfred

SCARGS removal take II.


# 107839 13-Dec-2002 alfred

Backout removal SCARGS, the code freeze is only "selectively" over.


# 107838 13-Dec-2002 alfred

Remove SCARGS.

Reviewed by: md5


# 107089 19-Nov-2002 rwatson

Merge kld access control checks from the MAC tree: these access control
checks permit policy modules to augment the system policy for permitting
kld operations. This permits policies to limit access to kld operations
based on credential (and other) properties, as well as to perform checks
on the kld being loaded (integrity, etc).

Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 105337 17-Oct-2002 sam

fix kldload error return when a module is rejected because it's statically
linked in the kernel. When this condition is detected deep in the linker
internals the EEXIST error code that's returned is stomped on and instead
an ENOEXEC code is returned. This makes apps like sysinstall bitch.


# 105167 15-Oct-2002 phk

Plug a memory-leak.

"I think you're right" by: jake


# 101941 15-Aug-2002 rwatson

In order to better support flexible and extensible access control,
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:

- Change fo_read() and fo_write() to accept "active_cred" instead of
"cred", and change the semantics of consumers of fo_read() and
fo_write() to pass the active credential of the thread requesting
an operation rather than the cached file cred. The cached file
cred is still available in fo_read() and fo_write() consumers
via fp->f_cred. These changes largely in sys_generic.c.

For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:

- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
pipe_read/write() now authorize MAC using active_cred rather
than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
VOP_READ/WRITE() with fp->f_cred

Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred. Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not. If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.

Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.

These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.

Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 101241 02-Aug-2002 mux

Make the consumers of the linker_load_file() function use
linker_load_module() instead.

This fixes a bug where the kernel was unable to properly locate and
load a kernel module in vfs_mount() (and probably in the netgraph
code as well since it was using the same function). This is because
the linker_load_file() does not properly search the module path.

Problem found by: peter
Reviewed by: peter
Thanks to: peter


# 100488 22-Jul-2002 truckman

Pre-wire the output buffer so that sysctl_kern_function_list() doesn't
block in SYSCTL_OUT() while holding a lock.


# 99553 07-Jul-2002 jeff

- Delay unlocking a vnode in linker_hints_lookup until we're actually done
with it.
- Remove a now stale comment about improper vnode locking.


# 98452 19-Jun-2002 arr

- Remove the lock(9) protecting the kernel linker system.
- Added a mutex, kld_mtx, to protect the kernel_linker system. Note that
while ``classes'' is global (to that file), it is only read only after
SI_SUB_KLD, SI_ORDER_ANY.
- Add a SYSINIT to flip a flag that disallows class registration after
SI_SUB_KLD, SI_ORDER_ANY.

Idea for ``classes'' read only by: jake
Reviewed by: jake


# 95488 26-Apr-2002 brian

Test if rootvnode is NULL rather than if rootdev is NODEV when determining
if there's a filesystem present.

rootdev can be NODEV in the NFS-mounted root scenario.

Discussed with: Harti Brandt <brandt@fokus.gmd.de>, iedowse


# 94322 09-Apr-2002 brian

In linker_load_module(), check that rootdev != NODEV before calling
linker_search_module().

Without this, modules loaded from loader.conf that then try to load
in additional modules (such as digi.ko loading a card's BIOS) die
badly in the vn_open() called from linker_search_module().

It may be worth checking (KASSERTing?) that rootdev != NODEV in
vn_open() too.


# 94321 09-Apr-2002 brian

Change linker_reference_module() so that it's passed a struct
mod_depend * (which may be NULL). The only consumer of this
function at the moment is digi_loadmoduledata(), and that passes
a NULL mod_depend *.

In linker_reference_module(), check to see if we've already got
the required module loaded. If we have, bump the reference count
and return that, otherwise continue the module search as normal.


# 93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 93159 25-Mar-2002 arr

- Recommit the securelevel_gt() calls removed by commits rev. 1.84 of
kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the
source of the recent panics occuring around kldloading file system
support modules.

Requested by: rwatson


# 92927 22-Mar-2002 arr

- Back out the commit to make the linker_load_file() securelevel check
made aware in jail environments. Supposedly something is broken, so
this should be backed out until further investigation proves otherwise,
or a proper fix can be provided.


# 92884 21-Mar-2002 arr

- Fix a logic error in checking the securelevel that was introduced in the
previous commit.

Pointy hats to: arr, rwatson


# 92803 20-Mar-2002 arr

- Change a check of securelevel to securelevel_gt() call in order to help
against users within a jail attempting to load kernel modules.
- Add a check of securelevel_gt() to vfs_mount() in order to chop some
low hanging fruit for the repair of securelevel checking of linking and
unlinking files from within jails. There is more to be done here.

Reviewed by: rwatson


# 92705 19-Mar-2002 arr

- Change a malloc / bzero pair to make use of the M_ZERO malloc(9) flag.


# 92547 18-Mar-2002 arr

- Lock down the ``module'' structure by adding an SX lock that is used by
all the global bits of ``module'' data. This commit adds a few generic
macros, MOD_SLOCK, MOD_XLOCK, etc., that are meant to be used as ways
of accessing the SX lock. It is also the first step in helping to lock
down the kernel linker and module systems.

Reviewed by: jhb, jake, smp@


# 92032 10-Mar-2002 dwmalone

Don't assign strcmp to a variable called err and then compare it
with zero, just compare strcmp with zero. This fixes the same bug
which Maxim just fixed and fixes some odd style too.

PR: 35712
Reviewed by: arr


# 92016 10-Mar-2002 sobomax

Fix a breakage introduced in rev.1.75 (supposedly style cleanup), which results
in "missing dependencies" error when loading some kld modules. It is sad to
see how often these days style cleanus break doesn't broken things. Perhaps
people should recall good old principle: "don't fix it if it isn't broken".


# 91406 27-Feb-2002 jhb

Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.


# 91068 22-Feb-2002 arr

- Whitespace fixes leftover from previous commit.

Submitted by: bde


# 91040 22-Feb-2002 arr

- Massive style fixup.

Reviewed by: mike
Approved by: dfr


# 90483 10-Feb-2002 rwatson

Add a comment indicating that the vnode locking in this section of the
kernel linker code may be wrong: it fails to hold a lock across the
call to VOP_GETATTR(), and vn_rdwr() with IO_NODELOCKED.


# 86553 18-Nov-2001 arr

- Ensure that linker file id's are unique, rather than blindly
incrementing the value.

Reviewed by: dfr, peter


# 86469 16-Nov-2001 iedowse

Fix a number of misspellings of "dependency" and "dependencies" in
comments and function names.

PR: kern/8589
Submitted by: Rajesh Vaidheeswarran <rv@fore.com>


# 85844 01-Nov-2001 rwatson

o Move suser() calls in kern/ to using suser_xxx() with an explicit
credential selection, rather than reference via a thread or process
pointer. This is part of a gradual migration to suser() accepting
a struct ucred instead of a struct proc, simplifying the reference
and locking semantics of suser().

Obtained from: TrustedBSD Project


# 85736 30-Oct-2001 green

Add the sysctl "kern.function_list", which currently exports all
function symbols in the kernel in a list of C strings, with an extra
nul-termination at the end.

This sysctl requires addition of a new linker operation. Now,
linker_file_t's need to respond to "each_function_name" to export
their function symbols.

Note that the sysctl doesn't currently allow distinguishing multiple
symbols with the same name from different modules, but could quite
easily without a change to the linker operation. This will be a nicety
to have when it can be used.

Obtained from: NAI Labs CBOSS project
Funded by: DARPA


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 83358 11-Sep-2001 peter

Fix the kern.module_path issue that required the trailing '/' character
on each module path component. Fix a one-byte buffer overflow at the
same time that got highlighted in the process.


# 83321 10-Sep-2001 peter

Implement the long-awaited module->file cache database. A userland
tool (kldxref(8)) keeps a cache of what modules and versions are inside
what .ko files. I have tested this on both Alpha and i386.

Submitted by: bp


# 82749 01-Sep-2001 dillon

Giant Pushdown. Saved the worst P4 tree breakage for last.

reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit()
setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp()
getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid()
setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid()
setresuid() setresgid() getresuid() getresgid () __setugid() getlogin()
setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload()
kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize()
dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf()
flock()


# 81937 19-Aug-2001 dd

Sync the default module search path with the one in
sys/boot/common/module.c.

PR: 21405
Submitted by: Makoto MATSUSHITA <matusita@jp.FreeBSD.org>


# 80701 31-Jul-2001 jake

Don't try to print a field that doesn't exist; in usually commented
out debugging code.


# 78501 20-Jun-2001 des

Constify (silence warnings introduced by last commit to sys/module.h)


# 78413 18-Jun-2001 brian

Add linker_reference_module().

This function loads a module if required, otherwise bumps the reference
count -- the opposite of linker_file_unload().


# 78161 13-Jun-2001 peter

With this commit, I hereby pronounce gensetdefs past its use-by date.

Replace the a.out emulation of 'struct linker_set' with something
a little more flexible. <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set. They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated. This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by: eivind


# 77843 06-Jun-2001 peter

Make the TUNABLE_*() macros look and behave more consistantly like the
SYSCTL_*() macros. TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.


# 74642 22-Mar-2001 bp

o Actually extract version of interface and store it along with the name.

o Add new parameter to the modlist_lookup() function to perform lookups
with strict version matching.

o Collapse duplicate code to function(s).


# 74641 22-Mar-2001 bp

Slightly reorganize code in the linker_load_dependancies() function to make
codepath more straightforward.


# 74639 22-Mar-2001 bp

Remove support for old way of handling module dependencies.

Approved by: peter


# 72012 04-Feb-2001 phk

Another round of the <sys/queue.h> FOREACH transmogriffer.

Created with: sed(1)
Reviewed by: md5(1)


# 71999 04-Feb-2001 phk

Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)


# 70417 28-Dec-2000 peter

Pull out the module path from the loader. ie: if you boot from
/boot/kernel.foobar/* then that had better be in the path ahead of the
others.

Submitted by: Daniel J. O'Connor <darius@dons.net.au>
PR: 23662


# 69781 08-Dec-2000 dwmalone

Convert more malloc+bzero to malloc+M_ZERO.

Submitted by: josh@zipperup.org
Submitted by: Robert Drehmel <robd@gmx.net>


# 66630 04-Oct-2000 dfr

Add a workaround for statically linked kernels.


# 65507 06-Sep-2000 obrien

The kernel is now known as `kernel.ko' and it and its matching modules
live in ``/boot/kernel/''.

Submitted by: Hisashi Hiramoto <hiramoto@phys.chs.nihon-u.ac.jp>


# 64143 02-Aug-2000 peter

Fix self referential dependencies. eg: uhub was packaged along with
usb, all in usb.ko. uhub depends on usb. The bug was that the preload
processing only adds a module to the list once it's internal dependencies
are resolved... Since it was not "seeing" the internal usb module it
believed that uhub had a missing dependency.


# 62860 09-Jul-2000 bp

Correct SYSINIT execution order in the case when KLD contains more
than one SYSINIT with the same 'subsystem' id and different 'order' id.

Reviewed by: peter


# 62550 04-Jul-2000 mckusick

Move the truncation code out of vn_open and into the open system call
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.

Obtained from: BSD/OS


# 62261 29-Jun-2000 archie

Move the securelevel check before loading KLD's into linker_load_file(),
instead of requiring every caller of linker_load_file() to perform the
check itself. This avoids netgraph loading KLD's when securelevel > 0,
not to mention any future code that may call linker_load_file().

Reviewed by: dfr


# 60938 26-May-2000 jake

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 60833 23-May-2000 jake

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 59794 30-Apr-2000 phk

Remove unneeded #include <vm/vm_zone.h>

Generated by: src/tools/tools/kerninclude


# 59751 29-Apr-2000 peter

First round implementation of a fine grain enhanced module to module
version dependency system. This isn't quite finished, but it is at a
useful stage to do a functional checkpoint.

Highlights:
- version and dependency metadata is gathered via linker sets, so things
are handled the same for static kernels and code built to live in a kld.
- The dependencies are at module level (versus at file level).
- Dependencies determine kld symbol search order - this means that you
cannot link against symbols in another file unless you depend on it. This
is so that you cannot accidently unload the target out from underneath
the ones referencing it.
- It is flexible enough that we can put tags in #include files and macros
so that we can get decent hooks for enforcing recompiles on incompatable
ABI changes. eg: if we change struct proc, we could force a recompile
for all kld's that reference the proc struct.
- Tangled dependency references at boot time are sorted. Files are
relocated once all their dependencies are already relocated.

Caveats:
- Loader support is incomplete, but has been worked on seperately.
- Actual enforcement of the version number tags is not active yet - just
the module dependencies are live. The actual structure of versioning
hasn't been agreed on yet. (eg: major.minor, or whatever)
- There is some backwards compatability for old modules without metadata
but I'm not sure how good it is.

This is based on work originally done by Boris Popov (bp@freebsd.org),
but I'm not sure he'd recognize much of it now. Don't blame him. :-)
Also, ideas have been borrowed from Mike Smith.


# 59603 24-Apr-2000 dfr

* Rewrite to use kobj(9) instead of hard-coded function tables.
* Report link errors to stdout with uprintf() so that the user can see
what went wrong (PR kern/9214).
* Add support code to allow module symbols to be loaded into GDB using
the debugger's "sharedlibrary" command.


# 54655 15-Dec-1999 eivind

Introduce NDFREE (and remove VOP_ABORTOP)


# 54411 10-Dec-1999 peter

Zap c_index() and c_rindex(). Bruce prefers these to implicitly convert
a const into a non-const as they do in libc. I feel that defeating the
type checking like that quite evil, but that's the way it is.


# 53492 21-Nov-1999 peter

Tempt fate and stop index from converting a const char * into a char *.
I've made a seperate version (c_index() etc) that use const/const, but
I'm not sure it's worth it considering there is one file in the tree
that uses index on const strings (kern_linker.c) and it's easily adjusted
to scan the strings directly (and is perhaps more efficient that way).


# 52128 11-Oct-1999 peter

Trim unused options (or #ifdef for undoc options).

Submitted by: phk


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 50272 23-Aug-1999 bde

Cast pointers to uintptr_t instead of casting them to u_long. They
are still converted to u_long by assignment of the uintptr_t, and
address calculations are still done using u_long. This is OK for
currently supported machines, but addresses should be represented
by vm_offset_t or uintptr_t in case pointers are longer than longs.

"Fixed" size of linker_path[]. MAXPATHLEN + 1 was 1 too large for
search paths with only one file path in them, but much too small
for search paths with several long file paths in them.


# 50068 19-Aug-1999 grog

Change the name of the static variable 'files' to 'linker_files' in
order to be able to refer to it uniquely from the kernel debugger.

Approved-by: peter


# 48391 01-Jul-1999 peter

Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface. kproc_start is still available
as a SYSINIT() hook. This allowed simplification of chunks of the
sysinit code in the process. This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process. It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit. This means that nfsiod
doesn't need to be in /sbin and is always "available". This is a fair bit
easier to do outside of the SYSINIT_KT() framework.


# 48377 30-Jun-1999 peter

Slight tweak to fork1() calling conventions. Add a third argument so
the caller can easily find the child proc struct. fork(), rfork() etc
syscalls set p->p_retval[] themselves. Simplify the SYSINIT_KT() code
and other kernel thread creators to not need to use pfind() to find the
child based on the pid. While here, partly tidy up some of the fork1()
code for RF_SIGSHARE etc.


# 46693 08-May-1999 peter

First stages of a module dependency cleanup. This part fixes a
particularly annoying hack, namely having the linker bash the moduledata
to set the container pointer, preventing it being const. In the process,
a stack of warnings were fixed and will probably allow a revisit of the
const C_SYSINIT() changes. This explicitly registers modules in files or
preload areas with the module system first, and let them initialize via
SYSINIT/DECLARE_MODULE later in their SI_ORDER_xxx order. The kludge of
finding the containing file is no longer needed since the registration
of modules onto the modules list is done in the context of initializing
the linker file.


# 46129 27-Apr-1999 luoqi

Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
is added to point to per-cpu pages, per-cpu global variables are now
accessed through this new selector (%fs). The selectors in gdt table are
rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by: Alan Cox <alc@cs.rice.edu>
John Dyson <dyson@iquest.net>
Julian Elischer <julian@whistel.com>
Bruce Evans <bde@zeta.org.au>
David Greenman <dg@root.com>


# 46112 27-Apr-1999 phk

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 45356 06-Apr-1999 peter

LK_RETRY is a vn_lock() flag, not one for lockmgr().


# 44549 07-Mar-1999 dfr

* Register sysctl nodes before running sysinits when loading files and
unregister them after sysuninits when unloading.
* Add code to vfs_register() to set the oid number of vfs sysctls to
the type number of the filesystem.

Reviewed by: bde


# 44173 20-Feb-1999 dfr

A correction to the code which attempts to prevent the same module
being loaded twice. It used rindex() to strip the pathname but failed
to account for the fact that rindex() will return a pointer to the '/',
not the first character of the filename.

Submitted by: Nick Hibma <hibma@skylink.it>


# 44078 16-Feb-1999 dfr

* Change sysctl from using linker_set to construct its tree using SLISTs.
This makes it possible to change the sysctl tree at runtime.

* Change KLD to find and register any sysctl nodes contained in the loaded
file and to unregister them when the file is unloaded.

Reviewed by: Archie Cobbs <archie@whistle.com>,
Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)


# 43309 27-Jan-1999 dillon

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile.

This commit includes significant work to proper handle const arguments
for the DDB symbol routines.


# 43301 27-Jan-1999 dillon

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile


# 43185 25-Jan-1999 dfr

Don't try to call SYSUNINIT functions if there was a link error.

Reviewed by: Peter Wemm <peter@netplex.com.au>


# 43084 23-Jan-1999 peter

Update userref handling after discussion with submitter of previous
patch. lf can't be dereferenced after the unload attempt, in case it
was freed. Instead, decrement first and back it out if the unload failed.
This should be relatively immune to races caused by the user since the
userref count will be zero for the duration of the actual unloading and
will stop further kldunload attempts.

Submitted by: Ustimenko Semen <semen@iclub.nsu.ru>


# 42849 19-Jan-1999 peter

Relax linkage symbol scope restrictions to be more compatable with that
of shared libraries.


# 42837 19-Jan-1999 peter

Don't decrement userrefs unless the file was actually was unloaded.

Submitted by: Ustimenko Semen <semen@iclub.nsu.ru>


# 42755 17-Jan-1999 peter

Try and clean up the multiple formal loading support a bit, based on
suggestions from Greg Lehey some time ago. In the face of multiple
potential file formats, try and give a more sensible error than just
ENOEXEC.

XXX a good case can be made that the loading process is wrong - the linker
should locate the file first (using the search paths etc), then run the
loaders to see if they recognize it. While the present system allows for
the possibility of different search paths for different formats, we do not
use it and it just makes things more complicated than they need to be.


# 42316 05-Jan-1999 msmith

Don't allow more than one module with the same name to be loaded.
Make kldfind ignore the path when searching for a loaded module.

Submitted by: John Birrell (jb@freebsd.org)


# 41090 11-Nov-1998 peter

kldsym(2) prototype implementation


# 41055 10-Nov-1998 peter

Arrange for unload-time linker set hooks to be called. While cut/pasting
some code, I changed the original to be consistant with the rest of the
file rather than duplicating the problems.


# 40961 06-Nov-1998 peter

Define the kld_debug variable if KLD_DEBUG is enabled


# 40906 04-Nov-1998 peter

The handle for the kernel is common. With this fix, ELF kernels can load
a.out kld modules, and a.out kernels can load ELF kld modules.


# 40861 03-Nov-1998 peter

Have the in-kernel linker try a default extension of .ko. This means that
"kldload nfs" works. We use the same default extension in the /boot/loader
system.


# 40859 03-Nov-1998 peter

Use the kvm space pathname that we copied in, not the one in user space.


# 40625 24-Oct-1998 msmith

Don't put 0x in front of %p, it does it already.
Submitted by: Brian Feldman <green@janus.syracuse.net>


# 40395 15-Oct-1998 peter

- bzero() after malloc(). This is especially obvious when kern_malloc is
compiled with DIAGNOSTIC.
- Don't break from the preload module processing loop prematurely.


# 40162 10-Oct-1998 peter

Display module type as well as module name when we find one preloaded.


# 40159 09-Oct-1998 peter

Use Mike Smith's linker module search path code.
Implement preloading in a fairly MI way, assuming the information is
prepared.
DDB interface helpers.. Provide some support for db_kld.c so that we
don't have to export too much detail.
Debugging and cosmetic nits left in from development..
The other half of the containing file hack so modules can associate
themselves with their "file".


# 38275 12-Aug-1998 dfr

Modify the internal interfaces to the kernel linker to make it possible
for DDB to use its symbol tables.


# 32153 01-Jan-1998 bde

Use a real malloc type for M_LINKER instead of #defining it as M_TEMP.

Fixed a comment.


# 31675 12-Dec-1997 dyson

We have had support for running the kernel daemons as threads for
quite a while, but forgot to do so. For now, this code supports
most daemons running as kernel threads in UP kernels, and as
full processes in SMP. We will soon be able to run them as
threads in SMP, but not yet.


# 31324 20-Nov-1997 bde

Fixed a sloppy common-style definitions.


# 30994 06-Nov-1997 phk

Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.


# 27845 02-Aug-1997 bde

Removed unused #includes.


# 25537 07-May-1997 dfr

This is the kernel linker. To use it, you will first need to apply
the patches in freefall:/home/dfr/ld.diffs to your ld sources and set
BINFORMAT to aoutkld when linking the kernel.

Library changes and userland utilities will appear in a later commit.