History log of /freebsd-10.0-release/sys/powerpc/
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
259065 07-Dec-2013 gjb

- Copy stable/10 (r259064) to releng/10.0 as part of the
10.0-RELEASE cycle.
- Update __FreeBSD_version [1]
- Set branch name to -RC1

[1] 10.0-CURRENT __FreeBSD_version value ended at '55', so
start releng/10.0 at '100' so the branch is started with
a value ending in zero.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

257292 28-Oct-2013 nwhitehorn

MFC r256777-256779,256788:
Add driver for POWER hypervisor interpartition ethernet.

Approved by: re (glebius)


256857 21-Oct-2013 andreast

MFC: 256555

Move the resource allocation from the ata_*_probe section to the ata_*_attach
section. This prevents a boot crash on nearly all iMacs and PowerMacs/Books.

The allocation in the probe section was working before because ata_probe was
returning 0 which did not invoke a second DEVICE_PROBE. Now it returns
a BUS_PROBE_DEFAULT which can invoke a second DEVICE_PROBE which results in
a "failed to reserve resource" exit.

PR: powerpc/182978
Approved by: re(gjb)


256283 10-Oct-2013 gjb

- Remove debugging from GENERIC* kernel configurations
- Enable MALLOC_PRODUCTION
- Default dumpdev=NO
- Remove UPDATING entry regarding debugging features
- Bump __FreeBSD_version to 1000500

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


256007 02-Oct-2013 nwhitehorn

Implement GET_STACK_USAGE() on PowerPC. This implementation is identical
to that on x86 and sparc64.

Approved by: re (kib)


255943 29-Sep-2013 nwhitehorn

Changes to CAM or busdma have caused SIMs to be sent 0-length CCBs on
occasion. This resulted in zero mapped segments, triggering an assert in
the PS3 CDROM driver. Allow no DMA for 0-length transfers.

Approved by: re (glebius)
MFC after: 1 week


255927 28-Sep-2013 nwhitehorn

Add driver for the PAPR VSCSI virtual SCSI controller. This lets FreeBSD
install directly into standard POWER LPARs, as found for example in
QEMU. The core of this device is the SCSI RDMA protocol as also found in
Infiniband. The SRP portions of the driver will be factored out and placed
/sys/cam in the future to allow them to be used for IB storage. Thanks to
Scott Long for a great deal of implementation help.

Reviewed by: scottl
Approved by: re (kib)


255910 27-Sep-2013 nwhitehorn

Rework handling of ofw_quiesce(), making it the responsibility of the
platform modules. Whether to call this function or not is highly machine
dependent: on some systems, it is required, while on others it breaks
everything. Platform modules are in a better position to figure this
out. This is required for POWER hypervisor SCSI to work correctly. There
are no functional changes on Powermac systems.

Approved by: re (kib)


255909 27-Sep-2013 nwhitehorn

Make sure that ref and changed bits propagate back to the VM layer
whenever pages are unmapped. The old code had several races that could
allow these to become stale.

Approved by: re (kib)


255904 26-Sep-2013 nwhitehorn

Allow Open Firmware syscons to attach to devices without an "address"
property such as those found on some real and emulated IBM systems. The
approach, which is taken from Linux, is to scan through the PCI bars
until we find one large enough to contain the linear framebuffer and
which is ideally prefetchable if no "address" property can be found.
This makes the graphical console work with the pSeries target in QEMU.

Approved by: re (delphij)


255903 26-Sep-2013 nwhitehorn

As idling the CPU no longer causes hangs on QEMU, reenable the pSeries
cpu_idle() routine.

Approved by: re (delphij)


255895 26-Sep-2013 nwhitehorn

Fix bug where UART unit number was not set properly, which prevented
operation on systems with multiple serial ports. Also turn on
interrupts for the UART device, which were disabled due to a
now-fixed bug in QEMU.

Approved by: re (gjb)


255887 26-Sep-2013 alc

Eliminate the declaration for a method that is no longer used. (This
change should have been a part of r255724.)

Reminded by: nathan
Approved by: re (gjb)


255786 22-Sep-2013 glebius

- Create kern.ipc.sendfile namespace, and put the new "readhead" OID
there as "kern.ipc.sendfile.readahead".
- Push all nsfbuf related tunables into MD code. Don't move them
to new namespace in favor of POLA.

Reviewed by: scottl
Approved by: re (gjb)


255724 20-Sep-2013 alc

The pmap function pmap_clear_reference() is no longer used. Remove it.

pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8. Now, that call no
longer exists.

Approved by: re (kib)
Sponsored by: EMC / Isilon Storage Division


255643 17-Sep-2013 nwhitehorn

Merge in support for PAPR-compliant (Power Architecture Platform
Requirements) systems from the projects/pseries branch. This in principle
includes all IBM POWER hardware released in the last 15 years with the
exception of POWER3-based systems when run in 64-bit mode. The main
development target, however, has been the PAPR logical partition support
that is the default target in KVM on POWER and QEMU -- mileage may vary
on actual hardware at present. Much of the heavy lifting here was done
by Andreas Tobler.

Approved by: re (kib)


255642 17-Sep-2013 nwhitehorn

Only attach if properties we need (address, in particular) are present.
This is the correct version of r255420.

Approved by: re (kib)


255640 17-Sep-2013 nwhitehorn

Add POWER7+ and POWER8 to the CPU ID table.

Approved by: re (kib)


255639 17-Sep-2013 nwhitehorn

Make sure to copy segments back to the segs array if non-NULL. This is
relied upon by bus_dmamap_load_mbuf_sg() (i.e. all network drivers).

Approved by: re (kib)
MFC after: 2 weeks


255615 16-Sep-2013 nwhitehorn

Add a loader tunable to use only device tree-provided PCI devices. This is
needed on some more fragile systems to avoid machine checks when blindly
probing the PCI bus. Also reduce ofw_pcibus's priority slightly so that it
can be overridden.

Approved by: re (gjb)


255614 16-Sep-2013 nwhitehorn

Fix bug in busdma: if segs is a preexisting buffer, we memcpy it
into the DMA map. The length of the buffer had not yet been
initialized, however, so this would copy gibberish unless it
happened to be right by chance. This bug mostly only affected
systems with IOMMUs.

Approved by: re (gjb)
MFC after: 3 days


255596 15-Sep-2013 nwhitehorn

Add a kernel interface (OF_xref_phandle()) for systems where phandles
used as cross-references in the device tree and phandles as used by the
Open Firmware client interface are in different namespaces. This include
IBM pSeries hardware as well as FDT systems. FDT certainly abuses
ihandles for this purpose and should be modified to use this API
eventually. This changes no behavior on systems where FreeBSD already
worked.

Reviewed by: marius
Approved by: re (kib)
MFC after: 2 weeks


255503 13-Sep-2013 nwhitehorn

Change VM object lock assertion to match locking higher in the call
chain. This repairs a panic observed during pageout on some 64-bit
PowerPC systems.

Submitted by: grehan
Approved by: re (kib)
MFC after: 2 weeks
Revisit after: 10.0


255421 09-Sep-2013 nwhitehorn

Revert r255420. This seems to break some Powermac systems and will be
revisited much later.

Pointy hat to: me
Approved by: re (kib, implicit due to breakage 10 minutes ago)


255420 09-Sep-2013 nwhitehorn

Attach only on hardware that is actually supported as opposed to hardware
that seems like it has some of the problems we might want.

Approved by: re (kib)


255419 09-Sep-2013 nwhitehorn

Raise artificial limits on number of CPUs and number of interrupts.

Approved by: re (kib)


255418 09-Sep-2013 nwhitehorn

Add POWER CPUs to the kernel's knowledge. This does not imply we currently
actually run on any machines with POWER CPUs but avoids closing that door
unnecessarily.

Approved by: re (kib)


255417 09-Sep-2013 nwhitehorn

Add hook called when every new processor is brought online -- including the
BSP -- so that platform modules have a chance to add the new CPU to any
internal bookkeeping.

Approved by: re (kib)


255416 09-Sep-2013 nwhitehorn

Use a spin lock instead of a mutex to gate RTAS. This is required if RTAS
calls are involved in interrupt handling.

Approved by: re (kib)


255415 09-Sep-2013 nwhitehorn

Use the canonical bits for wired, etc. in the PTE. This is important for
interactions with certain kinds of hypervisors that look into the PTEs
more closely than they should.

Approved by: re (kib)


255378 07-Sep-2013 nwhitehorn

Fix error in r252115: space for the softc needs to be allocated. This
seemed to be working by chance on most systems.


255318 06-Sep-2013 glebius

Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations
to MD headers.


255282 05-Sep-2013 nwhitehorn

Also align the 32-bit PowerPC stacks.


255273 05-Sep-2013 nwhitehorn

Align stacks of kernel threads correctly at 16-byte boundaries rather than
making sure they are all misaligned at +8 bytes. This fixes clang builds
of powerpc64 kernels (aside from a required increase in KSTACK_PAGES which
will come later).

This commit from FreeBSD/powerpc64 with a clang-built kernel.

MFC after: 2 weeks


255194 03-Sep-2013 imp

Newer versions of gcc define __INT64_C and __UINT64_C, so avoid
redefining them if gcc provides them.


255165 03-Sep-2013 jhibbits

Enable PMC interrupt handling, and fix a DTrace trap handling bug.


255164 03-Sep-2013 jhibbits

Refactor PowerPC hwpmc(4) driver into generic and specific. More refactoring
will likely be done as more drivers are added, since AIM-compatible processors
have similar PMC configuration logic.


255100 31-Aug-2013 jhibbits

Only add the backlight device if it actually exists in OF.

MFC after: 1 week


255028 29-Aug-2013 alc

Significantly reduce the cost, i.e., run time, of calls to madvise(...,
MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE. Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference(). Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2). A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).

Since our malloc(3) makes heavy use of madvise(2), this change can have a
measureable impact. For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.

Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures. I will commit implementations for the
remaining architectures after further testing. For now, a stub function is
sufficient because of the advisory nature of pmap_advise().

Discussed with: jeff, jhb, kib
Tested by: pho (i386), marcel (ia64)
Sponsored by: EMC / Isilon Storage Division


254737 23-Aug-2013 andreast

Return EIO iso -1, the kiic_transfer has an signed return.

Submitted by: Luiz Otavio O Souza <loos.br AT gmail.com>


254667 22-Aug-2013 kib

Revert r254501. Instead, reuse the type stability of the struct pmap
which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE
zone. Initialize the pmap lock in the vmspace zone init function, and
remove pmap lock initialization and destruction from pmap_pinit() and
pmap_release().

Suggested and reviewed by: alc (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation


254639 22-Aug-2013 jhibbits

Enable DTrace hooks in ppc64.


254480 18-Aug-2013 pjd

Add process descriptors support to the GENERIC kernel. It is already being
used by the tools in base systems and with sandboxing more and more tools
the usage should only increase.

Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org>
Sponsored by: Google Summer of Code 2013
MFC after: 1 month


254138 09-Aug-2013 attilio

The soft and hard busy mechanism rely on the vm object lock to work.
Unify the 2 concept into a real, minimal, sxlock where the shared
acquisition represent the soft busy and the exclusive acquisition
represent the hard busy.
The old VPO_WANTED mechanism becames the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becames an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption on the busy state unless it is also busying it.

Also:
- Add a new flag to directly shared busy pages while vm_page_alloc
and vm_page_grab are being executed. This will be very helpful
once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag

The KPI is heavilly changed this is why the version is bumped.
It is very likely that some VM ports users will need to change
their own code.

Sponsored by: EMC / Isilon storage division
Discussed with: alc
Reviewed by: jeff, kib
Tested by: gavin, bapt (older version)
Tested by: pho, scottl


254133 09-Aug-2013 avg

follow up to r254051

- update powerpc/GENERIC64 as well, suggested by mdf
- update comments so that they make sense after the change, suggested by
jhb

X-MFC after: never (change specific to head)


254051 07-Aug-2013 avg

enable KDB_TRACE in GENERICs

KDB_TRACE is not an alternative to DDB/etc, they are complementary.
So I do not see any reason to not enable KDB_TRACE by default.

X-MFC after: never (change specific to head)


254025 07-Aug-2013 jeff

Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division


253979 06-Aug-2013 jhibbits

Micro-optimize OFW syscons 8-bit blank.

MFC after: 1 week


253978 06-Aug-2013 jhibbits

Remove an unnecessary panic. The PVO's PTE entry and the PTEG's PTE entry may
not match, if the PVO's PTE is invalid.


253976 06-Aug-2013 jhibbits

Evict pages from the PTEG when it's full and trying to insert a new PTE,
rather than panicking.

Reviewed by: nwhitehorn
MFC after: 3 weeks


253918 03-Aug-2013 jhibbits

Remove duplicate definition of SPR MMCR0.

MFC after: 3 days


253845 31-Jul-2013 obrien

Back out r253779 & r253786.


253825 31-Jul-2013 jhibbits

Add the macio attachment for wi(4). Partially obtained from NetBSD.

Reviewed by: adrian
Obtained from: NetBSD (partially)


253779 29-Jul-2013 obrien

Decouple yarrow from random(4) device.

* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option.
The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.

* random(4) device doesn't really depend on rijndael-*. Yarrow, however, does.

* Add random_adaptors.[ch] which is basically a store of random_adaptor's.
random_adaptor is basically an adapter that plugs in to random(4).
random_adaptor can only be plugged in to random(4) very early in bootup.
Unplugging random_adaptor from random(4) is not supported, and is probably a
bad idea anyway, due to potential loss of entropy pools.
We currently have 3 random_adaptors:
+ yarrow
+ rdrand (ivy.c)
+ nehemeiah

* Remove platform dependent logic from probe.c, and move it into
corresponding registration routines of each random_adaptor provider.
probe.c doesn't do anything other than picking a specific random_adaptor
from a list of registered ones.

* If the kernel doesn't have any random_adaptor adapters present then the
creation of /dev/random is postponed until next random_adaptor is kldload'ed.

* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
system wide one.

Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: obrien


253750 28-Jul-2013 avg

Revert r253748,253749

This WIP should not have been committed yet.

Pointyhat to: avg


253748 28-Jul-2013 avg

put contents of cpu.h under _KERNEL

no userland-serviceable parts inside

MFC after: 20 days


253588 24-Jul-2013 jhibbits

Increase the size of the OFW bounce buffer to 4 pages. With this I can now run
'ofwdump -ap' on my quad G5.

MFC after: 9.2 branch


253367 15-Jul-2013 ae

Include sys/systm.h after sys/param.h.

Suggested by: pluknet


253351 15-Jul-2013 ae

Introduce new structure sfstat for collecting sendfile's statistics
and remove corresponding fields from struct mbstat. Use PCPU counters
and SFSTAT_INC() macro for update these statistics.

Discussed with: glebius


253272 12-Jul-2013 nwhitehorn

Fix check: bitwise and has only one &.

MFC after: 1 week


252500 02-Jul-2013 rpaulo

Fix indentation.

Submitted by: jmallet


252499 02-Jul-2013 rpaulo

Add register definitions for the Wii IPC system.


252434 01-Jul-2013 kib

Fix issues with zeroing and fetching the counters, on x86 and ppc64.
Issues were noted by Bruce Evans and are present on all architectures.

On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.

On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot. The effect would be a counter going off by
arbitrary value after zeroing. Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive. On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.

PowerPC64 is treated the same as amd64.

For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching. It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm). If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.

Noted by: bde
Sponsored by: The FreeBSD Foundation


252115 23-Jun-2013 jhibbits

Cache the Open Firmware CPU properties at attach time, so we don't always
enter it at runtime to get static data.


251900 18-Jun-2013 rpaulo

Fix a KTR_BUSDMA format string.


251356 04-Jun-2013 jhibbits

Pad the PCPU MD struct, to satisfy an assert added with the projects/counters
branch import.

PR: ports/179173,ports/179164


250884 21-May-2013 attilio

o Relax locking assertions for vm_page_find_least()
o Relax locking assertions for pmap_enter_object() and add them also
to architectures that currently don't have any
o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade
operation on the per-object rwlock
o Use all the mechanisms above to make vm_map_pmap_enter() to work
mostl of the times only with readlocks.

Sponsored by: EMC / Isilon storage division
Reviewed by: alc


250864 21-May-2013 marcel

Fix the PowerPC Book-E register definitions used by the remote GDB
protocol.

Obtained from: Juniper Networks, Inc.


250788 18-May-2013 rpaulo

Add support for the second GPIO pin bank on the Wii and add support for
shutting down the system.


250747 17-May-2013 alc

Relax the object locking assertion in pmap_enter_locked().

Reviewed by: attilio
Sponsored by: EMC / Isilon Storage Division


250544 12-May-2013 peter

Tidy up some CVS workarounds.


250338 07-May-2013 attilio

Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in
order to match the MAXCPU concept. The change should also be useful
for consolidation and consistency.

Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc


250290 05-May-2013 nwhitehorn

Only check fan type once. Not only is continuously rechecking pointless, a
single random failure can reprogram what control mechanism we try to use.

MFC after: 2 weeks


249973 27-Apr-2013 rpaulo

Add reset support to the Wii.


249965 27-Apr-2013 rpaulo

Fix the frambuffer issues by calling pmap_mapdev() in the attach routine. This
will make the framebuffer region uncacheable and it will create a proper
KVA -> RAM mapping.


249918 26-Apr-2013 jhibbits

Remove a comment that shouldn't have gone in.

X-MFC-with: r249864


249864 25-Apr-2013 jhibbits

Introduce kernel coredumps to ppc32 AIM. Leeched from the booke code.

MFC after: 2 weeks


249826 24-Apr-2013 rpaulo

Handle the IRQ for the reset button.


249718 21-Apr-2013 rpaulo

Fix an off by one calculation in wiipic_dispatch().


249390 11-Apr-2013 bz

Generate a LINT for powerpc and for powerpc64.

Discussed with: nwhitehorn


249335 10-Apr-2013 glebius

Since UMA_ZONE_PCPU zones put a constraint on sizeof(struct pcpu), declared
as CTASSERT in MI pcpu.h, stop including all possible mutually exclusive
PCPU_MD_FIELDS fields into LINT kernels, due to brekaing
aforementioned CTASSERT.


249304 09-Apr-2013 kib

Fix build for AIM 64bit.


249268 08-Apr-2013 glebius

Merge from projects/counters: counter(9).

Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.

See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.

In collaboration with: kib
Reviewed by: luigi
Tested by: ae, ray
Sponsored by: Nginx, Inc.


249265 08-Apr-2013 glebius

Merge from projects/counters:

Pad struct pcpu so that its size is denominator of PAGE_SIZE. This
is done to reduce memory waste in UMA_PCPU_ZONE zones.

Sponsored by: Nginx, Inc.


249213 06-Apr-2013 marius

- With the demise of !ATA_CAM, ATA_STATIC_ID is the only ata(4) related
option left but actually consumed by ada(4), so move it to opt_ada.h
and get rid of opt_ata.h.
- Fix stand-alone build of atacore(4) by adding opt_cam.h.
- Use __FBSDID.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.


249129 05-Apr-2013 jhibbits

Print out DSISR in a fatal DSI trap.

Sponsored by:


249083 04-Apr-2013 mav

Remove all legacy ATA code parts, not used since options ATA_CAM enabled in
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.

No objections: current@, stable@
MFC after: never


248508 19-Mar-2013 kib

Implement the concept of the unmapped VMIO buffers, i.e. buffers which
do not map the b_pages pages into buffer_map KVA. The use of the
unmapped buffers eliminate the need to perform TLB shootdown for
mapping on the buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.

The unmapped buffer should be explicitely requested by the GB_UNMAPPED
flag by the consumer. For unmapped buffer, no KVA reservation is
performed at all. The consumer might request unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
buffer cache and blocking, with the GB_KVAALLOC flag.

When the mapped buffer is requested and unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.

Unmapped buffer is translated into unmapped bio in g_vfs_strategy().
Unmapped bio carry a pointer to the vm_page_t array, offset and length
instead of the data pointer. The provider which processes the bio
should explicitely specify a readiness to accept unmapped bio,
otherwise g_down geom thread performs the transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.

The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings. Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.

Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable, disabling which makes the buffer (or cluster) creation
requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.

In the rework, filesystem metadata is not the subject to maxbufspace
limit anymore. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, is
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace is forced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not worked, because
buffer_map fragmentation does not allow the limit to be reached.

By Jeff Roberson request, the getnewbuf() function was split into
smaller single-purpose functions.

Sponsored by: The FreeBSD Foundation
Discussed with: jeff (previous version)
Tested by: pho, scottl (previous version), jhb, bf
MFC after: 2 weeks


248457 18-Mar-2013 jhibbits

Add FBT for PowerPC DTrace. Also, clean up the DTrace assembly code,
much of which is not necessary for PowerPC.

The FBT module can likely be factored into 3 separate files: common,
intel, and powerpc, rather than duplicating most of the code between
the x86 and PowerPC flavors.

All DTrace modules for PowerPC will be MFC'd together once Fasttrap is
completed.


248280 14-Mar-2013 kib

Add pmap function pmap_copy_pages(), which copies the content of the
pages around, taking array of vm_page_t both for source and
destination. Starting offsets and total transfer size are specified.

The function implements optimal algorithm for copying using the
platform-specific optimizations. For instance, on the architectures
were the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used. The code was
typically borrowed from the pmap_copy_page() for the same
architecture.

Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.

For sparc64, the existing code has known issues and a stab is added
instead, to allow the kernel linking.

Sponsored by: The FreeBSD Foundation
Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after: 2 weeks


248084 09-Mar-2013 attilio

Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
At this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.

The KPI results heavilly broken by this commit. Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho


247463 28-Feb-2013 mav

MFcalloutng:
Switch eventtimers(9) from using struct bintime to sbintime_t.
Even before this not a single driver really supported full dynamic range of
struct bintime even in theory, not speaking about practical inexpediency.
This change legitimates the status quo and cleans up the code.


247454 28-Feb-2013 davide

MFcalloutng:
When CPU becomes idle, cpu_idleclock() calculates time to the next timer
event in order to reprogram hw timer. Return that time in sbintime_t to
the caller and pass it to acpi_cpu_idle(), where it can be used as one
more factor (quite precise) to extimate furter sleep time and choose
optimal sleep state. This is a preparatory change for further callout
improvements will be committed in the next days.

The commmit is not targeted for MFC.


247400 27-Feb-2013 attilio

Merge from vmobj-rwlock:
VM_OBJECT_LOCKED() macro is only used to implement a custom version
of lock assertions right now (which likely spread out thanks to
copy and paste).
Remove it and implement actual assertions.

Sponsored by: EMC / Isilon storage division
Reviewed by: alc
Tested by: pho


247360 26-Feb-2013 attilio

Merge from vmc-playground branch:
Replace the sub-optimal uma_zone_set_obj() primitive with more modern
uma_zone_reserve_kva(). The new primitive reserves before hand
the necessary KVA space to cater the zone allocations and allocates pages
with ALLOC_NOOBJ. More specifically:
- uma_zone_reserve_kva() does not need an object to cater the backend
allocator.
- uma_zone_reserve_kva() can cater M_WAITOK requests, in order to
serve zones which need to do uma_prealloc() too.
- When possible, uma_zone_reserve_kva() uses directly the direct-mapping
by uma_small_alloc() rather than relying on the KVA / offset
combination.

The removal of the object attribute allows 2 further changes:
1) _vm_object_allocate() becomes static within vm_object.c
2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by
direct calls to mtx_init() as there is no need to export it anymore
and the calls aren't either homogeneous anymore: there are now small
differences between arguments passed to mtx_init().

Sponsored by: EMC / Isilon storage division
Reviewed by: alc (which also offered almost all the comments)
Tested by: pho, jhb, davide


247297 26-Feb-2013 attilio

Merge from vmobj-rwlock branch:
Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h.

Sponsored by: EMC / Isilon storage division
Tested by: pho
Reviewed by: alc


247153 22-Feb-2013 alc

Eliminate an unused #define.


246732 13-Feb-2013 rpaulo

Introduce PLATFORMMETHOD_END and use it.


246713 12-Feb-2013 kib

Reform the busdma API so that new types may be added without modifying
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.

The cam changes unify the bus_dmamap_load* handling in cam drivers.

The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.

Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)


246655 11-Feb-2013 rpaulo

Use DEVMETHOD_END.


245003 03-Jan-2013 kib

Enable the UFS quotas for big-iron GENERIC kernels.

Discussed with: mckusick
MFC after: 2 weeks


244992 03-Jan-2013 des

As discussed on -current last October, remove the firewire drivers from
GENERIC.


244179 13-Dec-2012 rpaulo

Add the common FreeBSD SVN properties.


243882 05-Dec-2012 glebius

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


243370 21-Nov-2012 adrian

Setup BAT0 and BAT1 on the Wii.

This is the missing piece for FreeBSD/Wii, but there's still a lot of
work ahead. We have to reset the MMU in locore before continuing
the boot process because we don't know how the boot loaders might
have setup the BATs. We also disable the PCI BAT because there's no PCI
bus on the Wii.

Thanks to Nathan Whitehorn and Peter Grenhan for their help.

Submitted by: Margarida Gouveia


243040 14-Nov-2012 kib

Flip the semantic of M_NOWAIT to only require the allocation to not
sleep, and perform the page allocations with VM_ALLOC_SYSTEM
class. Previously, the allocation was also allowed to completely drain
the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT
request class for vm_page_alloc() and similar functions.

Allow the caller of malloc* to request the 'deep drain' semantic by
providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT
class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM
allocation class.

Centralize the translation of the M_* malloc(9) flags in the single
inline function malloc2vm_flags().

Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com>
Reviewed by: alc, mdf (previous version)
Tested by: pho (previous version)
MFC after: 2 weeks


242904 12-Nov-2012 rpaulo

Allow this file to be used in LOCORE sections of the kernel.


242737 08-Nov-2012 jhibbits

Add DTrace to 32-bit PowerPC GENERIC now.

MFC after: 1 month


242723 07-Nov-2012 jhibbits

Implement DTrace for PowerPC. This includes both 32-bit and 64-bit.

There is one known issue: Some probes will display an error message along the
lines of: "Invalid address (0)"

I tested this with both a simple dtrace probe and dtruss on a few different
binaries on 32-bit. I only compiled 64-bit, did not run it, but I don't expect
problems without the modules loaded. Volunteers are welcome.

MFC after: 1 month


242535 03-Nov-2012 alc

Replace all uses of the page queues lock by a R/W lock that is private
to this pmap.

Eliminate two redundant #include's.

Tested by: marcel


242534 03-Nov-2012 attilio

Rework the known rwlock to benefit about staying on their own
cache line in order to avoid manual frobbing but using
struct rwlock_padalign.

Reviewed by: alc, jimharris


242526 03-Nov-2012 marcel

1. Have the APs initialize the TLB1 entries from what has been
programmed on the BSP during (early) boot. This makes sure
that the APs get configured the same as the BSP, irrspective
of how FreeBSD was loaded.
2. Make sure to flush the dcache after writing the TLB1 entries
to the boot page. The APs aren't part of the coherency domain
just yet.
3. Set pmap_bootstrapped after calling pmap_bootstrap(). The
FDT code now maps the devices (like OF), and this resulted
in a panic.
4. Since we pre-wire the CCSR, make sure not to map chunks of
it in pmap_mapdev().


242315 29-Oct-2012 nwhitehorn

Work around broken device tree on last-generation PowerPC iMacs
(PowerMac12,1), which have a mac-io MPIC cell that indifies itself
as the root PIC despite the actual root PIC being on the northbridge.
No CPC945 systems have a mac-io PIC that does anything so just don't
attach on CPC945 (U4) systems.

MFC after: 3 days


241920 23-Oct-2012 rpaulo

Remove compat options.

Submitted by: netchild


241861 22-Oct-2012 rpaulo

Fix the memory regions to include the 64MB DDR3 memory slot.


241860 22-Oct-2012 rpaulo

Increase the I/O memory area to 0xc20000.


241821 21-Oct-2012 rpaulo

Fix the top comment.


241820 21-Oct-2012 rpaulo

Add a config file for the Wii.


241794 21-Oct-2012 rpaulo

Add "options WII".


241020 28-Sep-2012 alc

Eliminate a stale comment. It describes another use case for the pmap in
Mach that doesn't exist in FreeBSD.


240860 23-Sep-2012 nwhitehorn

Move the prototype for savectx from cpu.h to pcb.h, as it is on other
platforms, as well as putting it in an #ifdef KERNEL block.

MFC after: 2 weeks


240793 21-Sep-2012 andreast

Remove leftover from r215163.


240680 18-Sep-2012 gavin

Align the PCI Express #defines with the style used for the PCI-X
#defines. This also has the advantage that it makes the names more
compact, iand also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with: jhb
MFC after: 1 week


240244 08-Sep-2012 attilio

userret() already checks for td_locks when INVARIANTS is enabled, so
there is no need to check if Giant is acquired after it.

Reviewed by: kib
MFC after: 1 week


240224 08-Sep-2012 rpaulo

Add IRQ support to the resource list handling functions.


239682 25-Aug-2012 rpaulo

Unbreak tinderbox.


239666 25-Aug-2012 rpaulo

Set mdp only under #ifdef WII.


239548 21-Aug-2012 jhibbits

phandle_t is unsigned, and OF_finddevice() returns (phandle_t)(-1) on
failure, so check for that instead of 0.

While here, provide a better description for ATI backlight driver.

Reported by: jchandra@
MFC after: 3 days


239480 21-Aug-2012 adrian

On Nintendo Wii CPUs, the mdp value will be garbage. Set it to NULL
so as to not confuse things.

Submitted by: Margarida Gouveia


239479 21-Aug-2012 adrian

Don't probe the openfirmware framebuffer if the system is a Wii or it
will crash.

Submitted by: Margarida Gouveia


239478 21-Aug-2012 adrian

Initial support for running FreeBSD on the Nintendo Wii. We're able to
reach single user mode using a memory disk device as the file system.

This port includes the framebuffer driver, the PIC driver, a platform
driver and the GPIO driver. The IPC driver (to talk to IOS kernels) is
not yet written but there's a placeholder for it.

There are still some MMU problems and to get a working system you need to
patch locore32.S. Since we haven't found the best way yet to address that
problem, we're not committing those changes yet. The problem is related to
the different BAT layout on the Wii and to the fact that the Homebrew
loader doesn't clean up the special registers (including the 8 BATs)
before passing control to us.

You'll need a Wii with Homebrew loader and a TV that can do NTSC (for now).

Submitted by: Margarida Gouveia


239402 19-Aug-2012 andreast

Add the ds1631 temperature driver.


239027 04-Aug-2012 jhibbits

Add backlight support for nVidia-based PowerBooks/iBooks/iMacs.

Approved by: nwhitehorn (mentor)
MFC after: 9.1-RELEASE


239008 03-Aug-2012 jhb

Improve the handling of static DMA buffers that use non-default memory
attributes (currently just BUS_DMA_NOCACHE):
- Don't call pmap_change_attr() on the returned address, instead use
kmem_alloc_contig() to ask the VM system for memory with the requested
attribute.
- As a result, always use kmem_alloc_contig() for non-default memory
attributes, even for sub-page allocations. This requires adjusting
bus_dmamem_free()'s logic for determining which free routine to use.
- For x86, add a new dummy bus_dmamap that is used for static DMA
buffers allocated via kmem_alloc_contig(). bus_dmamem_free() can then
use the map pointer to determine which free routine to use.
- For powerpc, add a new flag to the allocated map (bus_dmamem_alloc()
always creates a real map on powerpc) to indicate which free routine
should be used.

Note that the BUS_DMA_NOCACHE handling in powerpc is currently #ifdef'd out.
I have left it disabled but updated it to match x86.

Reviewed by: scottl
MFC after: 1 month


238357 10-Jul-2012 alc

Avoid recursion on the pvh global lock in the aim oea pmap.

Correct the return type of the pmap_ts_referenced() implementations.

Reported by: jhibbits [1]
Tested by: andreast


238159 06-Jul-2012 alc

Replace all uses of the vm page queues lock by a r/w lock that is private
to this pmap.

Tested by: andreast, jhibbits


238045 03-Jul-2012 marcel

Support lbc interrupts:
o Save and clear the LTESR register in the interrupt handler.
o In lbc_read_reg(), return the saved LTESR register value if applicable
(i.e. when the saved value is not invalid (read: ~0U)).
o In lbc_write_reg(), clear the bits in the saved register when when it's
written to and when the asved value is not invalid.
o Also in lbc_write_reg(), the LTESR register is unlocked (in H/W) when
bit 1 of LTEATR is cleared. We use this to invalidate our saved LTESR
register value. Subsequent reads and write go to H/W directly.

While here:
o In lbc_read_reg() & lbc_write_reg(), add some belts and suspenders to
catch when register offsets are out of range.
o In lbc_attach(), initialize completely and don't leave something left
for lbc_banks_enable().


238042 02-Jul-2012 marcel

Properly implement the bus_config_intr, bus_setup_intr and bus_teardown_intr
methods so that MI drvers can depend on us doing the right thing instead
of having to go around us and call MD code directly. See the FDT code for
example (not for long though).


238034 02-Jul-2012 marcel

Remove device uart_z8530 and options GEOM_PART_APM from DEFAULTS and instead
add them to GENERIC and GENERIC64. They are applicable to Apple H/W and not
at all for Book-E platforms.


238033 02-Jul-2012 marcel

Fix a typo that resulted in or-ing PTE_UW twice whrn PTE_SW was needed.
Note that setting the PTE_MODIFIED bit based on whether write is possible
is incorrect. We should set PTE_MODIFIED based on whether the access
is a write operation.


238032 02-Jul-2012 marcel

Handle traps from the debugger. We need to catch them and re-enter
the debugger where they're being taken care of.


238031 02-Jul-2012 marcel

Invalidate any TLB1 entries we don't need. The firmware (e.g. U-Boot)
may have added entries that conflict with TLB0 entries.


238030 02-Jul-2012 marcel

Implement cpu_flush_dcache(). This allows us to optimize __syncicache()
for the common case in chich D-caches are coherent by virtue of busdma.


237936 01-Jul-2012 rpaulo

Remove bogus __unused attribute from hrowpic_eoi().


237878 01-Jul-2012 ken

Now that the mps(4) driver is endian-safe, add it to the powerpc 32-bit
GENERIC config file.

MFC after: 3 days
Reqested by: nwhitehorn


237818 29-Jun-2012 joel

Reduce diffs between GENERIC and GENERIC64. Also fix a few whitespace nits
while I'm here. No functional change.


237737 29-Jun-2012 rpaulo

The `end' symbol doesn't match the end of the kernel image because it's
relative to the start address (unless the start address is 0, which is
not the case).
This is currently not a problem because all powerpc architectures are
using loader(8) which passes metadata to the kernel including the
correct `endkernel' address. If we don't use loader(8), register 4
and 5 will have the size of the kernel ELF file, not its end address.
We fix that simply by adding `kernel_text' to `end' to compute
`endkernel'.

Discussed with: nathanw


237730 28-Jun-2012 ken

Now that the mps(4) driver is endian-safe, add it to the powerpc and
sparc64 GENERIC config files.

MFC after: 3 days


237517 24-Jun-2012 andrew

Make the wchar_t type machine dependent.

This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.

Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.

Discussed with: bde


237433 22-Jun-2012 kib

Implement mechanism to export some kernel timekeeping data to
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.

The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.

The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.

Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.

Minimal stubs neccessary for non-x86 architectures to still compile
are provided.

Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month


237430 22-Jun-2012 kib

Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to
timekeeping information.

MFC after: 1 week


237168 16-Jun-2012 alc

The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap
layer, but it is read directly by the MI VM layer. This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.

Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.

As an added bonus, tidy up some nearby comments concerning page flags.

Reviewed by: kib
MFC after: 6 weeks


236325 30-May-2012 raj

Panic openly if we cannot retrieve memory information from the device tree.
This is a critical condition and can lead to all sorts of misterious hangs if
not handled.

Obtained from: Semihalf
Also reported by: thompsa


236324 30-May-2012 raj

Extract vendor specific Book-E pieces into separate files and have a common
skeleton (maybe we should kobj-tize this one day).

Note the PPC4xx bit is not connected to the build yet.

Obtained from: AppliedMicro, Semihalf.


236142 27-May-2012 raj

Remove redundant check, we catch ULE platform support in common
sys/kern/sched_ule.c


236141 27-May-2012 raj

Let us manage differences of Book-E PowerPC variations i.e. vendor /
implementation specific vs. the common architecture definition.

Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.

This change set is not supposed to affect existing E500 support, it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.

Obtained from: AppliedMicro, Freescale, Semihalf


236121 26-May-2012 raj

Import eSDHC driver for Freescale integrated controller.

Obtained from: Freescale, Semihalf
Written by: Michal Dubiel


236119 26-May-2012 raj

Move OpenPIC FDT bus glue to a shared location, so that other PowerPC
platforms can use it, not only MPC85XX.

This is just reorg, no functional changes.


236098 26-May-2012 raj

Retrieve CPU number info from the device tree.

Obtained from: Freescale, Semihalf.


236097 26-May-2012 raj

Rename e500 prefix to match other Book-E CPU variations. CPU id tidbits for
the new cores.

Obtained from: Freescale, Semihalf.


236095 26-May-2012 raj

Provide SPR definitions for newer Book-E (E500mc, E5500, PPC465).

Obtained from: Freescale, Semihalf.


236094 26-May-2012 raj

Unify SPR defines formatting, no funtional changes.


236025 25-May-2012 raj

Update HID defines for E500mc and E5500 CPU cores.

Obtained from: Freescale, Semihalf


236019 25-May-2012 raj

Fix physical address type to vm_paddr_t also for powerpc64.


236000 25-May-2012 raj

Missing vm_paddr_t bits which should have been part of r235936.


235946 24-May-2012 bz

Add a missing " to get closer to compiling.


235943 24-May-2012 nwhitehorn

Atomic operation acquire barriers also need to be isync on 64-bit systems.


235942 24-May-2012 marcel

Revert isync for ILP32 to sync as per my original change that I discussed
with Nathan. Leave __ATOMIC_ACQ as an isync as per Nathan.


235941 24-May-2012 bz

MFp4 bz_ipv6_fast:

in_cksum.h required ip.h to be included for struct ip. To be
able to use some general checksum functions like in_addword()
in a non-IPv4 context, limit the (also exported to user space)
IPv4 specific functions to the times, when the ip.h header is
present and IPVERSION is defined (to 4).

We should consider more general checksum (updating) functions
to also allow easier incremental checksum updates in the L3/4
stack and firewalls, as well as ponder further requirements by
certain NIC drivers needing slightly different pseudo values
in offloading cases. Thinking in terms of a better "library".

Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems

Reviewed by: gnn (as part of the whole)
MFC After: 3 days


235937 24-May-2012 marcel

A few improvements:
1. Define all registers. These definitions are needed to support
the FCM driver for direct-connect NAND.
2. Repurpose lbc_read_reg() and lbc_write_reg() for use by localbus
attached device drivers. Use bus_space functions directly in the
lbc driver itself.
3. Be smarter about programming LAWs and mapping memory. The ranges
defined in the FDT are per bank (= chip select) and since we can
have up to 8 banks, we could easily use more than 8 LAWs or TLB
enrties when per-bank memory ranges need multiple LAWs or TLBs
due to alignment or size constraints.
We now combine all memory ranges into the fewest possible set of
contiguous regions and program the hardware for that. Thus, a
cleverly written FDT with 8 devices may still only need 1 LAW or
1 TLB entry. Note that the memory ranges can be assigned randomly
to the banks. We sort as we build to handle that.
4. Support the FCM when programming the OR register. This is mostly
for documention purposes as we do not have a way to define the
mode for a bank.
5. Remove Semihalf-ism: do not define DEBUG (only to undefine it
again).


235936 24-May-2012 raj

Fix physical address type to vm_paddr_t.


235935 24-May-2012 marcel

Remove Semihakf-ism. DEBUG is a kernel configuration option. It
should not be defined in source files.


235934 24-May-2012 marcel

Just return if the size of the window is 0. This can happen when the
FDT does not define all ranges possible for a particular node (e.g.
PCI).
While here, only update the trgt_mem and trgt_io pointers if there's
no error. This avoids that we knowingly write an invalid target (= -1).


235933 24-May-2012 marcel

Either the I/O port range or the memory mapped I/O range may not be
defined in the FDT. The range will have a zero size in that case.


235932 24-May-2012 marcel

o Rename kernload_ap to bp_kernelload. This to introduce a common prefix
for variables that live in the boot page.
o Add bp_trace (yes, it's in the boot page) that gets zeroed before we
try to wake a core and to which the core being woken can write markers
so that we know where the core was in case it doesn't wake up. The
boot code does not yet write markers (too follow).
o Disable the boot page translation to allow the last 4K page to be used
for whatever we please. It would get mapped otherwise.
o Fix kernstart in the case of SMP. The start argument is typically page
aligned due to the alignment requirements that come with having a boot
page. The point of using trunc_page is that we get the actual load
address given that the entry point is immediately following the ELF
headers. In the SMP case this ended up exactly 4K after the load
address. Hence subtracting 1 from start.


235931 24-May-2012 marcel

Fix the memory barriers for CPUs that do not like lwsync and wedge or cause
exceptions early enough during boot that the kernel will do ithe same.
Use lwsync only when compiling for LP64 and revert to the more proven isync
when compiling for ILP32. Note that in the end (i.e. between revision 222198
and this change) ILP32 changed from using sync to using isync. As per Nathan
the isync is needed to make sure I/O accesses are properly serialized with
locks and isync tends to be more effecient than sync.

While here, undefine __ATOMIC_ACQ and __ATOMIC_REL at the end of the file
so as not to leak their definitions.

Discussed with: nwhitehorn


235689 20-May-2012 nwhitehorn

Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies
range operations like pmap_remove() and pmap_protect() as well as allowing
simple operations like pmap_extract() not to involve any global state.
This substantially reduces lock coverages for the global table lock and
improves concurrency.


235013 04-May-2012 nwhitehorn

Fix final bugs in memory barriers on PowerPC:
- Use isync/lwsync unconditionally for acquire/release. Use of isync
guarantees a complete memory barrier, which is important for serialization
of bus space accesses with mutexes on multi-processor systems.
- Go back to using sync as the I/O memory barrier, which solves the same
problem as above with respect to mutex release using lwsync, while not
penalizing non-I/O operations like a return to sync on the atomic release
operations would.
- Place an acquisition barrier around thread lock acquisition in
cpu_switchin().


234785 29-Apr-2012 dim

Add a convenience macro for the returns_twice attribute, and apply it to
the prototypes of the appropriate functions (getcontext, savectx,
setjmp, sigsetjmp and vfork).

MFC after: 2 weeks


234760 28-Apr-2012 nwhitehorn

Fix build on 32-bit systems.


234745 28-Apr-2012 nwhitehorn

After switching mutexes to use lwsync, they no longer provide sufficient
guarantees on acquire for the tlbie mutex. Conversely, the TLB invalidation
sequence provides guarantees that do not need to be redundantly applied on
release. Roll a small custom lock that is just right. Simultaneously,
convert the SLB tree changes back to lwsync, as changing them to sync
was a misdiagnosis of the tlbie barrier problem this commit actually fixes.


234653 24-Apr-2012 nwhitehorn

Switch the default I/O memory barrier to eieio, as it should be. This
does not appear to cause any problems due to fixes elsewhere.

MFC after: 2 months


234652 24-Apr-2012 nwhitehorn

Revert r234581 for this file. The lockless SLB tree code does in fact need
a heavyweight sync instead of a lightweight sync to function properly.
Thanks to mdf for the clarification.


234615 23-Apr-2012 nwhitehorn

Fix copy-and-paste error in r230400.

MFC after: 3 days


234609 23-Apr-2012 nwhitehorn

Fix missing header for powerpc_iomb().

Pointy hat to: me


234590 22-Apr-2012 nwhitehorn

Provide a clearer split between read/write and acquire/release barriers.
This should really, actually be correct now.


234589 22-Apr-2012 nwhitehorn

Correctly specify assembler constrains for synchronization instructions.

MFC after: 3 days


234584 22-Apr-2012 nwhitehorn

Clarify what we are doing in r234583 a little better: eieio and isync do
not provide general barriers, but only barriers in the context of the
atomic sequences here. As such, make them private and keep the global
*mb() routines using a variant of sync.


234583 22-Apr-2012 nwhitehorn

On non-64-bit systems (which generally don't have lwsync), use eieio and
isync to implement read and write barriers, following Appendix B.2 of
Book II of the architecture manual. This provides a 25% speed increase
to fork() on the PowerPC G4.


234581 22-Apr-2012 nwhitehorn

Use lwsync to provide memory barriers on systems that support it instead
of sync (lwsync is an alternate encoding of sync on systems that do not
support it, providing graceful fallback). This provides more than an order
of magnitude reduction in the time required to acquire or release a mutex.

MFC after: 2 months


234580 22-Apr-2012 nwhitehorn

Remove dead code. The routines in atomic.S did not work properly anyway, and
were everywhere unused. If we turn out to need them, they should be
reimplemented.

MFC after: 2 weeks


234579 22-Apr-2012 nwhitehorn

Replace eieio; sync for creating bus-space memory barriers with sync.
sync performs a strict superset of the functions of eieio, so using both
is redundant. While here, expand bus barriers to all bus_space operations,
since many drivers do not correctly use bus_space_barrier().

In principle, we can also replace sync just with eieio, for a significant
performance increase, but it remains to be seen whether any poorly-written
drivers currently depend on the side effects of sync to properly function.

MFC after: 1 week


234576 22-Apr-2012 nwhitehorn

Avoid a lock order reversal in pmap_extract_and_hold() from relocking
the page. This PMAP requires an additional lock besides the PMAP lock
in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release.

Suggested by: kib
MFC after: 4 days


234542 21-Apr-2012 nwhitehorn

Organize some members of ucontext_t in the same order they are in the
trap frame. These are usually not used, and so this changes very little.

MFC after: 5 days


234517 20-Apr-2012 nwhitehorn

Make sure all pending operations have completed on the existing thread
before (potentially) migrating it to a different CPU.

MFC after: 5 days


234156 11-Apr-2012 nwhitehorn

We don't need kcopy() in any of the remaining places it is used, so
remove it.

MFC after: 2 weeks


234155 11-Apr-2012 nwhitehorn

Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy
for whether the page is physical. On dense phys mem systems (32-bit),
VM_PHYS_TO_PAGE will not return NULL for device memory pages if device
memory is above physical memory even if there is no allocated vm_page.
Attempting to use the returned page could then cause either memory
corruption or a page fault.


234149 11-Apr-2012 nwhitehorn

Fix error in r233949. Synchronizing icaches on uncacheable pages turns out
not to be a good idea, and of course the PV entry list for a page is never
empty after the page has been mapped.


234115 11-Apr-2012 nwhitehorn

Do not restore the register holding the TLS pointer when doing various
usermode context switches (long jumps and ucontext operations). If these
are used across threads, multiple threads can end up with the same TLS base.
Madness will then result.

This makes behavior on PPC match that on x86 systems and on Linux.

MFC after: 10 days


233964 06-Apr-2012 nwhitehorn

Execute an initial ptesync if and only if the PTE is actually being
invalidated, as opposed to a ref/changed bit update.


233957 06-Apr-2012 nwhitehorn

Substantially reduce the scope of the locks held in pmap_enter(), which
improves concurrency slightly.


233949 06-Apr-2012 nwhitehorn

Reduce the frequency that the PowerPC/AIM pmaps invalidate instruction
caches, by invalidating kernel icaches only when needed and not flushing
user caches for shared pages.

Suggested by: kib
MFC after: 2 weeks


233948 06-Apr-2012 nwhitehorn

Give the kernel pmap lock a different name than user pmap locks. It has
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.

MFC after: 5 days


233671 29-Mar-2012 jhb

- Rename VM_MEMATTR_UNCACHED to VM_MEMATTR_WEAK_UNCACHEABLE on x86 to
be less ambiguous and more clearly identify what it means. This
attribute is what Intel refers to as UC-, and it's only difference
relative to normal UC memory is that a WC MTRR will override a UC-
PAT entry causing the memory to be treated as WC, whereas a UC PAT
entry will always override the MTRR.
- Remove the VM_MEMATTR_UNCACHED alias from powerpc.


233635 29-Mar-2012 nwhitehorn

Allow multiple inclusion of trap.h. This has always been broken, but
until recently never caused problems.


233628 28-Mar-2012 fabient

Add software PMC support.

New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).

Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.

Sponsored by: NETASQ
MFC after: 1 month


233618 28-Mar-2012 nwhitehorn

More PMAP performance improvements: skip 256 MB segments entirely if they
are are not mapped during ranged operations and reduce the scope of the
tlbie lock only to the actual tlbie instruction instead of the entire
sequence. There are a few more optimization possibilities here as well.


233530 27-Mar-2012 nwhitehorn

Make sure to call vm_page_dirty() before the pmap lock is released to
prevent a race where another process could conclude the page was clean.

Submitted by: alc


233529 27-Mar-2012 nwhitehorn

More PMAP concurrency improvements: replace the table lock and (almost) all
uses of the page queues mutex with a new rwlock that protects the page
table and the PV lists. This reduces system time during a parallel
buildworld by 35%.

Reviewed by: alc


233454 25-Mar-2012 nwhitehorn

More PMAP performance improvements: on powerpc64, when TLBIE can be run
with exceptions enabled, leave them enabled and use a regular mutex to
guard TLB invalidations instead of a spinlock.


233436 24-Mar-2012 nwhitehorn

Only call vm_page_dirty() on pages that are writable in order not to
confuse the VM.


233434 24-Mar-2012 nwhitehorn

Following suggestions from alc, skip wired mappings in pmap_remove_pages()
and remove moea64_attr_*() in favor of direct calls to vm_page_dirty()
and friends.


233271 21-Mar-2012 ed

Remove pty(4) from our kernel configurations.

As of FreeBSD 8, this driver should not be used. Applications that use
posix_openpt(2) and openpty(3) use the pts(4) that is built into the
kernel unconditionally. If it turns out high profile depend on the
pty(4) module anyway, I'd rather get those fixed. So please report any
issues to me.

The pty(4) module is still available as a kernel module of course, so a
simple `kldload pty' can be used to run old-style pseudo-terminals.


233188 19-Mar-2012 andreast

Provide a fix for certain PowerMacs where the U3 i2c lacks the interrupt
info.

Tested by: Robert Hish
MFC after: 1 week


233117 18-Mar-2012 nwhitehorn

Remove acquisition of VM page queues lock from pmap_protect(). Any actual
manipulation of the pvo_vlink and pvo_olink entries is already protected
by the table lock, so most remaining instances of the acquisition of the
page queues lock can likely be replaced with the table lock, or removed
if the table lock is already held.

Reviewed by: alc


233018 15-Mar-2012 nwhitehorn

Make ofw_bus_get_node() consistently return -1 when there is no associated
OF node, instead of a random mixture of 0 and -1. Update all checks for 0
to check for -1 instead.

MFC after: 4 weeks


233017 15-Mar-2012 nwhitehorn

Implement pmap_remove_pages(). This will be added later to the 32-bit MMU
module.

Suggested by: alc


233011 15-Mar-2012 nwhitehorn

Improve algorithm for deciding whether to loop through all process pages
or look them up individually in pmap_remove() and apply the same logic
in the other ranged operation (pmap_protect). This speeds up make
installworld by a factor of 2 on powerpc64.

MFC after: 1 week


232980 14-Mar-2012 nwhitehorn

Use LIST_FOREACH_SAFE() instead of LIST_FOREACH() in pmap_remove(), since
the point of this loop is to remove elements. This worked by accident before.

MFC after: 2 days


232745 09-Mar-2012 dim

Add casts to __uint16_t to the __bswap16() macros on all arches which
didn't already have them. This is because the ternary expression will
return int, due to the Usual Arithmetic Conversions. Such casts are not
needed for the 32 and 64 bit variants.

While here, add additional parentheses around the x86 variant, to
protect against unintended consequences.

MFC after: 2 weeks


232619 06-Mar-2012 attilio

Disable the option VFS_ALLOW_NONMPSAFE by default on all the supported
platforms.
This will make every attempt to mount a non-mpsafe filesystem to the
kernel forbidden, unless it is expressely compiled with
VFS_ALLOW_NONMPSAFE option.

This patch is part of the effort of killing non-MPSAFE filesystems
from the tree.

No MFC is expected for this patch.


232488 04-Mar-2012 andreast

Restore proper dot symbol creation for assembly files in the kernel build case.
Without this patch we were not able to see the assembly function.
Only the function descriptor was visible.

- Distinguish between user-land and kernel when creating the ENTRY() point of
assembly source.
- Make the ENTRY() macro more readable, replace the .align directive with the
gas platform independant .p2align directive.
- Create an END()macro for later use to provide traceback tables on powerpc64.


232482 04-Mar-2012 andreast

Add support for PWM controlled fans. I found these fans on my PowerMac9,1.
These fans are not located under the same node as the the RPM controlled ones,
So I had to adapt the current source to parse and fill the properties correctly.
To control the fans we can set the PWM ratio via sysctl between 20 and 100%.

Tested by: nwhitehorn
MFC after: 3 weeks


232403 02-Mar-2012 jhb

- Add a bus_dma tag to each PCI bus that is a child of a Host-PCI bridge.
The tag enforces a single restriction that all DMA transactions must not
cross a 4GB boundary. Note that while this restriction technically only
applies to PCI-express, this change applies it to all PCI devices as it
is simpler to implement that way and errs on the side of caution.
- Add a softc structure for PCI bus devices to hold the bus_dma tag and
a new pci_attach_common() routine that performs actions common to the
attach phase of all PCI bus drivers. Right now this only consists of
a bootverbose printf and the allocate of a bus_dma tag if necessary.
- Adjust all PCI bus drivers to allocate a PCI bus softc and to call
pci_attach_common() from their attach routines.

MFC after: 2 weeks


232356 01-Mar-2012 jhb

- Change contigmalloc() to use the vm_paddr_t type instead of an unsigned
long for specifying a boundary constraint.
- Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary
constraints.

These allow boundary constraints to be fully expressed for cases where
sizeof(bus_addr_t) != sizeof(bus_size_t). Specifically, it allows a
driver to properly specify a 4GB boundary in a PAE kernel.

Note that this cannot be safely MFC'd without a lot of compat shims due
to KBI changes, so I do not intend to merge it.

Reviewed by: scottl


232177 26-Feb-2012 jhibbits

Add backlight control to ATI-graphics PowerBooks and iBooks.

Approved by: nwhitehorn (mentor)
MFC after: 1 week


231908 19-Feb-2012 andreast

Enable the new PCI-PCI bridge driver by default.
Tested on 32- and 64-bit PowerMac.


231770 15-Feb-2012 nwhitehorn

Improve error handling in smusat(4).

MFC after: 4 days


231149 07-Feb-2012 nwhitehorn

The bus resource adjustment API is not meant to work on active resources.
Return an error if a driver attempts this, and, if INVARIANTS is on,
panic.

Reviewed by: jhb


231046 05-Feb-2012 nwhitehorn

Inherit from PCI bridge driver instead of manually specifying all of its
methods.

Obtained from: sparc64
MFC after: 1 week


231044 05-Feb-2012 andreast

Replace the assembler macro WEAK_ALIAS with a new macro WEAK_REFERENCE which
has the same API as __weak_reference(). Give 'x' in SYS.h a more meaningful
name.

Tested on 32- and 64-bit PowerMac.

Reviewed by: bde


231026 05-Feb-2012 nwhitehorn

Make sure to remap adjusted resources.


231019 05-Feb-2012 andreast

Revert the _NOPROF entries on cpu_throw, cpu_switch and savectx. They can be
profiled too now.

MFC after: 2 weeks


231003 05-Feb-2012 nwhitehorn

Add support for bus_adjust_resource() on all PowerPC/AIM PCI bridges. With
this change, NEW_PCIB appears to work without incident at least on a G5
iMac. More testing will be required before it is turned on in GENERIC.


230999 04-Feb-2012 nwhitehorn

Compatibility with IBM firmware.


230994 04-Feb-2012 nwhitehorn

Miffed r230993 due to a one-character typo while reviewing the patch.


230993 04-Feb-2012 nwhitehorn

Unify OF PCI infrastructure, including changing from parsing the device
tree based on heuristics to parsing it based on the spec. This should also
lay the foundation for NEW_PCIB on PowerPC.

MFC after: 3 months


230992 04-Feb-2012 nwhitehorn

Avoid warnings about duplicate modules.

MFC after: 2 weeks


230779 30-Jan-2012 kib

Fix build for the case of powerpc64 kernel without COMPAT_FREEBSD32.

MFC after: 2 months


230767 30-Jan-2012 kib

Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit
and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported,
and some variants of powerpc32.

MFC after: 2 months (if ever)


230475 23-Jan-2012 das

Add C11 macros describing subnormal numbers to float.h.

Reviewed by: bde


230400 20-Jan-2012 andreast

This commit adds profiling support for powerpc64. Now we can do application
profiling and kernel profiling. To enable kernel profiling one has to build
kgmon(8). I will enable the build once I managed to build and test powerpc
(32-bit) kernels with profiling support.

- add a powerpc64 PROF_PROLOGUE for _mcount.
- add macros to avoid adding the PROF_PROLOGUE in certain assembly entries.
- apply these macros where needed.
- add size information to the MCOUNT function.

MFC after: 3 weeks, together with r230291


230398 20-Jan-2012 nwhitehorn

Prevent an error resulting from signed/unsigned comparison on systems
that do not comply with the OF spec.

Submitted by: Anders Gavare
MFC after: 1 week


230366 20-Jan-2012 das

Add parentheses where required. Without them, `sizeof LDBL_MAX'
is a syntax error and shouldn't be, while `1 FLT_ROUNDS' isn't a
syntax error and should be. Thanks to bde for the examples.


230229 16-Jan-2012 das

Fix the value of float_t to match what is implied by FLT_EVAL_METHOD.


230228 16-Jan-2012 das

Change the definition of FLT_EVAL_METHOD from 1 to 0. A value of 1 implies
that the compiler promotes floats to double precision in computations, but
inspection of the output of a cross-compiler indicates that this isn't the
case on powerpc.


230144 15-Jan-2012 nwhitehorn

Pick a constant high IRQ value for the PS3 IPI, which lets PS3 devices be
usefully loaded and unloaded as modules.

Submitted by: geoffrey dot levand at mail dot ru
MFC after: 2 weeks


230139 15-Jan-2012 nwhitehorn

Now that we can tolerate LPAR context switches on the PS3 hypervisor, going
to hypervisor-idle on both threads will not hang the kernel.


230123 15-Jan-2012 nwhitehorn

Rework SLB trap handling so that double-faults into an SLB trap handler are
possible, and double faults within an SLB trap handler are not. The result
is that it possible to take an SLB fault at any time, on any address, for
any reason, at any point in the kernel.

This lets us do two important things. First, it removes the (soft) 16 GB RAM
ceiling on PPC64 as well as any architectural limitations on KVA space.
Second, it lets the kernel tolerate poorly designed hypervisors that
have a tendency to fail to restore the SLB properly after a hypervisor
context switch.

MFC after: 6 weeks


230035 12-Jan-2012 jhibbits

Add PWM monitoring sysctl to G4 MDD (Windtunnel) fan driver. While there, clean
up some style nits.

Approved by: nwhitehorn (mentor)
MFC after: 3 days


229967 11-Jan-2012 nwhitehorn

Add a memory barrier to bus_dmamap_sync(), as should have always been
present. We need a sync instead of eieio, as eieio does not enforce storage
ordering between main and device memory.


229660 05-Jan-2012 andreast

Fix build on powerpc64 too. The same as r229640.


229640 05-Jan-2012 adrian

Fix build.


229494 04-Jan-2012 andreast

Introduce internal macros for __U/INT64_C to define the U/INT64_MAX/MIN
values properly. The previous definition only worked if __STDC_LIMIT_MACROS
and __STDC_CONSTANT_MACROS were defined at the same time.


228973 29-Dec-2011 rwatson

Add "options CAPABILITY_MODE" and "options CAPABILITIES" to GENERIC kernel
configurations for various architectures in FreeBSD 10.x. This allows
basic Capsicum functionality to be used in the default FreeBSD
configuration on non-embedded architectures; process descriptors are not
yet enabled by default.

MFC after: 3 months
Sponsored by: Google, Inc


228869 24-Dec-2011 jhibbits

Implement hwpmc counting PMC support for PowerPC G4+ (MPC745x/MPC744x).
Sampling is in progress.

Approved by: nwhitehorn (mentor)
MFC after: 9.0-RELEASE


228689 18-Dec-2011 nwhitehorn

Support infrastructure for X11 on PS3.

Submitted by: geoffrey dot levand at mail dot ru
MFC after: 1 week


228688 18-Dec-2011 nwhitehorn

Add version header to output file.


228631 17-Dec-2011 avg

kern cons: introduce infrastructure for console grabbing by kernel

At the moment grab and ungrab methods of all console drivers are no-ops.

Current intended meaning of the calls is that the kernel takes control of
console input. In the future the semantics may be extended to mean that
the calling thread takes full ownership of the console (e.g. console
output from other threads could be suspended).

Inspired by: bde
MFC after: 2 months


228609 17-Dec-2011 nwhitehorn

Allow this to work on embedded systems without Open Firmware by making
lack of a /chosen non-fatal, and manually removing memory in use by the
kernel from the physical memory map.

Submitted by: rpaulo


228605 16-Dec-2011 nwhitehorn

Zero BSS on start, in case the ELF loader that started the kernel did not
do this for us. This can happen on some embedded systems.

Submitted by: rpaulo


228522 15-Dec-2011 alc

Eliminate vestiges of page coloring.


228483 14-Dec-2011 hselasky

Implement better support for USB controller suspend and resume.

This patch should remove the need for kldunload of USB
controller drivers at suspend and kldload of USB controller
drivers at resume.

This patch also fixes some build issues in avr32dci.c

MFC after: 2 weeks


228469 13-Dec-2011 ed

Replace __signed by signed.

The signed keyword is an integral part of the C syntax. There's no need
to use __signed.


228413 11-Dec-2011 nwhitehorn

Increase the available virtual address space for user programs on PowerPC
AIM systems to 4 GB on 32-bit systems and 2^64 bytes on 64-bit systems.
VM_MAXUSER_ADDRESS remains at 2 GB on pending Book-E, pending review of
an increase to 3 GB by those more familiar with Book-E.


228412 11-Dec-2011 nwhitehorn

Keep track of PVO entries in each pmap, which allows much faster
pmap_remove() for large sparse requests. This can prevent pmap_remove()
operations on 64-bit process destruction or swapout that would take
several hundred times the lifetime of the universe to complete. This
behavior is largely indistinguishable from a hang.


228277 05-Dec-2011 jhibbits

Fix style(9) issues from r228270.

Approved by: nwhitehorn (mentor)


228270 05-Dec-2011 jhibbits

Add a devd notification for closing/opening the lid on PowerBooks and iBooks.

Approved by: nwhitehorn (mentor)


228201 02-Dec-2011 jchandra

Fix OF_finddevice error return value in case of FDT.

According to the open firmware standard, finddevice call has to return
a phandle with value of -1 in case of error.

This commit is to:
- Fix the FDT implementation of this interface (ofw_fdt_finddevice) to
return (phandle_t)-1 in case of error, instead of 0 as it does now.
- Fix up the callers of OF_finddevice() to compare the return value with
-1 instead of 0 to check for errors.
- Since phandle_t is unsigned, the return value of OF_finddevice should
be checked with '== -1' rather than '<= 0' or '> 0', fix up these cases
as well.

Reported by: nwhitehorn

Reviewed by: raj
Approved by: raj, nwhitehorn


227843 22-Nov-2011 marius

- There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.


227779 21-Nov-2011 nwhitehorn

The PPC IRQ layer assumes that the IPI IRQ is the last IRQ on the PIC.
This assumption is invalid and the code should be fixed, but humor it for
now and set the "IPI" for PS3s in the non-SMP case to a large number. This
fixes boot with a non-SMP kernel.

Submitted by: geoffrey dot levand at mail dot ru
MFC after: 1 week


227628 17-Nov-2011 nwhitehorn

Use a global __pure2 function instead of a global register variable for
curthread, like on x86 and sparc64. This makes the kernel somewhat more
clang friendly, which doesn't support global register variables.


227627 17-Nov-2011 nwhitehorn

Add an extra invariant here which was useful on 64-bit CPUs.


227568 16-Nov-2011 alc

Refactor the code that performs physically contiguous memory allocation,
yielding a new public interface, vm_page_alloc_contig(). This new function
addresses some of the limitations of the current interfaces, contigmalloc()
and kmem_alloc_contig(). For example, the physically contiguous memory that
is allocated with those interfaces can only be allocated to the kernel vm
object and must be mapped into the kernel virtual address space. It also
provides functionality that vm_phys_alloc_contig() doesn't, such as wiring
the returned pages. Moreover, unlike that function, it respects the low
water marks on the paging queues and wakes up the page daemon when
necessary. That said, at present, this new function can't be applied to all
types of vm objects. However, that restriction will be eliminated in the
coming weeks.

From a design standpoint, this change also addresses an inconsistency
between vm_phys_alloc_contig() and the other vm_phys_alloc*() functions.
Specifically, vm_phys_alloc_contig() manipulated vm_page fields that other
functions in vm/vm_phys.c didn't. Moreover, vm_phys_alloc_contig() knew
about vnodes and reservations. Now, vm_page_alloc_contig() is responsible
for these things.

Reviewed by: kib
Discussed with: jhb


227537 15-Nov-2011 marius

As it turns out, r186347 actually is insufficient to avoid the use of the
curthread-accessing part of mtx_{,un}lock(9) when using a r210623-style
curthread implementation on sparc64, crashing the kernel in its early
cycles as PCPU isn't set up, yet (and can't be set up as OFW is one of the
things we need for that, which leads to a chicken-and-egg problem). What
happens is that due to the fact that the idea of r210623 actually is to
allow the compiler to cache invocations of curthread, it factors out
obtaining curthread needed for both mtx_lock(9) and mtx_unlock(9) to
before the branch based on kobj_mutex_inited when compiling the kernel
without the debugging options. So change kobj_class_compile_static(9)
to just never acquire kobj_mtx, effectively restricting it to its
documented use, and add a kobj_init_static(9) for initializing objects
using a class compiled with the former and that also avoids using mutex(9)
(and malloc(9)). Also assert in both of these functions that they are
used in their intended way only.
While at it, inline kobj_register_method() and kobj_unregister_method()
as there wasn't much point for factoring them out in the first place
and so that a reader of the code has to figure out the locking for
fewer functions missing a KOBJ_ASSERT.
Tested on powerpc{,64} by andreast.

Reviewed by: nwhitehorn (earlier version), jhb
MFC after: 3 days


227386 09-Nov-2011 nwhitehorn

Fix a bug where the pmap_cpu_bootstrap() ap argument could be clobbered.
Luckily, it mostly wasn't important, so this didn't cause major problems.
Also improve register reuse when setting up trap frames very slightly.

Submitted by: Justin Hibbits <chmeeedalf at gmail dot com>
MFC after: 5 days


227333 08-Nov-2011 attilio

Introduce the option VFS_ALLOW_NONMPSAFE and turn it on by default on
all the architectures.
The option allows to mount non-MPSAFE filesystem. Without it, the
kernel will refuse to mount a non-MPSAFE filesytem.

This patch is part of the effort of killing non-MPSAFE filesystems
from the tree.

No MFC is expected for this patch.

Tested by: gianni
Reviewed by: kib


227309 07-Nov-2011 ed

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


227293 07-Nov-2011 ed

Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

This means that their use is restricted to a single C file.


226835 27-Oct-2011 kensmith

Adjust the debugger options slightly. This should help me do the right
thing when changing the debugging options as part of head becoming a new
stable branch. It may also help people who for one reason or another want
to run head but don't want it slowed down by the debugging support.

Reviewed by: kib


226607 21-Oct-2011 das

People porting FreeBSD to new architectures ought not have to
implement a deprecated FPU control interface in addition to the
standard one. To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.

Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware. Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.


226547 19-Oct-2011 kensmith

Add a warning about why sbp(4) is commented out so that curious folks
are forewarned they might wind up with a hole in their foot if they
decide to give it a try.

Suggested by: dougb


226410 15-Oct-2011 nwhitehorn

Enforce a memory barrier in stream operations, as is done on other
bus_space calls. This makes ath(4) work correctly on PowerPC.

Submitted by: adrian
Tested by: andreast
MFC after: 3 days


226112 07-Oct-2011 kib

Remove unused define.

MFC after: 1 month


225953 03-Oct-2011 mav

Revert r225875, r225877:
It is reported that on some chips (e.g. the 970MP) behavior of POW bit set
simultaneously with modifying other bits is undefined and may cause hangs.
The race should be handled in some other way, but for now just get back.

Reported by: nwitehorn


225950 03-Oct-2011 ken

Add descriptor sense support to CAM, and honor sense residuals properly in
CAM.

Desriptor sense is a new sense data format that originated in SPC-3. Among
other things, it allows for an 8-byte info field, which is necessary to
pass back block numbers larger than 4 bytes.

This change adds a number of new functions to scsi_all.c (and therefore
libcam) that abstract out most access to sense data.

This includes a bump of CAM_VERSION, because the CCB ABI has changed.
Userland programs that use the CAM pass(4) driver will need to be
recompiled.

camcontrol.c: Change uses of scsi_extract_sense() to use
scsi_extract_sense_len().

Use scsi_get_sks() instead of accessing sense key specific
data directly.

scsi_modes: Update the control mode page to the latest version (SPC-4).

scsi_cmds.c,
scsi_target.c: Change references to struct scsi_sense_data to struct
scsi_sense_data_fixed. This should be changed to allow the
user to specify fixed or descriptor sense, and then use
scsi_set_sense_data() to build the sense data.

ps3cdrom.c: Use scsi_set_sense_data() instead of setting sense data
manually.

cam_periph.c: Use scsi_extract_sense_len() instead of using
scsi_extract_sense() or accessing sense data directly.

cam_ccb.h: Bump the CAM_VERSION from 0x15 to 0x16. The change of
struct scsi_sense_data from 32 to 252 bytes changes the
size of struct ccb_scsiio, but not the size of union ccb.
So the version must be bumped to prevent structure
mis-matches.

scsi_all.h: Lots of updated SCSI sense data and other structures.

Add function prototypes for the new sense data functions.

Take out the inline implementation of scsi_extract_sense().
It is now too large to put in a header file.

Add macros to calculate whether fields are present and
filled in fixed and descriptor sense data

scsi_all.c: In scsi_op_desc(), allow the user to pass in NULL inquiry
data, and we'll assume a direct access device in that case.

Changed the SCSI RESERVED sense key name and description
to COMPLETED, as it is now defined in the spec.

Change the error recovery action for a number of read errors
to prevent lots of retries when the drive has said that the
block isn't accessible. This speeds up reconstruction of
the block by any RAID software running on top of the drive
(e.g. ZFS).

In scsi_sense_desc(), allow for invalid sense key numbers.
This allows calling this routine without checking the input
values first.

Change scsi_error_action() to use scsi_extract_sense_len(),
and handle things when invalid asc/ascq values are
encountered.

Add a new routine, scsi_desc_iterate(), that will call the
supplied function for every descriptor in descriptor format
sense data.

Add scsi_set_sense_data(), and scsi_set_sense_data_va(),
which build descriptor and fixed format sense data. They
currently default to fixed format sense data.

Add a number of scsi_get_*() functions, which get different
types of sense data fields from either fixed or descriptor
format sense data, if the data is present.

Add a number of scsi_*_sbuf() functions, which print
formatted versions of various sense data fields. These
functions work for either fixed or descriptor sense.

Add a number of scsi_sense_*_sbuf() functions, which have a
standard calling interface and print the indicated field.
These functions take descriptors only.

Add scsi_sense_desc_sbuf(), which will print a formatted
version of the given sense descriptor.

Pull out a majority of the scsi_sense_sbuf() function and
put it into scsi_sense_only_sbuf(). This allows callers
that don't use struct ccb_scsiio to easily utilize the
printing routines. Revamp that function to handle
descriptor sense and use the new sense fetching and
printing routines.

Move scsi_extract_sense() into scsi_all.c, and implement it
in terms of the new function, scsi_extract_sense_len().
The _len() version takes a length (which should be the
sense length - residual) and can indicate which fields are
present and valid in the sense data.

Add a couple of new scsi_get_*() routines to get the sense
key, asc, and ascq only.

mly.c: Rename struct scsi_sense_data to struct
scsi_sense_data_fixed.

sbp_targ.c: Use the new sense fetching routines to get sense data
instead of accessing it directly.

sbp.c: Change the firewire/SCSI sense data transformation code to
use struct scsi_sense_data_fixed instead of struct
scsi_sense_data. This should be changed later to use
scsi_set_sense_data().

ciss.c: Calculate the sense residual properly. Use
scsi_get_sense_key() to fetch the sense key.

mps_sas.c,
mpt_cam.c: Set the sense residual properly.

iir.c: Use scsi_set_sense_data() instead of building sense data by
hand.

iscsi_subr.c: Use scsi_extract_sense_len() instead of grabbing sense data
directly.

umass.c: Use scsi_set_sense_data() to build sense data.

Grab the sense key using scsi_get_sense_key().

Calculate the sense residual properly.

isp_freebsd.h: Use scsi_get_*() routines to grab asc, ascq, and sense key
values.

Calculate and set the sense residual.

MFC after: 3 days
Sponsored by: Spectra Logic Corporation


225877 29-Sep-2011 mav

Add header missed in r225875.

MFC after: 3 days


225875 29-Sep-2011 mav

Handle the race in cpu_idle() when due to the critical section CPU could get
into sleep after receiving interrupt, delaying interrupt thread execution
indefinitely until the next interrupt arrive.

Reviewed by: nwhitehorn
MFC after: 3 days


225841 28-Sep-2011 kib

Remove locking of the vm page queues from several pmaps, which only
protected the dirty mask updates. The dirty mask updates are handled
by atomics after the r225840.

Submitted by: alc
Tested by: flo (sparc64)
MFC after: 2 weeks


225617 16-Sep-2011 kmacy

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


225474 11-Sep-2011 kib

Inline the syscallenter() and syscallret(). This reduces the time measured
by the syscall entry speed microbenchmarks by ~10% on amd64.

Submitted by: jhb
Approved by: re (bz)
MFC after: 2 weeks


225418 06-Sep-2011 kib

Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic
flags field. Updates to the atomic flags are performed using the atomic
ops on the containing word, do not require any vm lock to be held, and
are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9)
functions are provided to modify afalgs.

Document the changes to flags field to only require the page lock.

Introduce vm_page_reference(9) function to provide a stable KPI and
KBI for filesystems like tmpfs and zfs which need to mark a page as
referenced.

Reviewed by: alc, attilio
Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
Approved by: re (bz)


225214 27-Aug-2011 rwatson

Follow up to r225203 refining break-to-debugger run-time configuration
improvements:

(1) Implement new model in previously missed at91 UART driver
(2) Move BREAK_TO_DEBUGGER and ALT_BREAK_TO_DEBUGGER from opt_comconsole.h
to opt_kdb.h (spotted by np)
(3) Garbage collect now-unused opt_comconsole.h

MFC after: 3 weeks
Approved by: re (bz)


225203 26-Aug-2011 rwatson

Attempt to make break-to-debugger and alternative break-to-debugger more
accessible:

(1) Always compile in support for breaking into the debugger if options
KDB is present in the kernel.

(2) Disable both by default, but allow them to be enabled via tunables
and sysctls debug.kdb.break_to_debugger and
debug.kdb.alt_break_to_debugger.

(3) options BREAK_TO_DEBUGGER and options ALT_BREAK_TO_DEBUGGER continue
to behave as before -- only now instead of compiling in
break-to-debugger support, they change the default values of the
above sysctls to enable those features by default. Current kernel
configurations should, therefore, continue to behave as expected.

(4) Migrate alternative break-to-debugger state machine logic out of
individual device drivers into centralised KDB code. This has a
number of upsides, but also one downside: it's now tricky to release
sio spin locks when entering the debugger, so we don't. However,
similar logic does not exist in other device drivers, including uart.

(5) dcons requires some special handling; unlike other console types, it
allows overriding KDB's own debugger selection, so we need a new
interface to KDB to allow that to work.

GENERIC kernels in -CURRENT will now support break-to-debugger as long as
appropriate boot/run-time options are set, which should improve the
debuggability of BETA kernels significantly.

MFC after: 3 weeks
Reviewed by: kib, nwhitehorn
Approved by: re (bz)


224857 14-Aug-2011 nwhitehorn

Add support for the Blu-Ray drive found in the Sony Playstation 3 and fix
some realted minor bugs in PS3 internal storage support.

Submitted by: glevand <geoffrey.levand@mail.ru>
Approved by: re (bz)


224746 09-Aug-2011 kib

- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag
to VPO_UNMANAGED (and also making the flag protected by the vm object
lock, instead of vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and
VPO_UNMANAGED. As a consequence, pmap code now can use use just
VPO_UNMANAGED to decide whether the page is unmanaged.

Reviewed by: alc
Tested by: pho (x86, previous version), marius (sparc64),
marcel (arm, ia64, powerpc), ray (mips)
Sponsored by: The FreeBSD Foundation
Approved by: re (bz)


224618 02-Aug-2011 marcel

Cross a T and dot an I:
o Fix awkward use of braces in combination with mis-indentation.
A mistake, that happened to yield the right behaviour?
o Fix typo in comment.

No functional change.

Approved by: re (blanket)


224617 02-Aug-2011 marcel

It's invalid to use GLOBAL() for kernload_ap, as the macro switches
to the .data section. We need kernload_ap in the boot page.

Approved by: re (blanket)


224616 02-Aug-2011 marcel

There's no ':' after GLOBAL(). Missed due to no SMP testing.

Approved by: re (blanket)


224611 02-Aug-2011 marcel

Add support for Juniper's loader. The difference between FreeBSD's and
Juniper's loader is that Juniper's loader maps all of the kernel and
preloaded modules at the right virtual address before jumping into the
kernel. FreeBSD's loader simply maps 16MB using the physical address
and expects the kernel to jump through hoops to relocate itself to
it's virtual address. The problem with the FreeBSD loader's approach is
that it typically maps too much or too little. There's no harm if it's
too much (other than wasting space), but if it's too little then the
kernel will simply not boot, because the first thing the kernel needs
is the bootinfo structure, which is never mapped in that case. The page
fault that early is fatal.

The changes constitute:
1. Do not remap the kernel in locore.S. We're mapped where we need to
be so we can pretty much call into C code after setting up the
stack.
2. With kernload and kernload_ap not set in locore.S, we need to set
them in pmap.c: kernload gets defined when we preserve the TLB1.
Here we also determine the size of the kernel mapped. kernload_ap
is set first thing in the pmap_bootstrap() method.
3. Fix tlb1_map_region() and its use to properly externd the mapped
kernel size to include low-level data structures.

Approved by: re (blanket)
Obtained from: Juniper Networks, Inc


224553 31-Jul-2011 marcel

Apply r221124 to Book-E: switch to the new NFS client.

Approved by: re (blanket)


224552 31-Jul-2011 marcel

Fix r222813: we need to include sys/cpuset.h. because the PIC interface
uses cpuset_t. While here, fix the redundant inclusion of sys/bus.h and
order the includes.

Approved by: re (blanket)


224551 31-Jul-2011 marcel

Fix r224187: .word defines a 16-bit object and size_t is defined as
a 32-bit intergal. Use .long to define sintrcnt and sintrname.

Approved by: re (blanket)


224505 30-Jul-2011 nwhitehorn

Fix an error that could cause sysctl -a to enter an infinite loop in the
event of a broken or busy fan due to returning incorrect error codes from
the FCU sysctl handler.

Reported by: Path Mather <paul at gromit dot dlib dot vt dot edu>1
Approved by: re (kib)


224400 25-Jul-2011 andreast

This a follow up commit from r224216 for powerpc 32-bit. Increase
the storage size for sintrcnt/sintrnames to .long.

Reviewed by: nwhitehorn
Approved by: re (kib)


224216 19-Jul-2011 attilio

On 64 bit architectures size_t is 8 bytes, thus it should use an 8 bytes
storage.
Fix the sintrcnt/sintrnames specification.

No MFC is previewed for this patch.

Reported, reviewed and tested by: marcel
Approved by: re (kib)


224207 19-Jul-2011 attilio

Add the possibility to specify from kernel configs MAXCPU value.
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.

No MFC is previewed for this patch.

Tested by: pluknet
Approved by: re (kib)


224187 18-Jul-2011 attilio

- Remove the eintrcnt/eintrnames usage and introduce the concept of
sintrcnt/sintrnames which are symbols containing the size of the 2
tables.
- For amd64/i386 remove the storage of intr* stuff from assembly files.
This area can be widely improved by applying the same to other
architectures and likely finding an unified approach among them and
move the whole code to be MI. More work in this area is expected to
happen fairly soon.

No MFC is previewed for this patch.

Tested by: pluknet
Reviewed by: jhb
Approved by: re (kib)


224019 14-Jul-2011 nwhitehorn

Enable PREEMPTION for PowerPC/AIM generic kernels. The last known PREEMPTION
bug on PowerPC was resolved by r223485, and it appears to run stably at this
point.


223792 05-Jul-2011 nwhitehorn

Follow Linux by unconditionally stripping the RX vlan tag from incoming
packets. It turns out that all firmware versions insert it, whether or not
they support VLAN tagging.

Submitted by: glevand <geoffrey.levand at mail dot ru>


223758 04-Jul-2011 attilio

With retirement of cpumask_t and usage of cpuset_t for representing a
mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.

Remove them and replace their usage with custom pc_cpuid magic (as,
atm, pc_cpumask can be easilly represented by (1 << pc_cpuid) and
pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).

This change is not targeted for MFC because of struct pcpu members
removal and dependency by cpumask_t retirement.

MD review by: marcel, marius, alc
Tested by: pluknet
MD testing by: marcel, marius, gonzo, andreast


223571 26-Jun-2011 nwhitehorn

Add better error handling for RTAS calls. These can potentially cause
machine checks (e.g. invalid PCI configuration cycles), but these can
be caught and recovered from. This change also the RTAS PCI driver to
work without modification as a replacement for the Grackle driver on
Grackle-based Powermacs.


223570 26-Jun-2011 nwhitehorn

Revert r223479. It is unnecessary and served only to slightly ameliorate
some manifestations of the bug actually fixed in r223485.


223555 26-Jun-2011 nwhitehorn

Turn the minimum PWM fan speed down to 30 from 40. It turns out the burning
smell that caused me to turn this up was due to a failed fan burning, not
a CPU (plus a healthy dose of paranoia).

Submitted by: Paul Mather <paul at gromit dot dlib dot vt dot edu>


223485 23-Jun-2011 nwhitehorn

Use the ABI-mandated thread pointer register (r2 for ppc32, r13 for ppc64)
instead of a PCPU field for curthread. This averts a race on SMP systems
with a high interrupt rate where the thread looking up the value of
curthread could be preempted and migrated between obtaining the PCPU
pointer and reading the value of pc_curthread, resulting in curthread being
observed to be the current thread on the thread's original CPU. This played
merry havoc with the system, in particular with mutexes. Many thanks to
jhb for helping me work this one out.

Note that Book-E is in principle susceptible to the same problem, but has
not been modified yet due to lack of Book-E hardware.

MFC after: 2 weeks


223479 23-Jun-2011 nwhitehorn

Clear any outstanding atomic reservations when traps are taken. This fixes
some interesting bugs (mostly on SMP systems) with atomic operations
silently failing in interrupt heavy situations, especially when using
overflow pages.


223471 23-Jun-2011 andreast

Fix merge typo.


223470 23-Jun-2011 andreast

Add leading zeros when printing the stackframe on __powerpc64__.


223463 23-Jun-2011 nwhitehorn

Use atomic operations to mask and unmask IRQs. This prevents a problem
(obvious in retrospect) in which interrupts on one CPU that are temporarily
masked can end up permanently masked when a handler on another CPU clobbers
the interrupt mask register with an old copy.


223462 23-Jun-2011 nwhitehorn

Use 4 KB pages for storage bus devices, which seems to be what the HV uses
internally.


223461 23-Jun-2011 nwhitehorn

Rework the PS3 disk driver to support NCQ and do its DMA a little
differently.


223460 23-Jun-2011 nwhitehorn

Add hypervisor call error codes.


223406 22-Jun-2011 nwhitehorn

This is more complicated than I expected. Storage devices need the IOMMU
set up, but must not use it.


223404 22-Jun-2011 nwhitehorn

The IOMMU is not involved for the storage bus.


223324 20-Jun-2011 nwhitehorn

Work/hack around some race conditions present in the hardware/HV interface.
Partially inspired by a patch from glevand (geoffrey.levand@mail.ru).


223316 20-Jun-2011 nwhitehorn

Make this slightly less yelly about regions that the hypervisor protects
from us by not registering them as disks.


223314 20-Jun-2011 nwhitehorn

Add an OHCI driver to complement the EHCI one. The infrastructure to attach
both to the parent ps3bus was in r223313. This driver itself comes from the
ps3 project branch.


223313 20-Jun-2011 nwhitehorn

Driver for PS3's internal hard disk. Hopefully this can be CAM-ified in
the future, but presents a set of simple block devices for now. With
(forthcoming) boot loader support or vfs.root.mountfrom, allows booting
PS3s from disk.

Submitted by: glevand <geoffrey.levand@mail.ru>


222982 11-Jun-2011 nwhitehorn

Follow up r222980 on PowerPC: add sound(4) and common device drivers
to PowerPC GENERIC (along with a small rearrangement).


222813 07-Jun-2011 attilio

etire the cpumask_t type and replace it with cpuset_t usage.

This is intended to fix the bug where cpu mask objects are
capped to 32. MAXCPU, then, can now arbitrarely bumped to whatever
value. Anyway, as long as several structures in the kernel are
statically allocated and sized as MAXCPU, it is suggested to keep it
as low as possible for the time being.

Technical notes on this commit itself:
- More functions to handle with cpuset_t objects are introduced.
The most notable are cpusetobj_ffs() (which calculates a ffs(3)
for a cpuset_t object), cpusetobj_strprint() (which prepares a string
representing a cpuset_t object) and cpusetobj_strscan() (which
creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are target to be removed soon.
With the moving from cpumask_t to cpuset_t they are now inefficient
and not really useful. Anyway, for the time being, please note that
access to pcpu datas is protected by sched_pin() in order to avoid
migrating the CPU while reading more than one (possible) word
- Please note that size of cpuset_t objects may differ between kernel
and userland. While this is not directly related to the patch itself,
it is good to understand that concept and possibly use the patch
as a reference on how to deal with cpuset_t objects in userland, when
accessing kernland members.
- KTR_CPUMASK is changed and now is represented through a string, to be
set as the example reported in NOTES.

Please additively note that no MAXCPU is bumped in this patch, but
private testing has been done until to MAXCPU=128 on a real 8x8x2(htt)
machine (amd64).

Please note that the FreeBSD version is not yet bumped because of
the upcoming pcpu changes. However, note that this patch is not
targeted for MFC.

People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested
several revision of the patches and really helped in improving
stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed
patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted review on some specific part of the
patch.
- marius, art, nwhitehorn and andreast reviewed MD specific part of
the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific
implementations of the patch.
- Other people have made contributions on other patches that have been
already committed and have been listed separately.

Companies that should be mentioned for having participated at several
degrees:
- Yahoo! for having offered the machines used for testing on big
count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance,
which has been instrumental.
- Sandvine for having offered offices and infrastructure during
development.

(I really hope I didn't forget anyone, if it happened I apologize in
advance).


222686 04-Jun-2011 andreast

Add new fan controller driver for the G4 MDD PowerMac. Submitted and tested
by Justin Hibbits.

Approved by: nwhitehorn (mentor)


222675 04-Jun-2011 andreast

- Improve error handling.
- Add retry loops for the i2c read/write functions.

Approved by: nwhitehorn (mentor)


222667 04-Jun-2011 nwhitehorn

Retry the memory map-related portions of r222613, written by andreast,
after some minor tweaks and an increase in the early-boot stack space in
r222632.


222666 04-Jun-2011 nwhitehorn

Fix a typo derived from a mismerge from mmu_oea that would cause
pmap_sync_icache() to sync random (possibly uncached or nonexisting!)
memory, causing kernel page faults or machine checks, most easily
triggered by using GDB. While here, add an additional safeguard to only
sync cacheable memory.

MFC after: 2 days


222659 03-Jun-2011 andreast

- Introduce a define for ZERO_C_TO_K.
- Fix the printing of the temperature when we exceed the critical value.

Approved by: nwhitehorn (mentor)


222632 03-Jun-2011 nwhitehorn

Quantities stored on the stack on ppc64 tend to be twice as large as on
ppc32, so make the early stack correspondingly twice as big.


222622 02-Jun-2011 nwhitehorn

Temporarily back out those parts of r222613 related to parsing the memory
map. They cause non-understood boot failures on some Apple machines with
more than 2 GB of RAM (like my work desktop).


222620 02-Jun-2011 nwhitehorn

The POWER7 has only 32 SLB slots instead of 64, like other supported
64-bit PowerPC CPUs. Add infrastructure to support variable numbers of
SLB slots and move the user slot from 63 to 0, so that it is always
available.


222618 02-Jun-2011 nwhitehorn

If running under a hypervisor, don't yell at the user about starting
unknown CPU types, instead relying on the hypervisor to have given us a
reasonable environment.


222616 02-Jun-2011 nwhitehorn

Explicitly initialize the first thread's MSR to PSL_KERNSET.


222615 02-Jun-2011 nwhitehorn

Include the modules area in the mapped kernel code. This fixes the kernel's
access to modules and loader metadata when started from real mode, but
without a direct map.


222614 02-Jun-2011 nwhitehorn

Remove some dead code: unnecessary isyncs and memory sorting, which are
handled in mtmsr() and mem_regions(), respectively.


222613 02-Jun-2011 nwhitehorn

MFpseries:
Renovate and improve the AIM Open Firmware support:
- Add RTAS (Run-Time Abstraction Services) support, found on all IBM systems
and some Apple ones
- Improve support for 32-bit real mode Open Firmware systems
- Pull some more OF bits over from the AIM directory
- Fix memory detection on IBM LPARs and systems with more than one /memory
node (by andreast@)


222531 31-May-2011 nwhitehorn

On multi-core, multi-threaded PPC systems, it is important that the threads
be brought up in the order they are enumerated in the device tree (in
particular, that thread 0 on each core be brought up first). The SLIST
through which we loop to start the CPUs has all of its entries added with
SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration
and so AP startup would always fail in such situations (causing a machine
check or RTAS failure). Fix this by changing the SLIST into an STAILQ,
and inserting new CPUs at the end.

Reviewed by: jhb


222469 29-May-2011 nwhitehorn

Use kproc_exit() instead of returning from the management function on
systems with no manageable thermal control devices.


222463 29-May-2011 nwhitehorn

Add some error handling here: if a sensor returns an error code (a negative
Kelvin temperature, which is impossible except for some contrived magnetic
spin systems), use the previous measurement from that sensor instead of
corrupting everything and randomly changing the fans or shutting off the
machine.


222462 29-May-2011 nwhitehorn

Add the next digit of precision to temperatures, which I missed when
converting the reporting format from degrees C to 0.1 degree K.


222460 29-May-2011 nwhitehorn

Don't put negative values into the averages.


222458 29-May-2011 nwhitehorn

Update the I2C-based temperature/fan drivers to connect to the Powermac
thermal control module. This provides automatic fan management on all G5
PowerMacs and Xserves.


222449 29-May-2011 andreast

Add a new driver, the ad7417, to read temperatures and voltages on some
PowerMac's.

Approved by: nwhitehorn (mentor)


222434 29-May-2011 marcel

The P4080 has 8 cores. Bump MAXCPU to 8 to match.


222433 29-May-2011 marcel

o Add system versions for the P4040(E) and P4080(E).
o In bare_probe(), change the logic that determines the maximum
number of processors/cores into a switch statement and take
advantage of the fact that bit 3 of the SVR value indicates
whether we're running on a security enabled version. Since we
don't care about that here, mask the bit. All -E versions
are taken care of automatically.


222431 28-May-2011 nwhitehorn

Adapt smusat(4) to use powermac_thermal. This provides automatic fan
management on dual- and quad-core Powermac G5s, and the last G5 iMacs.


222430 28-May-2011 nwhitehorn

Require an error instead of a timeout to decide the new-style fan
commands won't work. This prevents a busy system from making smu(4)
suddenly decide its fans use the old-style command set.

MFC after: 3 days


222429 28-May-2011 nwhitehorn

Factor out the SMU fan management code into a new module (powermac_thermal)
that will connect all of the various sensors and fan control modules on
Apple hardware with software-controlled fans (e.g. all G5 systems).

MFC after: 1 month


222428 28-May-2011 marcel

o Determine the number of LAWs in a way the is future proof. Only the
MPC8555(E) has 8 LAWs, so don't make that the default case. Current
processors have 12 LAWs so use that as the default instead.
o Determine the target ID of the PCI/PCI-X and PCI-E controllers in
a way that's more future proof. There's almost a perfect mapping
from HC register offset to target ID, so use that as the default.
Handle the MPC8548(E) specially, since it has a non-standard target
ID for the PCI-E controller. Don't worry about whether the processor
implements the target ID here, because we should not get called for
PCI/PCI-X or PCI-E host controllers that don't exist.


222426 28-May-2011 marcel

Remove unused defines. They're distracting...


222400 28-May-2011 marcel

Better support different kernel hand-offs. When loaded directly
from U-Boot, the kernel is passed a standard argc/argv pair.
The Juniper loader passes the metadata pointer as the second
argument and passes 0 in the first. The FreeBSD loader passes
the metadata pointer in the first argument.

As such, have locore preserve the first 2 arguments in registers
r30 & r31. Change e500_init() to accept these arguments. Don't
pass global offsets (i.e. kernel_text and _end) as arguments to
e500_init(). We can reference those directly.

Rename e500_init() to booke_init() now that we're changing the
prototype.

In booke_init(), "decode" arg1 and arg2 to obtain the metadata
pointer correctly. For the U-Boot case, clear SBSS and BSS and
bank on having a static FDT for now. This allows loading the
ELF kernel and jumping to the entry point without trampoline.


222392 27-May-2011 marcel

o The P1020(E) & P2020(E) also have two cores. This conditional has
a tendency to grow unwieldy so we may want to revisit this in due
time.
o Simplify the CPU reset function by writing to the reset control
register irrespective of whether the CPU has one and automatically
falling back to the debug control register if we didn't reset the
CPU. The side-effect is that we now properly reset future processors
without first having to add the system version to the list.


222391 27-May-2011 marcel

Wire the kernel using TLB1 entry 0 rather than entry 1. A more recent
U-Boot as found on the P1020RDB doesn't like it when we use entry 1
(for some reason) whereas an older U-Boot doesn't mind if we use entry
0. If anything else, this simplifies the code a bit.


222340 27-May-2011 marcel

o Swap the SVR numbers for MPC8533 & MPC8533E
o Add SVR defines for P1011(E), P1020(E), P2010(E) & P2020(E)


222327 26-May-2011 marcel

Don't assume we have a valid bootinfo pointer.


222309 26-May-2011 nwhitehorn

Add a missing isync.


222239 24-May-2011 nwhitehorn

Add RTC support for the LV1 clock on the PS3. The hypervisor won't let us
set it, but it's better than nothing.


222198 22-May-2011 attilio

Merge r221614,221696,221737,221840 from largeSMP project branch:
Rewrite atomic operations for powerpc in order to achieve the following:
- Produce a type-clean implementation (in terms of functions arguments
and returned values) for the primitives.
- Fix errors with _long() atomics where they ended up with the wrong
arguments to be accepted.
- Follow the sys/type.h specifics that define the numbered types starting
from standard C types.
- Let _ptr() version to not auto-magically cast arguments, but leave
the burden on callers, as _ptr() atomic is intended to be used
relatively rarely.

Fix cfi in order to support the latest point.

In collabouration with: bde
Tested by: andreast, nwhitehorn, jceel
MFC after: 2 weeks


222070 18-May-2011 attilio

Revert r222069,222068 as they were intended to be committed to the
largeSMP branch.

Reported by: pluknet


222069 18-May-2011 attilio

Fix warning spit out.

Reported by: sbruno


222068 18-May-2011 attilio

Fix newly introduced code.

Reported by: sbruno


221988 16-May-2011 nwhitehorn

Fix a </<= mixup. This could result in suboptimal performance on the last
page of physical memory.


221981 16-May-2011 nwhitehorn

Remove a useless check that served only to make 64-bit PPC systems
unbootable after r221855.

Submitted by: andreast
MFC after: 1 week


221855 13-May-2011 mdf

Move the ZERO_REGION_SIZE to a machine-dependent file, as on many
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk. Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.

Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file. (Darn the unfamiliar development
environment).

Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.

Requested by: alc
MFC after: 1 week
MFC with: r221853


221738 10-May-2011 nwhitehorn

Only try to set up IPIs at boot on systems that actually have more than one
CPU. This fixes a panic observed on Heathrow-based systems without
SMP-capable PICs when the kernel had both options SMP and INVARIANTS.

MFC after: 5 days


221550 06-May-2011 nwhitehorn

SMP has worked perfectly for a very long time on 32-bit PowerPC on both
UP and SMP hardware. Enable it in GENERIC.

MFC after: 2 weeks


221526 06-May-2011 jhb

Retire isa_setup_intr() and isa_teardown_intr() and use the generic bus
versions instead. They were never needed as bus_generic_intr() and
bus_teardown_intr() had been changed to pass the original child device up
in 42734, but the ISA bus was not converted to new-bus until 45720.


221519 06-May-2011 nwhitehorn

Do not use Open Firmware to open the device and instead program its start
on our own. This prevents hangs at boot when using a bm(4) NIC where the
cable is not plugged in at boot time.

Obtained from: NetBSD
MFC after: 1 week


221173 28-Apr-2011 attilio

Add the watchdogs patting during the (shutdown time) disk syncing and
disk dumping.
With the option SW_WATCHDOG on, these operations are doomed to let
watchdog fire, fi they take too long.

I implemented the stubs this way because I really want wdog_kern_*
KPI to not be dependant by SW_WATCHDOG being on (and really, the option
only enables watchdog activation in hardclock) and also avoid to
call them when not necessary (avoiding not-volountary watchdog
activations).

Sponsored by: Sandvine Incorporated
Discussed with: emaste, des
MFC after: 2 weeks


221124 27-Apr-2011 rmacklem

This patch changes head so that the default NFS client is now the new
NFS client (which I guess is no longer experimental). The fstype "newnfs"
is now "nfs" and the regular/old NFS client is now fstype "oldnfs".
Although mounts via fstype "nfs" will usually work without userland
changes, an updated mount_nfs(8) binary is needed for kernels built with
"options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and
mount(8) binaries are needed to do mounts for fstype "oldnfs".
The GENERIC kernel configs have been changed to use options
NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER.
For kernels being used on diskless NFS root systems, "options NFSCL"
must be in the kernel config.
Discussed on freebsd-fs@.


220982 24-Apr-2011 mav

Switch the GENERIC kernels for all architectures to the new CAM-based ATA
stack. It means that all legacy ATA drivers are disabled and replaced by
respective CAM drivers. If you are using ATA device names in /etc/fstab or
other places, make sure to update them respectively (adX -> adaY,
acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential
numbers for each type in order of detection, unless configured otherwise
with tunables, see cam(4)).

ataraid(4) functionality is now supported by the RAID GEOM class.
To use it you can load geom_raid kernel module and use graid(8) tool
for management. Instead of /dev/arX device names, use /dev/raid/rX.


220818 19-Apr-2011 andreast

Add leading zeros when printing the physical memory chunks on __powerpc64__.

Approved by: nwhitehorn (mentor)


220642 14-Apr-2011 andreast

Adjust debugging string to match the actual function.

Approved by: nwhitehorn (mentor)


220639 14-Apr-2011 andreast

The macro MOEA_PVO_CHECK is empty and not used. It is a left over from the
NetBSD import. Remove the definition and all its occurrences.

Approved by: nwhitehorn (mentor)


220638 14-Apr-2011 andreast

Add stoppcbs[] arrays on powerpc(64) and have each CPU save its
current context in the IPI_STOP handler. Similar as done on other
architectures.

Approved by: nwhitehorn (mentor)


220597 13-Apr-2011 nwhitehorn

Make sure that extra threads in 32-bit processes stay in 32-bit mode. This
fixes operation of threaded 32-bit binaries on 64-bit kernels.


219718 17-Mar-2011 andreast

Remove duplicate definition of FIRSTARG.

Approved by: nwhitehorn (mentor)


219624 13-Mar-2011 nwhitehorn

Don't sleep while setting the clock. This can cause panics when
periodic_resettodr() calls CLOCK_SETTIME() and smu tries to sleep while
running from a callout.

Reported by: Torfinn Ingolfsen


219523 11-Mar-2011 mdf

Mostly revert r219468, as I had misremembered the C standard regarding
the size of an extern array.

Keep one change from strncpy to strlcpy.


219468 10-Mar-2011 mdf

Use MAXPATHLEN rather than the size of an extern array when copying the
kernel name. Also consistenly use strlcpy().

Suggested by: Warner Losh


219428 09-Mar-2011 nwhitehorn

Fix whitespace nit.


219405 08-Mar-2011 dchagin

Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhadler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with: kib

MFC after: 2 Week


218824 18-Feb-2011 nwhitehorn

Turn off default generation of userland dot symbols on powerpc64 now that
we have a binutils that supports it. Kernel dot symbols remain on to assist
DDB.


218773 17-Feb-2011 alc

Remove pmap fields that are either unused or not fully implemented.

Discussed with: kib


218195 02-Feb-2011 mdf

Put the general logic for being a CPU hog into a new function
should_yield(). Use this in various places. Encapsulate the common
case of check-and-yield into a new function maybe_yield().

Change several checks for a magic number of iterations to use
should_yield() instead.

MFC after: 1 week


218184 02-Feb-2011 marcel

Rename INTR_VEC to MAP_IRQ. From the OFW or FDT we obtain a
PIC handle with interrupt pin. This we map to the resource
called SYS_RES_IRQ.


218081 29-Jan-2011 nwhitehorn

Fix boot on SMP systems after r218075 by delaying CPU binding until a
SYSINIT.

Reviewed by: marcel


218075 29-Jan-2011 marcel

Fix the interrupt code, broken 7 months ago. The interrupt framework
already supported nested PICs, but was limited to having a nested
AT-PIC only. With G5 support the need for nested OpenPIC controllers
needed to be added. This was done the wrong way and broke the MPC8555
eval system in the process.

OFW, as well as FDT, describe the interrupt routing in terms of a
controller and an interrupt pin on it. This needs to be mapped to a
flat and global resource: the IRQ. The IRQ is the same as the PCI
intline and as such needs to be representable in 8 bits. Secondly,
ISA support pretty much dictates that IRQ 0-15 should be reserved
for ISA interrupts, because of the internal workins of south bridges.
Both were broken.

This change reverts revision 209298 for a big part and re-implements
it simpler. In particular:
o The id() method of the PIC I/F is removed again. It's not needed.
o The openpic_attach() function has been changed to take the OFW
or FDT phandle of the controller as a second argument. All bus
attachments that previously used openpic_attach() as the attach
method of the device I/F now implement as bus-specific method
and pass the phandle_t to the renamed openpic_attach().
o Change powerpc_register_pic() to take a few more arguments. In
particular:
- Pass the number of IPIs specificly. The number of IRQs carved
out for a PIC is the sum of the number of int. pins and IPIs.
- Pass a flag indicating whether the PIC is an AT-PIC or not.
This tells the interrupt framework whether to assign IRQ 0-15
or some other range.
o Until we implement proper multi-pass bus enumeration, we have to
handle the case where we need to map from PIC+pin to IRQ *before*
the PIC gets registered. This is done in a similar way as before,
but rather than carving out 256 IRQs per PIC, we carve out 128
IRQs (124 pins + 4 IPIs). This is supposed to handle the G5 case,
but should really be fixed properly using multiple passes.
o Have the interrupt framework set root_pic in most cases and not
put that burden in PIC drivers (for the most part).
o Remove powerpc_ign_lookup() and replace it with powerpc_get_irq().
Remove IGN_SHIFT, INTR_INTLINE and INTR_IGN.

Related to the above, fix the Freescale PCI controller driver, broken
by the FDT code. Besides not attaching properly, bus numbers were
assigned improperly and enumeration was broken in general. This
prevented the AT PIC from being discovered and interrupt routing to
work properly. Consequently, the ata(4) controller stopped functioning.

Fix the driver, and FDT PCI support, enough to get the MPC8555CDS
going again. The FDT PCI code needs a whole lot more work.

No breakages are expected, but lackiong G5 hardware, it's possible
that there are unpleasant side-effects. At least MPC85xx support is
back to where it was 7 months ago -- it's amazing how badly support
can be broken in just 7 months...

Sponsored by: Juniper Networks


218074 29-Jan-2011 marcel

Have nexus behave the same as the one on ARM (marvell SoCs), so as to
prevent warnings during boot WRT to the fdtbus attachment.


218073 29-Jan-2011 marcel

Introduce macro FDT_MAP_IRQ to map from an interrupt controller and
interrupt pin pair to a global IRQ number. When multiple PICs exist
on a board, the interrupt pin alone is not unique.


217896 26-Jan-2011 dchagin

Add macro to test the sv_flags of any process. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures.

MFC after: 1 month


217756 23-Jan-2011 nwhitehorn

Disable ATAPI DMA unconditionally on Apple Kauai ATA controllers, like it
is on the MacIO ones. It appears to be unreliable on all DBDMA-based
controllers for unknown reasons, which should be figured out eventually.

Tested by: Torfinn Ingolfsen
MFC after: 1 week


217688 21-Jan-2011 pluknet

Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.

Submitted by: perryh pluto.rain.com (previous version)
Reviewed by: jhb
Approved by: kib (mentor)
Tested by: universe


217659 20-Jan-2011 andreast

Remove unused variables. Spotted by a cppcheck
(devel/cppcheck, http://sourceforge.net/projects/cppcheck) run.

Approved by: nwhitehorn (mentor)


217658 20-Jan-2011 andreast

Correct parsing of the grackle and uninorthpci ranges property.

Approved by: nwhitehorn (mentor)


217639 20-Jan-2011 nwhitehorn

Correct parsing of the cpcht ranges property.

Submitted by: andreast
MFC after: 2 weeks


217561 18-Jan-2011 kib

For architectures not using direct map , and requiring real KVA page for
sf buf allocation, use wakeup() instead of wakeup_one() to notify sf
buffer waiters about free buffer.

sf_buf_alloc() calls msleep(PCATCH) when SFB_CATCH flag was given,
and for simultaneous wakeup and signal delivery, msleep() returns
EINTR/ERESTART despite the thread was selected for wakeup_one(). As
result, we loose a wakeup, and some other waiter will not be woken up.

Reported and tested by: az
Reviewed by: alc, jhb
MFC after: 1 week


217523 17-Jan-2011 marcel

Support booting non FDT-capable loaders:
1. Allow embedding the FDT into the kernel, just like PowerPC/book-E.
2. If the loader passes us a pointer to the bootinfo structure, save
it and use it to fill in the gaps (e.g. bus frequencies, etc).


217515 17-Jan-2011 jkim

Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set().
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.

MFC after: 1 month


217459 15-Jan-2011 marcel

Don't redefine MODINFOMD_BOOTINFO as MODINFOMD_DTBP. This
breaks support for older loaders. Add MODINFOMD_DTBP as
a new tag instead.


217451 15-Jan-2011 andreast

Remove unused variables. Spotted by a cppcheck
(devel/cppcheck, http://sourceforge.net/projects/cppcheck) run.

Approved by: nwhitehorn (mentor)


217400 14-Jan-2011 kib

Enable shared page for the signal trampolines on PowerPC.

Reviewed and tested by: nwhitehorn


217341 13-Jan-2011 nwhitehorn

Fix handling of NX pages on capable CPUs. Thanks to kib for prodding me
in the right direction.


217286 11-Jan-2011 andreast

Add new functions, fcu_fan_set_pwm and fcu_fan_get_pwm, to set and get
the pwm values. We can now set the fan's speed of a PWM controlled fan
with % numbers between 30 and 100 % instead of trying to model a
% number based on rpm.
The fcu chip offers both, the dutycycle and the rpm value of the PWM
controlled fans. I added the rpm value to the list of information
available via sysctl(8).

Tested by: Paul Mather <paul at gromit dlib vt edu>

Approved by: nwhitehorn (mentor)


217265 11-Jan-2011 jhb

Remove unneeded includes of <sys/linker_set.h>. Other headers that use
it internally contain nested includes.

Reviewed by: bde


217192 09-Jan-2011 kib

Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h.
Update the outdated comments describing MAXSLP and the process
selection algorithm for swap out.

Comments wording and reviewed by: alc


217181 09-Jan-2011 das

We don't support any floating point types larger than double on
powerpc, so DECIMAL_DIG should be 17.


217156 08-Jan-2011 tijl

White space changes to align comments. The mips and powerpc _inttypes.h
are now exactly the same.

Approved by: kib (mentor)


217155 08-Jan-2011 tijl

Rename PRIreg helper macro to PRIptr to better reflect its use. Registers
and pointers don't always have the same size, e.g. the __mips_n32 ABI
(ILP32) has 64 bit registers but 32 bit pointers.

On mips introduce PRIptr to fix the format specifier for (u)intptr_t.

Prefix PRI64 and PRIptr with underscores because macro names starting with
PRI[a-zX] are reserved for future use.

Approved by: kib (mentor)


217147 08-Jan-2011 tijl

On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]

Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.

Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.

Suggested by: bde [1]
Approved by: kib (mentor)


217146 08-Jan-2011 tijl

On 32 bit architectures define (u)int64_t as (unsigned) long long instead
of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better
with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t
has type (unsigned) long long.

The mode attribute was used because long long wasn't standardised until
C99. Nowadays compilers should support long long and use of the mode
attribute is discouraged according to GCC Internals documentation.

The type definition has to be marked with __extension__ to support
compilation with "-std=c89 -pedantic".

Discussed with: bde
Approved by: kib (mentor)


217145 08-Jan-2011 tijl

Fix types of some values in machine/_limits.h.

On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.

Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.

While here, correct some comments.

Reviewed by: bde
Approved by: kib (mentor)


217128 07-Jan-2011 tijl

Remove unused support for 64 bit long on 32 bit architectures.

It was used mainly to discover and fix some 64-bit portability problems
before 64-bit arches were widely available.

Discussed with: bde
Approved by: kib (mentor)


217097 07-Jan-2011 kib

Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the
initial stack protection set by the kernel image activator.


217072 06-Jan-2011 jhb

Remove bogus usage of INTR_FAST. "Fast" interrupts are now indicated by
registering a filter handler rather than a threaded handler. Also remove
a bogus use of INTR_MPSAFE for a filter.


217065 06-Jan-2011 andreast

Remove unused variables. Spotted by a cppcheck
(devel/cppcheck, http://sourceforge.net/projects/cppcheck) run.

Approved by: nwhitehorn (mentor)


217054 06-Jan-2011 nwhitehorn

Unbreak the LINT build. PS3 kernels can only be built 64-bit, and LINT is
built for both architectures. We need a better solution here.


217044 06-Jan-2011 nwhitehorn

Import support for the Sony Playstation 3 using the OtherOS feature
available on firmwares 3.15 and earlier.

Caveats: Support for the internal SATA controller is currently missing,
as is support for framebuffer resolutions other than 720x480. These
deficiencies will be remedied soon.

Special thanks to Peter Grehan for providing the hardware that made this
port possible, and thanks to Geoff Levand of Sony Computer Entertainment
for advice on the LV1 hypervisor.


217027 05-Jan-2011 andreast

Fix null string handling in ofw_real_nextprop function. Pass the right
length to ofw_real_map in case of a null string.
This makes ofwdump(8) work correctly when trying to print all properties
with ofwdump -p.

Approved by: nwhitehorn (mentor)


216765 28-Dec-2010 nwhitehorn

Only keep track of PTE validity statistics for pages not locked in the
table. The 'locked' attribute is used to circumvent the regular page table
locking for some special pages, with the result that including locked pages
here causes races when updating the stats.


216589 20-Dec-2010 nwhitehorn

Memory can be laid out with large gaps on 64-bit PowerPC, so switch to
VM_PHYSSEG_SPARSE.


216563 19-Dec-2010 nwhitehorn

Garbage-collect unused variable.


216383 11-Dec-2010 nwhitehorn

Add some isync()s related to the 64-bit MMU scratch page to avoid race
conditions on its invalidation.


216193 05-Dec-2010 nwhitehorn

Switch which software-reserved bit is used to designate a locked PTE
to correspond to the definition used by the PAPR spec so that its PTE
insertion algorithm will properly respect it.


216174 04-Dec-2010 nwhitehorn

Add an abstraction layer to the 64-bit AIM MMU's page table manipulation
logic to support modifying the page table through a hypervisor. This
uses KOBJ inheritance to provide subclasses of the base 64-bit AIM MMU
class with additional methods for page table manipulation.

Many thanks to Peter Grehan for suggesting this design and implementing
the MMU KOBJ inheritance mechanism.


216154 03-Dec-2010 nwhitehorn

Provide a simple IOMMU framework on PowerPC, which is required to support
PPC hypervisors.


216143 03-Dec-2010 brucec

Revert r216134. This checkin broke platforms where bus_space are macros:
they need to be a single statement, and do { } while (0) doesn't work in this
situation so revert until a solution can be devised.


216134 02-Dec-2010 brucec

Disallow passing in a count of zero bytes to the bus_space(9) functions.

Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.

PR: kern/80980
Discussed with: jhb


216122 02-Dec-2010 nwhitehorn

Define bswap macros for constants to allow the compiler to pre-compute
byte-swapped versions of compile-time constants. This allows use of
bswap() and htole*() in initializers, which is required to cross-build
btxld.

Obtained from: sparc64


216083 30-Nov-2010 marius

Several chipset drivers alter parameters relevant for the DMA tag creation,
i.e. alignment, max_address, max_iosize and segsize (only max_address is
thought to have an negative impact regarding this issue though), after
calling ata_dmainit() either directly or indirectly so these values have
no effect or at least no effect on the DMA tags and the defaults are used
for the latter instead. So change the drivers to set these parameters
up-front and ata_dmainit() to honor them.

This file was missed in r216013.

Submitted by: nwhitehorn


215701 22-Nov-2010 dim

After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files. A better long-term solution is
still being considered. This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.


215317 14-Nov-2010 dim

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.


215197 12-Nov-2010 nwhitehorn

Partially revert r215182. There appears to be a silicon bug on the 970
that causes AP bringup to fail if some of the Cell HID-register code
is anywhere in the instruction stream. Pending a better solution, cache
performance on SMP Cell systems running without a hypervisor will be
suboptimal.


215182 12-Nov-2010 nwhitehorn

Add CPU support code for the IBM Cell Broadband Engine.


215163 12-Nov-2010 nwhitehorn

Remove use of a separate ofw_pmap on 32-bit CPUs. Many Open Firmware
mappings need to end up in the kernel anyway since the kernel begins
executing in OF context. Separating them adds needless complexity,
especially since the powerpc64 and mmu_oea64 code gave up on it a long
time ago.

As a side effect, the PPC ofw_machdep code is no longer AIM-specific,
so move it to powerpc/ofw.


215160 12-Nov-2010 nwhitehorn

Remove or conditionalize some hypervisor-unfriendly instruction sequences.


215159 12-Nov-2010 nwhitehorn

Add some platform KOBJ extensions and continue integrating PowerPC
hypervisor infrastructure support:
- Fix coexistence of multiple platform modules in the same kernel
- Allow platform modules to provide an SMP topology
- PowerPC hypervisors limit the amount of memory accessible in real mode.
Allow the platform modules to specify the maximum real-mode address,
and modify the bits of the kernel that need to allocate
real-mode-accessible buffers to respect this limits.


215158 12-Nov-2010 nwhitehorn

Fix an error in r215067. An existing /chosen/mmu but missing translations
property just means we shouldn't add any translations, not that we should
panic.


215157 12-Nov-2010 nwhitehorn

Centralize CPU idle routines into powerpc/cpu.c and use the same
cpu_idle_hook mechanism that x86 uses for overriding the idle routine.
This is required for supporting ilding the CPU under PowerPC hypervisors.


215121 11-Nov-2010 raj

Fix typo in the comment.


215119 11-Nov-2010 raj

Use local TLB_UNLOCKED marker instead of MTX_UNOWNED for Book-E PowerPC trap
routines.

This unbreaks Book-E build after the recent machine/mutex.h removal.

While there move tlb_*lock() prototypes to machine/tlb.h.

Submitted by: jhb


215107 11-Nov-2010 nwhitehorn

Add support for the IMISS, DLMISS, and DSMISS traps required to run
FreeBSD on a G2 core.

PR: powerpc/111296
Submitted by: Andrew Turner


215101 10-Nov-2010 nwhitehorn

Entering deep nap mode on the 970MP requires that both MSR[NAP] and
MSR[DEEPNAP] be set, not just MSR[DEEPNAP]. Fixing this reduces the idle
temperature of my CPUs from 57 to 38 degrees and makes one-shot timer
mode work properly.

Hint from: mav
MFC after: 4 days


215100 10-Nov-2010 nwhitehorn

Disabling CPU NAP modes during SMU commands is a hack needed only on U3
systems. Don't use it on non-U3 systems to allow cpu_idle() to work
correctly.


215067 10-Nov-2010 nwhitehorn

Make AIM early-boot code function correctly without Open Firmware.


215054 09-Nov-2010 jhb

- Remove <machine/mutex.h>. Most of the headers were empty, and the
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.

Suggested by: bde (1, 2)


215052 09-Nov-2010 jhb

Remove unused includes of <sys/mutex.h> and <machine/mutex.h>.


214835 05-Nov-2010 jhb

Adjust the order of operations in spinlock_enter() and spinlock_exit() to
work properly with single-stepping in a kernel debugger. Specifically,
these routines have always disabled interrupts before increasing the nesting
count and restored the prior state of interrupts after decreasing the nesting
count to avoid problems with a nested interrupt not disabling interrupts
when acquiring a spin lock. However, trap interrupts for single-stepping
can still occur even when interrupts are disabled. Now the saved state of
interrupts is not saved in the thread until after interrupts have been
disabled and the nesting count has been increased. Similarly, the saved
state from the thread cannot be read once the nesting count has been
decreased to zero. To fix this, use temporary variables to store interrupt
state and shuffle it between the thread's MD area and the appropriate
registers.

In cooperation with: bde
MFC after: 1 month


214749 03-Nov-2010 nwhitehorn

Fix two mistakes on 32-bit systems. The slbmte code in syscall() is 64-bit
only, and should be protected with an ifdef, and the no-execute bit in
32-bit set_user_sr() should be set before the comparison, not after, or
it will never match.


214739 03-Nov-2010 nwhitehorn

Clean up the user segment handling code a little more. Now that
set_user_sr() itself caches the user segment VSID, there is no need for
cpu_switch() to do it again. This change also unifies the 32 and 64-bit
code paths for kernel faults on user pages and remaps the user SLB slot
on 64-bit systems when taking a syscall to avoid some unnecessary segment
exception traps.


214617 01-Nov-2010 alc

Implement pmap_is_prefaultable().

Reviewed by: nwhitehorn


214610 31-Oct-2010 nwhitehorn

Add a security nit to recent copyin/out changes: map the user segment
no-execute in case of exploitable kernel bugs.

MFC after: 1 week


214607 31-Oct-2010 nwhitehorn

Next-to-leading-order perturbation of synchronization operations for
switching the user segment register. All races should now be closed and
a minimum of pipelines flushes be required to close them.


214603 31-Oct-2010 nwhitehorn

Add a driver for the Apple Uninorth AGP host bridge found in all PowerPC
Macintoshes with an AGP bus.


214601 31-Oct-2010 nwhitehorn

Add some missing parentheses so that moea_bat_mapped() actually works.

Submitted by: alc
MFC after: 3 days


214575 30-Oct-2010 nwhitehorn

Allow access to the HT I/O port space on the IBM CPC9X5 northbridge chips.

MFC after: 2 weeks


214574 30-Oct-2010 nwhitehorn

Restructure the way the copyin/copyout segment is stored to prevent a
concurrency bug. Since all SLB/SR entries were invalidated during an
exception, a decrementer exception could cause the user segment to be
invalidated during a copyin()/copyout() without a thread switch that
would cause it to be restored from the PCB, potentially causing the
operation to continue on invalid memory. This is now handled by explicit
restoration of segment 12 from the PCB on 32-bit systems and a check in
the Data Segment Exception handler on 64-bit.

While here, cause copyin()/copyout() to check whether the requested
user segment is already installed, saving some pipeline flushes, and
fix the synchronization primitives around the mtsr and slbmte
instructions to prevent accessing stale segments.

MFC after: 2 weeks


214348 25-Oct-2010 nwhitehorn

Don't create spurious /dev entries.

Submitted by: andreast


213904 15-Oct-2010 andreast

Add three new drivers for fan control and temperature reading on the
PowerMac7,2.

- The fcu driver lets us read and write the fan RPMs for all fans in the
PowerMac7,2. This driver is PowerMac specific.
- The ds1775 is a driver to read the temperature for the drive bay sensor.
- The max6690 is another driver to read temperatures. Here it is used to
read the inlet, the backside and the U3 heatsink temperature.

An additional driver, the ad7417, will follow later.

Thanks to nwhitehorn for guiding me through this driver development.

Approved by: nwhitehorn (mentor)


213456 05-Oct-2010 nwhitehorn

Handle vector assist traps without a kernel panic, by setting denormalized
values to zero. A correct solution would involve emulating vector
operations on denormalized values, but this has little effect on accuracy
and is much less complicated for now.

MFC after: 2 weeks


213407 04-Oct-2010 nwhitehorn

Follow exactly the steps in architecture manual for correctly invalidating
TLB entries instead of trying to cut corners.


213383 03-Oct-2010 nwhitehorn

Add a memory-range interface to /dev/mem on PowerPC using PAT attributes.
Unlike actual MTRR, this only controls the mapping attributes for
subsequent mmap() of /dev/mem. Nonetheless, the support is sufficiently
MTRR-like that Xorg can use it, which translates into an enormous increase
in graphics performance on PowerPC.

MFC after: 2 weeks


213360 02-Oct-2010 nwhitehorn

Fix some KTR arguments that were breaking the LINT build.

Pointy hat to: me


213336 01-Oct-2010 nwhitehorn

Map the Open Firmware framebuffer console with write combining turned on,
and set memory attributes appropriately for mmap() calls on /dev/console.
Xorg no longer uses /dev/console to mmap the framebuffer, so framebuffer
write combining support in X will arrive in the next patch.


213335 01-Oct-2010 nwhitehorn

Fix pmap_page_set_memattr() behavior in the presence of fictitious pages
by just caching the mode for later use by pmap_enter(), following amd64.
While here, correct some mismerges from mmu_oea64 -> mmu_oea and clean
up some dead code found while fixing the fictitious page behavior.


213307 30-Sep-2010 nwhitehorn

Add support for memory attributes (pmap_mapdev_attr() and friends) on
PowerPC/AIM. This is currently stubbed out on Book-E, since I have no
idea how to implement it there.


213282 29-Sep-2010 neel

Fix bogus error message from bus_dmamem_alloc() about incorrect alignment.

The check for alignment should be made against the physical address and not
the virtual address that maps it.

Sponsored by: NetApp
Submitted by: Will McGovern (will at netapp dot com)
Reviewed by: mjacob, jhb


213180 26-Sep-2010 davidxu

Follow r213098, kernel POSIX semaphore module is no longer
needed.


213098 24-Sep-2010 davidxu

Now userland POSIX semaphore is based on umtx. The kernel module
is only used to support binary compatible, if want to run old
binary, you need to kldload the module.


212722 16-Sep-2010 nwhitehorn

Split the SLB mirror cache into two kinds of object, one for kernel maps
which are similar to the previous ones, and one for user maps, which
are arrays of pointers into the SLB tree. This changes makes user SLB
updates atomic, closing a window for memory corruption. While here,
rearrange the allocation functions to make context switches faster.


212715 16-Sep-2010 nwhitehorn

Replace the SLB backing store splay tree used on 64-bit PowerPC AIM
hardware with a lockless sparse tree design. This marginally improves
the performance of PMAP and allows copyin()/copyout() to run without
acquiring locks when used on wired mappings.

Submitted by: mdf


212687 15-Sep-2010 andreast

Increase register access delay to deal with the high-latency I2C
chipset found in some models of Powermac G5.

Approved by: nwhitehorn (mentor)


212627 15-Sep-2010 grehan

Introduce inheritance into the PowerPC MMU kobj interface.

include/mmuvar.h - Change the MMU_DEF macro to also create the class
definition as well as define the DATA_SET. Add a macro, MMU_DEF_INHERIT,
which has an extra parameter specifying the MMU class to inherit methods
from. Update the comments at the start of the header file to describe the
new macros.

booke/pmap.c
aim/mmu_oea.c
aim/mmu_oea64.c - Collapse mmu_def_t declaration into updated MMU_DEF macro

The MMU_DEF_INHERIT macro will be used in the PS3 MMU implementation to
allow it to inherit the stock powerpc64 MMU methods.

Reviewed by: nwhitehorn


212597 14-Sep-2010 grehan

Resurrect PSIM support by moving the cacheline size-detection warning
printf outside of the MMU-disabled region. A call into OpenFirmware
with the MMU off resulted in an internal PSIM assert.


212586 13-Sep-2010 nwhitehorn

Fix a missing set of parantheses that could cause recent versions of libthr
to crash deferencing a NULL pointer to the user context on powerpc64
systems with COMPAT_FREEBSD32 defined.


212559 13-Sep-2010 nwhitehorn

Fix a subtle bug uncovered by the recent one-shot timer import in which
any spin locks acquired between the enabling of interrupts in
machdep_ap_bootstrap() and the invocation of the scheduler would fail to
have interrupts disabled due to the fake spinlock already held by the
idle thread. sched_throw(NULL) will enable interrupts by itself when
exiting this spinlock, so just let it do that and don't enable interrupts
here.


212556 13-Sep-2010 mav

Change call order to enable interrupts only after timer being programmed.

Submitted by: nwhitehorn


212541 13-Sep-2010 mav

Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.


212483 11-Sep-2010 nwhitehorn

ATAPI DMA does not seem to work completely reliably on Shasta controllers,
especially in conjunction with ATA_CAM, so disable it for now.


212477 11-Sep-2010 marius

Change OF_interpret() to also take an array of cell_t (missed in r209801).

Reviewed by: nwhitehorn


212460 11-Sep-2010 mav

Fix the build after r212453. IPI_STATCLOCK declaration is still needed
for build, though not really used.

Submitted by: andreast


212453 11-Sep-2010 mav

Update PowerPC event timer code to use new event timers infrastructure.

Reviewed by: nwitehorn
Tested by: andreast
H/W donated by: Gheorghe Ardelean


212413 10-Sep-2010 avg

bus_add_child: change type of order parameter to u_int

This reflects actual type used to store and compare child device orders.
Change is mostly done via a Coccinelle (soon to be devel/coccinelle)
semantic patch.
Verified by LINT+modules kernel builds.

Followup to: r212213
MFC after: 10 days


212363 09-Sep-2010 nwhitehorn

Reorder statistics tracking and table lock acquisitions already in place
to avoid race conditions updating the PVO statistics.


212331 08-Sep-2010 nwhitehorn

Fix a printf specifier on 64-bit systems.


212322 08-Sep-2010 nwhitehorn

Fix a typo in the original import of this code from NetBSD that caused the
wrong element of the VSID bitmap array to be examined after a collision,
leading to reallocation of in-use VSIDs under some circumstances, with
attendant memory corruption. Also add an assert to check for this kind of
problem in the future.

MFC after: 4 days


212308 07-Sep-2010 nwhitehorn

Fix an error made in r209975 related to context ID allocation for 64-bit
PowerPC CPUs running a 32-bit kernel. This bug could cause in-use VSIDs
to be allocated again to another process, causing memory space overlaps
and corruption.

Reported by: linimon


212278 06-Sep-2010 nwhitehorn

Fix the same race condition on 32-bit AIM CPUs that was fixed for 64-bit
ones in r211967 involving VSID allocation.


212239 05-Sep-2010 mav

Make nexus report name and compat fields as pnpinfo for devices on the
first level of hierarchy, same as done on deeper levels.


212170 03-Sep-2010 grehan

- Bump MAXCPU to 4. Tested on a quad G5 with both 32 and 64-bit kernels.
A make buildkernel -j4 uses ~360% CPU.
- Bracket the AP spinup printf with a mutex to avoid garbled output.
- Enable SMP by default on powerpc64.

Reviewed by: nwhitehorn


212054 31-Aug-2010 nwhitehorn

Restructure how reset and poweroff are handled on PowerPC systems, since
the existing code was very platform specific, and broken for SMP systems
trying to reboot from KDB.

- Add a new PLATFORM_RESET() method to the platform KOBJ interface, and
migrate existing reset functions into platform modules.
- Modify the OF_reboot() routine to submit the request by hand to avoid
the IPIs involved in the regular openfirmware() routine. This fixes
reboot from KDB on SMP machines.
- Move non-KDB reset and poweroff functions on the Powermac platform
into the relevant power control drivers (cuda, pmu, smu), instead of
using them through the Open Firmware backdoor.
- Rename platform_chrp to platform_powermac since it has become
increasingly Powermac specific. When we gain support for IBM systems,
we will grow a new platform_chrp.


212053 31-Aug-2010 nwhitehorn

Remove some code made obsolete by the powerpc64 import.


212044 31-Aug-2010 nwhitehorn

Missed one place the SLB lock should be held in r211967.


211967 29-Aug-2010 nwhitehorn

Avoid a race in the allocation of new segment IDs that could result in
memory corruption on heavily loaded SMP systems.

MFC after: 2 weeks


211861 27-Aug-2010 nwhitehorn

pmap_mapdev() does not appear to actually need GIANT to be held here,
and asserting that is held breaks drm.

MFC after: 2 weeks


211515 19-Aug-2010 jhb

Remove unused KTRACE includes.


211483 19-Aug-2010 nwhitehorn

Unbreak the LINT kernel on powerpc64. Note that the LINT kernel
configuration is TARGET_ARCH specific and must be generated with
TARGET_ARCH set.

Reviewed by: imp


211412 17-Aug-2010 kib

Supply some useful information to the started image using ELF aux vectors.
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.

Tested by: marius (sparc64)
MFC after: 1 month


211197 11-Aug-2010 jhb

Update various places that store or manipulate CPU masks to use cpumask_t
instead of int or u_int. Since cpumask_t is currently u_int on all
platforms this should just be a cosmetic change.


210939 06-Aug-2010 jhb

Add a new ipi_cpu() function to the MI IPI API that can be used to send an
IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that
constructed a mask for a single CPU with calls to ipi_cpu() instead. This
will matter more in the future when we transition from cpumask_t to
cpuset_t for CPU masks in which case building a CPU mask is more expensive.

Submitted by: peter, sbruno
Reviewed by: rookie
Obtained from: Yahoo! (x86)
MFC after: 1 month


210704 31-Jul-2010 nwhitehorn

Improve hash coverage for kernel page table entries by modifying the kernel
ESID -> VSID map function. This makes ZFS run stably on PowerPC under
heavy loads (repeated simultaneous SVN checkouts and updates).


210677 31-Jul-2010 nwhitehorn

Add support for the IBM Full-System Simulator (Mambo). This code has been
developed against the 970 and Cell simulators.


210663 30-Jul-2010 mdf

Add MALLOC_DEBUG_MAXZONES=8 to powerpc64 GENERIC configuration file.

Requested by: nwhitehorn
Approved by: zml (mentor)


210564 28-Jul-2010 mdf

Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple uma
zones for each malloc bucket size. The purpose is to isolate
different malloc types into hash classes, so that any buffer overruns
or use-after-free will usually only affect memory from malloc types in
that hash class. This is purely a debugging tool; by varying the hash
function and tracking which hash class was corrupted, the intersection
of the hash classes from each instance will point to a single malloc
type that is being misused. At this point inspection or memguard(9)
can be used to catch the offending code.

Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files.
The suggestion to have this on by default came from Kostik Belousov on
-arch.

This code is based on work by Ron Steinke at Isilon Systems.

Reviewed by: -arch (mostly silence)
Reviewed by: zml
Approved by: zml (mentor)


210550 27-Jul-2010 jhb

Very rough first cut at NUMA support for the physical page allocator. For
now it uses a very dumb first-touch allocation policy. This will change in
the future.
- Each architecture indicates the maximum number of supported memory domains
via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain
a CPU belongs to. Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist
(VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain.
The MD code is required to populate an array of mem_affinity structures.
Each entry in the array defines a range of memory (start and end) and a
domain for the range. Multiple entries may be present for a single
domain. The list is terminated by an entry where all fields are zero.
This array of structures is used to split up phys_avail[] regions that
fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is
used when fulfulling a physical memory allocation. Right now the
per-domain freelists are listed in a round-robin order for each domain.
In the future a table such as the ACPI SLIT table may be used to order
the per-domain lookup lists based on the penalty for each memory domain
relative to a specific domain. The lookup lists may be examined via a
new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to
pick a lookup list when allocating memory.

Reviewed by: alc


210369 22-Jul-2010 kib

When compat32 binary asks for the value of hw.machine_arch, report the
name of 32bit sibling architecture instead of the host one. Do the
same for hw.machine on amd64.

Add a safety belt debug.adaptive_machine_arch sysctl, to turn the
substitution off.

Reviewed by: jhb, nwhitehorn
MFC after: 2 weeks


210247 19-Jul-2010 raj

Eliminate FDT_IMMR_VA define.

This removes platform dependencies from <machine>/fdt.h for the benfit of
portability.


210033 13-Jul-2010 nwhitehorn

Remove obsolete code that sets SHMMAXPGS to a tiny value by default
on PowerPC.


210025 13-Jul-2010 nwhitehorn

Add GENERIC kernel config for powerpc64.


209975 13-Jul-2010 nwhitehorn

MFppc64:

Kernel sources for 64-bit PowerPC, along with build-system changes to keep
32-bit kernels compiling (build system changes for 64-bit kernels are
coming later). Existing 32-bit PowerPC kernel configurations must be
updated after this change to specify their architecture.


209958 12-Jul-2010 grehan

Fix printf specifier to allow 32/64 bit builds.

Obtained from: projects/ppc64


209950 12-Jul-2010 nwhitehorn

Unify ABI-related bits of the Book-E and AIM machdep routines
(exec_setregs, etc.) in order to simplify the addition of 64-bit support,
and possible future extension of the Book-E code to handle hard floating
point and Altivec.

MFC after: 1 month


209945 12-Jul-2010 nwhitehorn

MFppc64:

Provide ELF definitions for 64-bit PowerPC. This unbreaks the powerpc
loader build.


209908 11-Jul-2010 raj

Convert Freescale PowerPC platforms to FDT convention.

The following systems are affected:

- MPC8555CDS
- MPC8572DS

This overhaul covers the following major changes:

- All integrated peripherals drivers for Freescale MPC85XX SoC, which are
currently in the FreeBSD source tree are reworked and adjusted so they
derive config data out of the device tree blob (instead of hard coded /
tabelarized values).

- This includes: LBC, PCI / PCI-Express, I2C, DS1553, OpenPIC, TSEC, SEC,
QUICC, UART, CFI.

- Thanks to the common FDT infrastrucutre (fdtbus, simplebus) we retire
ocpbus(4) driver, which was based on hard-coded config data.

Note that world for these platforms has to be built WITH_FDT.

Reviewed by: imp
Sponsored by: The FreeBSD Foundation


209853 09-Jul-2010 nwhitehorn

The number after 2 is 3, not 4.

MFC after: 3 days


209852 09-Jul-2010 nwhitehorn

Remove an unnecessary include of opt_psim.h, which is not present on
powerpc64.


209851 09-Jul-2010 nwhitehorn

MFppc64:

Minor 64-bit-cleanliness upgrades and support for platform detection on
subtly-broken OF implementations like in the Mambo simulator.


209850 09-Jul-2010 nwhitehorn

MFppc64:

Use longs instead of ints as the native word type in bcopy(). This will
expand nicely on 64-bit systems.


209849 09-Jul-2010 nwhitehorn

MFppc64:

Check if devices are direct-mapped individually instead of just checking
the value of hw_direct_map.


209812 08-Jul-2010 nwhitehorn

Replace the existing PowerPC busdma implementation with the one from
amd64 (with slight modifications). This provides support for bounce
buffers, which are required on systems with RAM above 4 GB.


209804 08-Jul-2010 nwhitehorn

Make ofw_syscons work on 64-bit systems.


209803 08-Jul-2010 nwhitehorn

Fix several bugs in the real-mode Open Firmware implementation and provide
a virtual-mode version for use on 64-bit systems, which have 32-bit
firmware implementations and require similar constraints on addressing
to the real-mode implementation.


209801 08-Jul-2010 nwhitehorn

Change the argument type to OF_call_method to take an array of cell_t
instead of unsigned longs to prepare for platforms where they are not
the same.


209726 06-Jul-2010 nwhitehorn

It does not actually make sense to provide an IPI facility on non-root
PICs, so replace cpuid logic with an assert.


209725 06-Jul-2010 nwhitehorn

Fix interrupt distribution to multiple CPUs on systems with cascaded PICs.
Because slave PICs send all interrupts to their CPU 0 output line (which
is routed to a pin on the master PIC), changes to per-CPU register banks
like EOI on the slave PIC must be accessed for CPU 0, instead of the
CPU actually processing the interrupt.

Submitted by: Andreas Tobler


209724 06-Jul-2010 nwhitehorn

Move the EOI logic when starting ithreads into intr_machdep instead of
relying on it as a side effect of PIC_MASK() in the PIC drivers, and add
an inmplementation of assign_cpu() for the kernel interrupt layer.


209670 03-Jul-2010 nwhitehorn

Add a missing conditional. We should not bind the PIC interrupt unless
the interrupt's PIC (a) exists and (b) is the root PIC.

Reported by: Andreas Tobler


209639 02-Jul-2010 marcel

Remove the unneeded header <machine/intr.h>.


209621 01-Jul-2010 marcel

MFia64:
When compiling with profiling, we define PROF for userspace and GPROF
for the kernel.


209613 30-Jun-2010 jhb

Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
<sys/syscallsubr.h> where all other kern_<syscall> prototypes live.


209591 29-Jun-2010 marcel

Fix profiling (part 1):
o Functions are 4-byte aligned for Book-E.
o We get compiled with -DPROF and not -DGPROF if profiling
is enabled.


209496 24-Jun-2010 marcel

Assign PCI intline values for ISA interrupts using the new INTR_VEC()
macro.


209495 24-Jun-2010 marcel

Remove debugging printf() -- that is, I assume it was for debugging :-)


209493 24-Jun-2010 marcel

Pass the device_t of the AT PIC driver to atpic_intr() so that
we don't have to use a global variable. Pass a NULL frame pointer
to the dispatch function just like openpic(4).


209489 23-Jun-2010 marcel

With openpic(4) using active-low as the default polarity, reconfigure
the internal interrupt sources as active-high. The internal interrupt
sources are disabled when programmed as active-low.

Note that the internal interrupts have no sense bit like the external
interrupts. We program them as edge-triggered to make sure we write a
0 value to a reserved register. It does not in any way say anything
about the sense of internal interrupt.


209486 23-Jun-2010 nwhitehorn

Configure interrupts on SMP systems to be distributed among all online
CPUs by default, and provide a functional version of BUS_BIND_INTR().
While here, fix some potential concurrency problems in the interrupt
handling code.


209485 23-Jun-2010 marcel

In the attach method, refactor to take into account that
BUS_GET_RESOURCE_LIST() can return a NULL pointer -- and
will for MPC85xx kernels.


209369 20-Jun-2010 nwhitehorn

Temporarily disable instruction relocation while setting up the kernel's
IBAT entry in early boot in order to prevent possible faults from races
between the instruction cache and the MMU.

PR: powerpc/148003
MFC after: 3 days


209316 18-Jun-2010 nwhitehorn

Missed commit in r209310: the IRQ number in INTR_VEC() should have
parantheses around it to allow arithmetic expressions to be passed.

Submitted by: Andreas Tobler


209310 18-Jun-2010 nwhitehorn

Add MSI support for PCI devices attached to the CPC925 and CPC945 bridges
found in Apple and IBM G5 systems.


209302 18-Jun-2010 nwhitehorn

Add support for the Keywest I2C controller in Apple uninorth northbridges.
Although the Keywest registers have only 1 byte of content, they are
secretly 4-byte registers, which became apparent from them moving on the
big-endian Uninorth version of the controller.


209299 18-Jun-2010 nwhitehorn

Change the default interrupt polarity on PowerPC systems from high to low.
On Apple systems at least, all the level interrupts are wired active low.
Before this change, our PIC programming only worked because Apple hardware
ignores the interrupt polarity bit on all interrupts except IRQ 0.


209298 18-Jun-2010 nwhitehorn

Provide for multiple, cascaded PICs on PowerPC systems, and extend the
OFW interrupt map interface to also return the device's interrupt parent.

MFC after: 8.1-RELEASE


209222 15-Jun-2010 nwhitehorn

Modify the console mouse pointer drawing routine to use single-byte writes
instead of 4-byte ones. Because the mouse pointer can start part way
through a character cell, 4-byte memory operations are not necessarily
aligned, triggering a fatal alignment exception when the console pointer
was moved on PowerPC G5 systems.

MFC after: 3 days


209114 12-Jun-2010 nwhitehorn

Make SMP work on MPC7400-based Apple desktops like the PowerMac3,3.


209048 11-Jun-2010 alc

Relax one of the new assertions in pmap_enter() a little. Specifically,
allow pmap_enter() to be performed on an unmanaged page that doesn't have
VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages.
(See, for example, pmap_remove_write().)


208990 10-Jun-2010 alc

Reduce the scope of the page queues lock and the number of
PG_REFERENCED changes in vm_pageout_object_deactivate_pages().
Simplify this function's inner loop using TAILQ_FOREACH(), and shorten
some of its overly long lines. Update a stale comment.

Assert that PG_REFERENCED may be cleared only if the object containing
the page is locked. Add a comment documenting this.

Assert that a caller to vm_page_requeue() holds the page queues lock,
and assert that the page is on a page queue.

Push down the page queues lock into pmap_ts_referenced() and
pmap_page_exists_quick(). (As of now, there are no longer any pmap
functions that expect to be called with the page queues lock held.)

Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever
be passed an unmanaged page. Assert this rather than returning "0"
and "FALSE" respectively.

ARM:

Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().

Push down the page queues lock inside of pmap_clearbit(), simplifying
pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write().
Additionally, this allows for avoiding the acquisition of the page
queues lock in some cases.

PowerPC/AIM:

moea*_page_exits_quick() and moea*_page_wired_mappings() will never be
called before pmap initialization is complete. Therefore, the check
for moea_initialized can be eliminated.

Push down the page queues lock inside of moea*_clear_bit(),
simplifying moea*_clear_modify() and moea*_clear_reference().

The last parameter to moea*_clear_bit() is never used. Eliminate it.

PowerPC/BookE:

Simplify mmu_booke_page_exists_quick()'s control flow.

Reviewed by: kib@


208871 06-Jun-2010 nwhitehorn

Add Open Firmware PNP info strings to GPIOs and Uninorth cells.

Submitted by: Andreas Tobler


208847 05-Jun-2010 nwhitehorn

Correct a harmless typo introduced when copying code from mmu_oea64.

Submitted by: alc
MFC after: 8.1-RELEASE


208846 05-Jun-2010 alc

Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.

Correct a typo in a nearby comment on sparc64.


208842 05-Jun-2010 nwhitehorn

Add a driver for the CPU temperature sensors attached over I2C on the
PowerMac 11,2.


208841 05-Jun-2010 nwhitehorn

Add support for the I2C busses hanging off Apple system management chips.


208840 05-Jun-2010 nwhitehorn

Utilize the Keywest I2C combined mode for messages with repeated starts.


208835 05-Jun-2010 nwhitehorn

Make sure that interrupt sense settings set after interrupts are enabled
are respected. This fixes loading the Apple onboard audio driver
(snd_ai2s) as a module after boot, which would previously cause a panic.

PR: powerpc/146888
MFC after: 5 days


208810 05-Jun-2010 alc

Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.


208720 01-Jun-2010 alc

In the case that mmu_booke_enter_locked() is changing the attributes of a
mapping but not changing the physical page being mapped, the wrong flags
were being inspected in order to determine whether or not to flush the
instruction cache. The effect of looking at the wrong flags was that the
instruction cache was never being flushed.

Reviewed by: marcel


208614 28-May-2010 raj

Prepare and extend OFW layer for FDT support.

o Let OFW_INIT() and OF_init() return status value.

o Provide helper routines for 'compatible' property handling.

o Only compile OF and OFW code, which is relevant in FDT scenario.

o Other minor cosmetics

Reviewed by: imp
Sponsored by: The FreeBSD Foundation


208574 26-May-2010 alc

Push down page queues lock acquisition in pmap_enter_object() and
pmap_is_referenced(). Eliminate the corresponding page queues lock
acquisitions from vm_map_pmap_enter() and mincore(), respectively. In
mincore(), this allows some additional cases to complete without ever
acquiring the page queues lock.

Assert that the page is managed in pmap_is_referenced().

On powerpc/aim, push down the page queues lock acquisition from
moea*_is_modified() and moea*_is_referenced() into moea*_query_bit().
Again, this will allow some additional cases to complete without ever
acquiring the page queues lock.

Reorder a few statements in vm_page_dontneed() so that a race can't lead
to an old reference persisting. This scenario is described in detail by a
comment.

Correct a spelling error in vm_page_dontneed().

Assert that the object is locked in vm_page_clear_dirty(), and restrict the
page queues lock assertion to just those cases in which the page is
currently writeable.

Add object locking to vnode_pager_generic_putpages(). This was the one
and only place where vm_page_clear_dirty() was being called without the
object being locked.

Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call
to vm_page_clear_dirty().

Change vnode_pager_generic_putpages() to the modern-style of function
definition. Also, change the name of one of the parameters to follow
virtual memory system naming conventions.

Reviewed by: kib


208538 25-May-2010 raj

Initial loader(8) support for Flattened Device Tree.

o This is disabled by default for now, and can be enabled using WITH_FDT at
build time.

o Tested with ARM and PowerPC.

Reviewed by: imp
Sponsored by: The FreeBSD Foundation


208504 24-May-2010 alc

Roughly half of a typical pmap_mincore() implementation is machine-
independent code. Move this code into mincore(), and eliminate the
page queues lock from pmap_mincore().

Push down the page queues lock into pmap_clear_modify(),
pmap_clear_reference(), and pmap_is_modified(). Assert that these
functions are never passed an unmanaged page.

Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m:
Contrary to what the comment says, pmap_mincore() is not simply an
optimization. Without a complete pmap_mincore() implementation,
mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED
because only the pmap can provide this information.

Eliminate the page queues lock from vfs_setdirty_locked_object(),
vm_pageout_clean(), vm_object_page_collect_flush(), and
vm_object_page_clean(). Generally speaking, these are all accesses
to the page's dirty field, which are synchronized by the containing
vm object's lock.

Reduce the scope of the page queues lock in vm_object_madvise() and
vm_page_dontneed().

Reviewed by: kib (an earlier version)


208453 23-May-2010 kib

Reorganize syscall entry and leave handling.

Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
usermode into struct syscall_args. The structure is machine-depended
(this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
from the syscall. It is a generalization of
cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively. The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by: jhb, marcel, marius, nwhitehorn, stas
Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
stas (mips)
MFC after: 1 month


208405 21-May-2010 nwhitehorn

Now that single-threaded access to firmware is enforced by
IPI_RENDEZVOUS, the ofw mutex is irrelevant.


208364 20-May-2010 nwhitehorn

Fix a long-standing bug in the PowerPC OFW call function on SMP machines
where running ofwdump could cause hangs by forcing all secondary CPUs
into a busy wait with interrupts off during the call.

Following section 8.4 of the Open Firmware PowerPC processor binding,
the firmware is free to overwrite the system interrupt handlers during
OF calls, restoring the OS handlers on exit. On single CPU systems, this
process is invisible to the operating system. On multiple CPU systems,
taking any exception on a secondary CPU while an OF call is in progress
ends with that exception vectored into OF, resulting in a slow movement
of the entire system into firmware context and a machine hang.

MFC after: 3 days


208285 19-May-2010 nwhitehorn

Correct a typo.

Pointy hat to: me


208278 18-May-2010 raj

Provide missing members for Book-E pmap (and fix build).


208175 16-May-2010 alc

On entry to pmap_enter(), assert that the page is busy. While I'm
here, make the style of assertion used by pmap_enter() consistent
across all architectures.

On entry to pmap_remove_write(), assert that the page is neither
unmanaged nor fictitious, since we cannot remove write access to
either kind of page.

With the push down of the page queues lock, pmap_remove_write() cannot
condition its behavior on the state of the PG_WRITEABLE flag if the
page is busy. Assert that the object containing the page is locked.
This allows us to know that the page will neither become busy nor will
PG_WRITEABLE be set on it while pmap_remove_write() is running.

Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly
do copy-on-write-based zero-copy transmit on unmanaged or fictitious
pages, so don't even try. Previously, the call to pmap_remove_write()
would have failed silently.


208172 16-May-2010 nwhitehorn

Pull OF_quiesce() out of the MI Open Firmware layer and entirely into
PPC ofw_machdep.c, in recognition of its state as a machine specific hack.

Requested by: marius


208168 16-May-2010 nwhitehorn

It is not necessary (and in some cases harmful) to hardcode ata_kauai's
IRQ to 39 on K2 devices, as well as Shasta ones.

Reported by: Andreas Tobler


208167 16-May-2010 nwhitehorn

Enable smu(4) to report fan speeds on late-model Powermac G5s.


208162 16-May-2010 nwhitehorn

Relocate interrupt sense setting for K2 SATA from the ATA driver to the
OFW PCI layer and read the sense directly from the device tree instead
of guessing.

MFC after: 1 week


208152 16-May-2010 nwhitehorn

On PowerMac11,2 and (presumably) PowerMac12,1, we need to quiesce the
firmware in order to take over control of the SMU. Without doing this,
the firmware background process doing fan control will run amok as we
take over the system and crash the management chip.

This is limited to these two machines because our kernel is heavily
dependent on firmware accesses, and so quiescing firmware can cause
nasty problems.


208150 16-May-2010 nwhitehorn

On SMP G5 systems, sometimes the power-mode-data property is only found
on CPU 0, so look there if it is not otherwise available.


208149 16-May-2010 nwhitehorn

Add support for the U4 PCI-Express bridge chipset used in late-generation
Powermac G5 systems. MSI and several other things are not presently
supported.

The U3/U4 internal device support portions of this change were contributed
by Andreas Tobler.

MFC after: 1 week


207796 08-May-2010 alc

Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), and
vm_page_try_to_free(). Consequently, push down the page queues lock into
pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and
pmap_remove_write().

Push down the page queues lock into Xen's pmap_page_is_mapped(). (I
overlooked the Xen pmap in r207702.)

Switch to a per-processor counter for the total number of pages cached.


207437 30-Apr-2010 alc

MFamd64/i386 r207205
Clearing a page table entry's accessed bit and setting the page's
PG_REFERENCED flag in pmap_protect() can't really be justified, so
don't do it.

Additionally, two changes that make this pmap behave like the others do:

Change pmap_protect() such that it calls vm_page_dirty() only if the
page is managed.

Change pmap_remove_write() such that it doesn't clear a page table
entry's accessed bit.


207410 30-Apr-2010 kmacy

On Alan's advice, rather than do a wholesale conversion on a single
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.

Supported by: Bitgravity Inc.

Discussed with: alc, jeffr, and kib


207269 27-Apr-2010 kib

Style: use #define<TAB> instead of #define<SPACE>.

Noted by: bde, pluknet gmail com
MFC after: 11 days


207155 24-Apr-2010 alc

Resurrect pmap_is_referenced() and use it in mincore(). Essentially,
pmap_ts_referenced() is not always appropriate for checking whether or
not pages have been referenced because it clears any reference bits
that it encounters. For example, in mincore(), clearing the reference
bits has two negative consequences. First, it throws off the activity
count calculations performed by the page daemon. Specifically, a page
on which mincore() has called pmap_ts_referenced() looks less active
to the page daemon than it should. Consequently, the page could be
deactivated prematurely by the page daemon. Arguably, this problem
could be fixed by having mincore() duplicate the activity count
calculation on the page. However, there is a second problem for which
that is not a solution. In order to clear a reference on a 4KB page,
it may be necessary to demote a 2/4MB page mapping. Thus, a mincore()
by one process can have the side effect of demoting a superpage
mapping within another process!


207152 24-Apr-2010 kib

Move the constants specifying the size of struct kinfo_proc into
machine-specific header files. Add KINFO_PROC32_SIZE for struct
kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add
CTASSERT for the size of struct kinfo_proc32.

Submitted by: pluknet
Reviewed by: imp, jhb, nwhitehorn
MFC after: 2 weeks


207077 22-Apr-2010 thompsa

Change USB_DEBUG to #ifdef and allow it to be turned off. Previously this had
the illusion of a tunable setting but was always turned on regardless.

MFC after: 1 week


206116 02-Apr-2010 marius

With r205496 in place we should ensure that nargs and nreturns are always
set to sane values as they no longer default to 0, otherwise some OFW
implementation might copy in or out arguments not based on what the actual
function takes but what ever stack garbage nargs and nreturns supply.

Reviewed by: nwhitehorn


205795 28-Mar-2010 nwhitehorn

Set hw.ofwfb.relax_mmap=1 by default. While these checks may be a good
idea in principle, X does not work without them on basically any hardware,
and this is probably the most frequent problem people run into on PowerPC.

Prodded by: rnoland
MFC after: 1 week


205642 25-Mar-2010 nwhitehorn

Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by: jhb


205569 23-Mar-2010 marcel

Fix an off-by-one bug for the number of slots on a PCI/PCI-X bus.
We failed to setup PCI devices on slot 31 and that's where the
SATA controller is for the P2020 eval board.


205535 23-Mar-2010 marcel

Add definitions for a 4th PCI host controller. No Freescale processor
has all 4 implemented, but across the processors we now support all the
combinations. For example, the MPC8533 doesn't have a PCI controller
at 0xA0000, but does at 0xB0000.


205527 23-Mar-2010 marcel

Enable power management for E500 cores. Use "doze" for now to make
sure the caches remain coherent. For single-core configurations and
with busdma changes we could eventually switch to "nap" and force
a D-cache invalidation as part of the DMA completion. To this end,
clear PSL_WE until after we handled the decrementer or external
interrupt as it tells us whether we just woke up or not.


205506 23-Mar-2010 nwhitehorn

Get nexus(4) out of the RTC business. The interface used by nexus(4)
in Open Firmware was Apple-specific, and we have complete coverage of Apple
system controllers, so move RTC responsibilities into the system controller
drivers. This avoids interesting problems from manipulating these devices
through Open Firmware behind the backs of their drivers.

Obtained from: NetBSD
MFC after: 2 weeks


205497 23-Mar-2010 nwhitehorn

Open Firmware on powerpc is generally non-reetrant, so serialize all
OF calls with a mutex.


205496 23-Mar-2010 nwhitehorn

Do not declare the various OFW command buffers static. It does not
appear to be necessary on either sparc64 or powerpc, and is a
concurrency nightmare.

Reviewed by: marius


205495 23-Mar-2010 marcel

Actually pass a pointer to the trapframe to powerpc_extr_interrupt().


205370 20-Mar-2010 nwhitehorn

Revisit locking in the 64-bit AIM PMAP. The PVO head for a page is
generally protected by the VM page queue mutex. Instead of extending the
table lock to cover the PVO heads, add some asserts that the page queue
mutex is in fact held. This fixes several LORs and possible deadlocks.

This also adds an optimization to moea64_kextract() useful for
direct-mapped quantities, like UMA buffers. Being able to use this from
inside UMA removes an additional LOR.


205356 20-Mar-2010 nwhitehorn

Let unin(4) attach to U3 controllers found on G5 machines.

Submitted by: Andreas Tobler


205163 15-Mar-2010 nwhitehorn

Fix two small bugs. The PowerPC 970 does not support non-coherent memory
access, and reflects this by autonomously writing LPTE_M into PTE entries.
As such, we should not panic if LPTE_M changes by itself. While here,
fix a harmless typo in moea64_sync_icache().


205116 13-Mar-2010 ed

Remove COMPAT_43TTY from stock kernel configuration files.

COMPAT_43TTY enables the sgtty interface. Even though its exposure has
only been removed in FreeBSD 8.0, it wasn't used by anything in the base
system in FreeBSD 5.x (possibly even 4.x?). On those releases, if your
ports/packages are less than two years old, they will prefer termios
over sgtty.


204903 09-Mar-2010 nwhitehorn

Place interrupt handling in a critical section and remove double
counting in incrementing the interrupt nesting level. This fixes a number
of bugs in which the interrupt thread could be preempted by an IPI,
indefinitely delaying acknowledgement of the interrupt to the PIC, causing
interrupt starvation and hangs.

Reported by: linimon
Reviewed by: marcel, jhb
MFC after: 1 week


204719 04-Mar-2010 nwhitehorn

Fix an obvious lock escape and fix a typo in a comment.


204694 04-Mar-2010 nwhitehorn

Patch some more concurrency issues here. This expands the page table
lock to cover the PVOs, and removes the scratchpage PTEs from the PVOs
entirely to avoid the system trying to be helpful and rewriting them.


204692 04-Mar-2010 nwhitehorn

Rework smu(4) to be asynchronous. It turns out that the combination of
the automatic fan management and the polling in smu_run_cmd() was
putting my system interrupt load at 20%. This change reduces that to
0.4%.


204646 03-Mar-2010 joel

The NetBSD Foundation has granted permission to remove clause 3 and 4 from
the software.

Obtained from: NetBSD


204640 03-Mar-2010 joel

The NetBSD Foundation has granted permission to remove clause 3 and 4 from
their software.

Obtained from: NetBSD


204312 25-Feb-2010 nwhitehorn

Fix another bug involving /dev/mem and the OEA64 scratchpage. When
the scratchpage is updated, the PVO's physical address is updated as well.
This makes pmap_extract() begin returning non-zero values again, causing
the panic partially fixed in r204297. Fix this by excluding addresses
beyond virtual_end from consideration as KVA addresses, instead of allowing
addresses up to VM_MAX_KERNEL_ADDRESS.


204297 25-Feb-2010 nwhitehorn

Move the OEA64 scratchpage to the end of KVA from the beginning, and set
its PVO to map physical address 0 instead of kernelstart. This fixes a
situation in which a user process could attempt to return this address
via KVM, have it fault while being modified, and then panic the kernel
because (a) it is supposed to map a valid address and (b) it lies in the
no-fault region between VM_MIN_KERNEL_ADDRESS and virtual_avail.

While here, move msgbuf and dpcpu make into regular KVA space for
consistency with other implementations.


204296 25-Feb-2010 nwhitehorn

Provide an implementation of pmap_dev_direct_mapped() on OEA64. This is
required in order to be able to mmap the running kernel, which is turn
required to avoid fstat returning gibberish.


204270 24-Feb-2010 nwhitehorn

Add the ability to set SMU-based machines to restart automatically after
power loss.


204269 24-Feb-2010 nwhitehorn

Use dcbz instead of word stores for page zeroing, providing a factor of
3-4 speedup.


204268 24-Feb-2010 nwhitehorn

Close a race involving the OEA64 scratchpage. When the scratch page's
physical address is changed, there is a brief window during which its PTE
is invalid. Since moea64_set_scratchpage_pa() does not and cannot hold
the page table lock, it was possible for another CPU to insert a new PTE
into the scratch page's PTEG slot during this interval, corrupting both
mappings.

Solve this by creating a new flag, LPTE_LOCKED, such that
moea64_pte_insert will avoid claiming locked PTEG slots even if they
are invalid. This change also incorporates some additional paranoia
added to solve things I thought might be this bug.

Reported by: linimon


204218 22-Feb-2010 nwhitehorn

Provide a new useless feature: an led(4) interface for the system's sleep
LED.


204197 22-Feb-2010 nwhitehorn

Allow user programs to execute mfpvr instructions. Linux allows this, and
some math-related software like GMP expects to be able to use it to pick
a target appropriately.

MFC after: 1 week


204180 21-Feb-2010 nwhitehorn

Add a simple fan management callout to the SMU driver. This is designed
such that a fancier thermal management algorithm can be run from user
space, but the kernel will at least ensure your machine does not either
sound like a wind tunnel or catch fire.


204179 21-Feb-2010 nwhitehorn

Fix several mistakes in this file, in order to allow individual fan speeds
to be read and set correctly.


204128 20-Feb-2010 nwhitehorn

Reduce KVA pressure on OEA64 systems running in bridge mode by mapping
UMA segments at their physical addresses instead of into KVA. This emulates
the direct mapping behavior of OEA32 in an ad-hoc way. To make this work
properly required sharing the entire kernel PMAP with Open Firmware, so
ofw_pmap is transformed into a stub on 64-bit CPUs.

Also implement some more tweaks to get more mileage out of our limited
amount of KVA, principally by extending KVA into segment 16 until the
beginning of the first OFW mapping.

Reported by: linimon


204127 20-Feb-2010 nwhitehorn

Turn on experimental support for DEEPNAP on the 970MP.


204126 20-Feb-2010 nwhitehorn

Merge r198724 to Book-E. casuword() non-atomically read the current value
of its argument before atomically replacing it, which could occasionally
return the wrong value on an SMP system. This resulted in user mutex
operations hanging when using threaded applications.


204082 19-Feb-2010 nwhitehorn

Allow the SMU driver to read a variety of hardware sensors (possible
questions on the thermal calibration), and to read and set fan RPMs from
software. While here, fix a number of bugs.

Calibration code from: OpenBSD
MFC after: 2 weeks


204042 18-Feb-2010 nwhitehorn

Fix a bug where pages being removed from memory entirely no longer have
PVOs, and so the modified state of the page can no longer be communicated
to the VM layer, causing pages not to be flushed to swap when needed, in
turn causing memory corruption. Also make several correctness adjustments
to I-Cache synchronization and TLB invalidation for 64-bit Book-S CPUs.

Obtained from: projects/ppc64
Discussed with: grehan
MFC after: 2 weeks


203938 15-Feb-2010 attilio

Adjust style (following the already existing rules) for the newly
introduced option DEADLKRES.

Reported by: danfe, julian, avg


203924 15-Feb-2010 raj

Call the proper linkup routine in PowerPC Book-E machdep.

Submitted by: attilio
MFC after: 1 week


203758 10-Feb-2010 attilio

Add the options DEADLKRES (introducing the deadlock resolver thread) in
the 'debugging' section of any HEAD kernel and enable for the mainstream
ones, excluding the embedded architectures.
It may, of course, enabled on a case-by-case basis.

Sponsored by: Sandvine Incorporated
Requested by: emaste
Discussed with: kib


203352 01-Feb-2010 marcel

Make PCI Express host controllers functional, by:
1. checking whether there's a link before initializing devices
on the bus. When there's no link any access onto the bus
will wedge the CPU.
2. synthesizing the class & subclass so that the host controller
appears as a standard PCI bridge, rather than a PowerPC CPU.


203350 01-Feb-2010 marcel

Use the capability pointer to indicate whether the host controller is
PCI Express, rather than a bit-field (boolean). Saving the capability
pointer this way makes access to capability-specific configuration
registers easy and efficient.


203177 29-Jan-2010 marcel

Don't check the device ID. Instead, check the class, subclass and
programming I/F. New SoC designs have different device IDs, but
don't need special treatment. Consequently, we fail to probe and
attach for no other reason than not having added the device ID to
the code.

Bank on Freescale's sense of backward compatibility and assume
that if we find a host controller, we know how work with it.

This fixes detection of the PCI Express host controllers on
Freescale's QorIQ family of processors (P1, P2 and P4).


202634 19-Jan-2010 jhb

Move the examples for the 'hints' and 'env' keywords from various GENERIC
kernel configs into NOTES.

Reviewed by: imp


202019 10-Jan-2010 imp

Add INCLUDE_CONFIG_FILE in GENERIC on all non-embedded platforms.

# This is the resolution of removing it from DEFAULTS...

MFC after: 5 days


201813 08-Jan-2010 bz

In sys/<arch>/conf/Makefile set TARGET to <arch>. That allows
sys/conf/makeLINT.mk to only do certain things for certain
architectures.

Note that neither arm nor mips have the Makefile there, thus
essentially not (yet) supporting LINT. This would enable them
do add special treatment to sys/conf/makeLINT.mk as well chosing
one of the many configurations as LINT.

This is a hack of doing this and keeping it in a separate commit
will allow us to more easily identify and back it out.

Discussed on/with: arch, jhb (as part of the LINT-VIMAGE thread)
MFC after: 1 month


201758 07-Jan-2010 mbr

Remove extraneous semicolons, no functional changes.

Submitted by: Marc Balmer <marc@msys.ch>
MFC after: 1 week


201534 04-Jan-2010 imp

Revert 200594. This file isn't intended for these sorts of things.


201443 03-Jan-2010 brooks

Add vlan(4) to all GENERIC kernels.

MFC after: 1 week


201223 29-Dec-2009 rnoland

Update d_mmap() to accept vm_ooffset_t and vm_memattr_t.

This replaces d_mmap() with the d_mmap2() implementation and also
changes the type of offset to vm_ooffset_t.

Purge d_mmap2().

All driver modules will need to be rebuilt since D_VERSION is also
bumped.

Reviewed by: jhb@
MFC after: Not in this lifetime...


200739 19-Dec-2009 marcel

Remove a warning in DELAY about large delays. In kern_shutdown.c
we use excessive delays quite habitually.


200594 16-Dec-2009 dougb

Add INCLUDE_CONFIG_FILE, and a note in comments about how to also
include the comments with CONFIGARGS


200182 06-Dec-2009 nwhitehorn

Unbreak build.

Pointy hat to: me


200171 06-Dec-2009 mav

MFp4:
Introduce ATA_CAM kernel option, turning ata(4) controller drivers into
cam(4) interface modules. When enabled, this options deprecates all ata(4)
peripheral drivers (ad, acd, ...) and interfaces and allows cam(4) drivers
(ada, cd, ...) and interfaces to be natively used instead.

As side effect of this, ata(4) mode setting code was completely rewritten
to make controller API more strict and permit above change. While doing
this, SATA revision was separated from PATA mode. It allows DMA-incapable
SATA devices to operate and makes hw.ata.atapi_dma tunable work again.

Also allow ata(4) controller drivers (except some specific or broken ones)
to handle larger data transfers. Previous constraint of 64K was artificial
and is not really required by PCI ATA BM specification or hardware.

Submitted by: nwitehorn (powerpc part)


200083 03-Dec-2009 nwhitehorn

The first argument of dcbz interprets r0 as a literal zero, not the second.
This worked before by accident.

MFC after: 1 week


200018 02-Dec-2009 nwhitehorn

Bump limits on PowerPC. This allows large executables like parts of LLVM
to function.

Reviewed by: grehan
Obtained from: NetBSD
MFC after: 2 weeks


199949 29-Nov-2009 nwhitehorn

Add atp(4) to powerpc GENERIC. Most late-generation Apple PowerPC laptops
have trackpads that do not work at all without this driver.


199886 28-Nov-2009 nwhitehorn

Add a CPU features framework on PowerPC and simplify CPU setup a little
more. This provides three new sysctls to user space:
hw.cpu_features - A bitmask of available CPU features
hw.floatingpoint - Whether or not there is hardware FP support
hw.altivec - Whether or not Altivec is available

PR: powerpc/139154
MFC after: 10 days


199868 27-Nov-2009 alc

Simplify the invocation of vm_fault(). Specifically, eliminate the flag
VM_FAULT_DIRTY. The information provided by this flag can be trivially
inferred by vm_fault().

Discussed with: kib


199669 22-Nov-2009 nwhitehorn

Garbage collect some code that was never compiled in to handle Altivec
during traps. It predates actual Altivec support and was never used.


199602 20-Nov-2009 marcel

Always allocate PCI/ISA interrupts as shareable so that shared
interrupts don't cause driver attach failures.


199533 19-Nov-2009 raj

Fix cpuid output on E500 core.


199226 12-Nov-2009 nwhitehorn

Provide a real fix to the too-many-translations problem when booting
from CD on 64-bit hardware to replace existing band-aids. This occurred
when the preloaded mdroot required too many mappings for the static
buffer.

Since we only use the translations buffer once, allocate a dynamic
buffer on the stack. This early in the boot process, the call chain
is quite short and we can be assured of having sufficient stack space.

Reviewed by: grehan


199135 10-Nov-2009 kib

Extract the code that records syscall results in the frame into MD
function cpu_set_syscall_retval().

Suggested by: marcel
Reviewed by: marcel, davidxu
PowerPC, ARM, ia64 changes: marcel
Sparc64 tested and reviewed by: marius, also sunv reviewed
MIPS tested by: gonzo
MFC after: 1 month


199108 09-Nov-2009 nwhitehorn

Spell sz correctly.

Pointed out by: jmallett


199084 09-Nov-2009 nwhitehorn

Increase the size of the OFW translations buffer to handle G5 systems
that use many translation regions in firmware, and add bounds checking
to prevent buffer overflows in case even the new value is exceeded.

Reported by: Jacob Lambert
MFC after: 3 days


198968 06-Nov-2009 marcel

Unbreak E500 builds. The inline assembly for the 970 CPUs
is invalid when compiling for BookE.


198731 31-Oct-2009 nwhitehorn

Unbreak cpu_switch(). The register allocator in my brain is clearly
broken. Also, Altivec context switching worked before only by accident,
but should work now by design.


198725 31-Oct-2009 nwhitehorn

Remove an unnecessary sync that crept in the last commit.


198724 31-Oct-2009 nwhitehorn

Fix a race in casuword() exposed by csup. casuword() non-atomically read
the current value of its argument before atomically replacing it, which
could occasionally return the wrong value on an SMP system. This resulted
in user mutex operations hanging when using threaded applications.


198723 31-Oct-2009 nwhitehorn

Loop on blocked threads when using ULE scheduler, removing an
XXX MP comment.


198722 31-Oct-2009 nwhitehorn

Garbage collect set_user_sr(), which is declared static inline and
never called.


198678 30-Oct-2009 nwhitehorn

Make procstat -k work on PowerPC by avoiding mistakenly using signed
compares with a low address (0x1000) and a high address
(the KVA kernel stack).


198588 29-Oct-2009 nwhitehorn

Turn off Altivec data-stream prefetching before going into power-save
mode on those CPUs that need it.


198507 27-Oct-2009 kib

In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by: davidxu
Tested by: pho
MFC after: 1 month


198445 24-Oct-2009 nwhitehorn

Turn on NAP mode on G5 systems, and refactor the HID0 setup code a little.
This makes my G5 Xserve sound slightly less like it is filled with
howling banshees.


198444 24-Oct-2009 nwhitehorn

Allow Heathrow-based machines to boot a kernel containing option SMP
without panicing.


198428 23-Oct-2009 nwhitehorn

Remove debugging printf that snuck in here.

Pointy hat to: me


198427 23-Oct-2009 nwhitehorn

Add some more paranoia to setting HID registers, and update the AIM
clock routines to work better with SMP. This makes SMP work fully and
stably on an Xserve G5.

Obtained from: Book-E (clock bits)


198400 23-Oct-2009 nwhitehorn

Do not map the trap vectors into the kernel's address space. They are
only used in real mode and keeping them mapped only serves to make NULL
a valid address, which results in silent NULL pointer deferences.

Suggested by: Patrick Kerharo
Obtained from: projects/ppc64


198378 23-Oct-2009 nwhitehorn

Add SMP support on U3-based G5 systems. This does not yet work perfectly:
at least on my Xserve, getting the decrementer and timebase on APs to tick
requires setting up a clock chip over I2C, which is not yet done.

While here, correct the 64-bit tlbie function to set the CPU to 64-bit
mode correctly.

Hardware donated by: grehan


198341 21-Oct-2009 marcel

o Introduce vm_sync_icache() for making the I-cache coherent with
the memory or D-cache, depending on the semantics of the platform.
vm_sync_icache() is basically a wrapper around pmap_sync_icache(),
that translates the vm_map_t argumument to pmap_t.
o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc
it replaces the pmap_page_executable() function, added to solve
the I-cache problem in uiomove_fromphys().
o In proc_rwmem() call vm_sync_icache() when writing to a page that
has execute permissions. This assures that when breakpoints are
written, the I-cache will be coherent and the process will actually
hit the breakpoint.
o This also fixes the Book-E PMAP implementation that was missing
necessary locking while trying to deal with the I-cache coherency
in pmap_enter() (read: mmu_booke_enter_locked).

The key property of this change is that the I-cache is made coherent
*after* writes have been done. Doing it in the PMAP layer when adding
or changing a mapping means that the I-cache is made coherent *before*
any writes happen. The difference is key when the I-cache prefetches.


198212 18-Oct-2009 nwhitehorn

Don't assume that physical addresses are identity mapped. This allows
the second processor on G5 systems to start. Note that SMP is still
non-functional on these systems because of IPI delivery problems.


197962 11-Oct-2009 nwhitehorn

Correct another typo. Actually save the condition register instead
of overwriting r12 by mistake.


197961 11-Oct-2009 nwhitehorn

Correct a typo here and actually save DSISR instead of overwriting it.


197933 10-Oct-2009 kib

Define architectural load bases for PIE binaries. Addresses were selected
by looking at the bases used for non-relocatable executables by gnu ld(1),
and adjusting it slightly.

Discussed with: bz
Reviewed by: kan
Tested by: bz (i386, amd64), bsam (linux)
MFC after: some time


197729 03-Oct-2009 bz

Make sure that the primary native brandinfo always gets added
first and the native ia32 compat as middle (before other things).
o(ld)brandinfo as well as third party like linux, kfreebsd, etc.
stays on SI_ORDER_ANY coming last.

The reason for this is only to make sure that even in case we would
overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo
would still be there and the system would be operational.

Reviewed by: kib
MFC after: 1 month


197316 18-Sep-2009 alc

Add a new sysctl for reporting all of the supported page sizes.

Reviewed by: jhb
MFC after: 3 weeks


197080 10-Sep-2009 nwhitehorn

Add a few SCSI controllers to GENERIC that can be found on Powermacs.
This allows installation onto SCSI disks as shipped, for example,
with the Powermac G3.

PR: powerpc/138543
Obtained from: sparc64
MFC after: 3 days


196994 08-Sep-2009 phk

Get rid of the _NO_NAMESPACE_POLLUTION kludge by creating an
architecture specific include file containing the _ALIGN*
stuff which <sys/socket.h> needs.


196993 08-Sep-2009 nwhitehorn

Remove some debugging (KTR_VERBOSE) that crept into ppc GENERIC long ago
and is present on no other architectures by default.

MFC after: 4 days


196196 13-Aug-2009 attilio

* Completely Remove the option STOP_NMI from the kernel. This option
has proven to have a good effect when entering KDB by using a NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions. This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Adds an new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when disponible. In other cases it just does match a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64
architectures, while on the other has a normal IPI_STOP effect. It is
responsibility of maintainers to eventually implement an hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
function called stop_cpus_hard(). That is specular to stop_cpu() but
it does use the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and comments adding

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by: jhb
Tested by: pho, bz, rink
Approved by: re (kib)


195840 24-Jul-2009 jhb

Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses. The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by: alc
Approved by: re (kensmith)
MFC after: 2 weeks


195799 21-Jul-2009 raj

Do not use OCP85XX_LBC_OFF twice when accessing LBC registers on MPC85XX.

It turns LBC control registers were not programmed correctly on MPC85XX. We
were accessing bogus addresses as the base offset (OCP85XX_LBC_OFF) was
erroneously added during offset calculations. Effectively the state of LBC
control registers was not altered by the kernel initialization code, but
everything worked as long as we coincided to use the same settings (LBC decode
windows) as firmware has initialized.

Submitted by: Lukasz Wojcik
Reviewed by: marcel
Approved by: re (kensmith)
Obtained from: Semihalf


195649 12-Jul-2009 alc

Add support to the virtual memory system for configuring machine-
dependent memory attributes:

Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.

Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.

Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes. Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures. The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map. The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:

kmem_alloc_contig() can now be used to allocate kernel memory with
non-default memory attributes on amd64 and i386.

vm_page_alloc() and the device pager will set the memory attributes
for the real or fictitious page according to the object's default
memory attributes.

Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.

Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386. In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.

In collaboration with: jhb

Approved by: re (kib)


195632 12-Jul-2009 nwhitehorn

Increase the size of the page table on 64-bit PowerPC machines as a
bandaid to prevent exhaustion of the primary and secondary hash groups
in the event of extreme stress on the PMAP layer (e.g. a forkbomb). This
wastes memory, and should be revised to properly handle PTEG spills instead.

Suggested by: grehan
Approved by: re (kensmith)


195376 05-Jul-2009 sam

Cleanup ALIGNED_POINTER:
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent

Reviewed by: bde, imp, marcel
Approved by: re (kensmith)


195295 02-Jul-2009 ed

Enable POSIX semaphores on all non-embedded architectures by default.

More applications (including Firefox) seem to depend on this nowadays,
so not having this enabled by default is a bad idea.

Proposed by: miwi
Patch by: Florian Smeets <flo kasimir com>
Approved by: re (kib)


195060 26-Jun-2009 alc

Correct the #endif comment.

Noticed by: jmallett
Approved by: re (kib)


195033 26-Jun-2009 alc

This change is the next step in implementing the cache control functionality
required by video card drivers. Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures. In addition, this changes adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.

In collaboration with: jhb


194950 25-Jun-2009 raj

Include SMP support in the MPC85XX kernel by default.


194933 25-Jun-2009 jeff

- Add the right includes to use kmem_alloc(). This was broken by my
DPCPU commit.
Reported by: bz


194849 24-Jun-2009 raj

More precise description of the DS1553 driver.

Pointed out by: stas


194784 23-Jun-2009 jeff

Implement a facility for dynamic per-cpu variables.
- Modules and kernel code alike may use DPCPU_DEFINE(),
DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined
PCPU_*. Requires only one extra instruction more than PCPU_* and is
virtually the same as __thread for builtin and much faster for shared
objects. DPCPU variables can be initialized when defined.
- Modules are supported by relocating the module's per-cpu linker set
over space reserved in the kernel. Modules may fail to load if there
is insufficient space available.
- Track space available for modules with a one-off extent allocator.
Free may block for memory to allocate space for an extent.

Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas


194679 23-Jun-2009 nwhitehorn

Add cpufreq support on the PowerPC G5, along with a skeleton SMU driver
in order to slew CPU voltage during frequency changes. The OpenBSD SMU
driver was an extremely helpful reference for this.


194678 23-Jun-2009 nwhitehorn

Fix copy/paste typo in last revision. PMC0 control should be shifted 8
bits, not 6, on the PPC 970.


194632 22-Jun-2009 raj

DS1553 RTC module driver. On the MPC8555CDS system it hangs off of the LBC bus.

Obtained from: Semihalf


194630 22-Jun-2009 raj

Integrated I2C controller driver (found in MPC85xx and other SOC parts).

Obtained from: Freescale, Semihalf


194374 17-Jun-2009 nwhitehorn

Teach cpu_est_clockrate() about the G5's slightly different PMC. This
allows the boot messages to include the CPU speed and makes possible
the forthcoming cpufreq support for the PPC 970.


194123 13-Jun-2009 alc

Correct the method of waking the page daemon when the number of allocated
pv entries surpasses the high water mark. The problem was that the page
daemon would only be awakened the first time that the high water mark was
surpassed. (The variable "pagedaemon_waken" is a non-working vestige of
FreeBSD 4.x, in which it was external and reset by the page daemon whenever
it ran. This reset allowed subsequent wakeups by the pv entry allocator.)


194101 13-Jun-2009 raj

Fix Book-E/MPC85XX build. Some prototypes were wrong and got revealed with
the recent kobj signature checking.


194027 11-Jun-2009 avg

strict kobj signatures: fix adb_hb_controller_poll impl in powermac

the method return u_int, not void

Reviewed by: imp, current@
Approved by: jhb (mentor)


194025 11-Jun-2009 avg

strict kobj signatures: some ofw_setprop fixes

propname parameter is const

Reviewed by: imp, current@
Approved by: jhb (mentor)


193935 10-Jun-2009 imp

Move from using devclass_find_free_unit(ata_devclass, 0) to -1 for the
unit number. Basically they are the same...


193909 10-Jun-2009 grehan

Get the gdb/psim emulator functioning again.

aim/machdep.c:
- the RI status register bit needs to be set when doing the mtmsrd 64-bit
instruction test
- psim doesn't implement the dcbz instruction so the run-time cacheline
test fails. Set the cachline size to 32 to avoid infinite loops in
future calls to __syncicache()

aim/platform_chrp.c:
- if after iterating through / and a name property of "cpus" still isn't
found, just search directly for '/cpus'.
- psim doesn't put a "reg" property on it's cpu nodes, so assume 0
since it is uniprocessor-only at this point

powerpc/openpic.c
- the number of CPUs reported is 1 too many on psim's openpic

Reviewed by: nwhitehorn
MFC after: 1 week (openpic part)


193579 06-Jun-2009 raj

Initial version of the sec(4) driver for the integrated security engine found
in Freescale system-on-chip devices.

The following algorithms and schemes are currently supported:
- 3DES, AES, DES
- MD5, SHA1, SHA256, SHA384, SHA512

Reviewed by: philip
Obtained from: Freescale, Semihalf


193578 06-Jun-2009 raj

Provide 64-bit big endian bus space operations for PowerPC. They are required
for the upcoming sec(4) driver.

Submitted by: Piotr Ziecik
Obtained from: Semihalf


193492 05-Jun-2009 raj

Discover and handle the number of E500 CPUs in run time.


193489 05-Jun-2009 raj

Fill PTEs covering kernel code and data.

Without this fix pte_vatopa() was not able to retrieve physical address of
data structures inside kernel, for example EFAULT was reported while acessing
/dev/kmem ('netstat -nr').

Submitted by: Piotr Ziecik
Obtained from: Semihalf


193334 02-Jun-2009 rwatson

Remove MAC kernel config files and add "options MAC" to GENERIC, with the
goal of shipping 8.0 with MAC support in the default kernel. No policies
will be compiled in or enabled by default, but it will now be possible to
load them at boot or runtime without a kernel recompile.

While the framework is not believed to impose measurable overhead when no
policies are loaded (a result of optimization over the past few months in
HEAD), we'll continue to benchmark and optimize as the release approaches.
Please keep an eye out for performance or functionality regressions that
could be a result of this change.

Approved by: re (kensmith)
Obtained from: TrustedBSD Project


193159 31-May-2009 nwhitehorn

Provide an analogous sysctl to hw.acpi.acline (dev.pmu.0.acline) to
determine whether the computer is plugged in to mains power.


193156 31-May-2009 nwhitehorn

Introduce support for cpufreq on PowerPC with the dynamic frequency
switching capabilities of the MPC7447A and MPC7448.


193144 31-May-2009 marcel

Mark the cascaded AT interrupt handler as MP safe to avoid having
it grab Giant. The next step would be to make it a filter.


192795 26-May-2009 raj

Set PG_WRITEABLE in Book-E pmap_enter[_locked] if it creates a mapping that
permits write access. This is similar to r192671.

Pointed out and reviewed by: alc


192533 21-May-2009 raj

Improve style(9), clean up.


192532 21-May-2009 raj

Initial support for SMP on PowerPC MPC85xx.

Tested with Freescale dual-core MPC8572DS development system.

Obtained from: Freescale, Semihalf


192531 21-May-2009 raj

Skip interleaved RAM target on MPC85xx during renitialization of the local
access windows. This eliminates hangs on systems which are configured to use
interleaved mode: prior to this fix we were simply cutting ourselves from
access to the main memory in this case.

Obtained from: Freescale, Semihalf


192323 18-May-2009 marcel

Add cpu_flush_dcache() for use after non-DMA based I/O so that a
possible future I-cache coherency operation can succeed. On ARM
for example the L1 cache can be (is) virtually mapped, which
means that any I/O that uses temporary mappings will not see the
I-cache made coherent. On ia64 a similar behaviour has been
observed. By flushing the D-cache, execution of binaries backed
by md(4) and/or NFS work reliably.
For Book-E (powerpc), execution over NFS exhibits SIGILL once in
a while as well, though cpu_flush_dcache() hasn't been implemented
yet.

Doing an explicit D-cache flush as part of the non-DMA based I/O
read operation eliminates the need to do it as part of the
I-cache coherency operation itself and as such avoids pessimizing
the DMA-based I/O read operations for which D-cache are already
flushed/invalidated. It also allows future optimizations whereby
the bcopy() followed by the D-cache flush can be integrated in a
single operation, which could be implemented using on-chips DMA
engines, by-passing the D-cache altogether.


192110 14-May-2009 raj

Improve style(9)


192109 14-May-2009 raj

PowerPC common SMP startup and time base rework.

- make mftb() shared, rewrite in C, provide complementary mttb()
- adjust SMP startup per the above, additional comments, minor naming
changes
- eliminate redundant TB defines, other minor cosmetics

Reviewed by: marcel, nwhitehorn
Obtained from: Freescale, Semihalf


192067 14-May-2009 nwhitehorn

Factor out platform dependent things unrelated to device drivers into a
new platform module. These are probed in early boot, and have the
responsibility of determining the layout of physical memory, determining
the CPU timebase frequency, and handling the zoo of SMP mechanisms
found on PowerPC.

Reviewed by: marcel, raj
Book-E parts by: raj


191954 10-May-2009 kuriyama

- Use "device\t" and "options \t" for consistency.


191455 24-Apr-2009 raj

Zero PCB during early AIM PowerPC init.

When memory is not zero'ed by firmware, uninitialized PCB can have bogus
contents, which appear as a saved onfault condition, Altivec context to
restore etc. and lead to corruption/crashes. This commit fixes such issues.

Submitted by: Michal Mazur arg ! semihalf dot com
Tested by: Andreas Tobler andreast-list ! fgznet dot ch


191450 24-Apr-2009 marcel

Add suppport for ISA and ISA interrupts to make the ATA
controller in the VIA southbridge functional in the CDS
(Configurable Development System) for MPC85XX.
The embedded USB controllers look operational but the
interrupt steering is still wrong.


191447 24-Apr-2009 marcel

Reimplement bs_be_rs_{1|2|4} and bs_le_rs_{1|2|4} by not
calling the inline functions in <machine/pio.h> and do
not add synchronization. Implement bs_gen_barrier() as
eieio and sync.


191446 24-Apr-2009 marcel

Remove PTE_FAKE and PTE_ISFAKE().


191445 24-Apr-2009 marcel

Remove PTE_ISFAKE. While here remove code
between "#if 0" and "#endif".


191380 22-Apr-2009 raj

Eliminate redundant setting of HID0_EMCP.


191378 22-Apr-2009 raj

Minor style consistency fix.


191376 22-Apr-2009 raj

Provide cpu_throw() for Book-E. Adjust cpu_switch() towards ULE support.

Obtained from: Freescale, Semihalf


191375 22-Apr-2009 raj

Centralize setting HID0/1 for E500. Rename HID defines which are specific
to E500 rather than shared within Book-E family.

Obtained from: Freescale, Semihalf


191363 21-Apr-2009 marcel

Lower VM_MAX_KERNEL_ADDRESS to 0xf8000000. We actually have
devices below CCSRBAR_VA, which overlap with KVA if that's
out limit.


191362 21-Apr-2009 marcel

o Properly set ksym_start & ksym_end when options DDB is set.
Include opt_ddb.h for that. Now you can actually boot with
-d and set breakpoints using function names.
o Make sure to include opt_msgbuf.h.
o Carve out the first 1MB of physical memory. The MPC85xx has
DMA problems with addresses below 1MB. Ideally busdma knows
how to avoid allocating below 1MB for MPC85xx, but that
requires a bit more work. For now, ignore the 1MB of DRAM.


191309 20-Apr-2009 rwatson

Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]

Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]

Suggested by: bde [1], jhb [2]
MFC after: 2 weeks


191307 20-Apr-2009 raj

Provide locking for PowerPC interrupt sources config.

Reviewed by: attilio


191278 19-Apr-2009 rwatson

Add description and cautionary note regarding CACHE_LINE_SIZE.

MFC after: 2 weeks
Suggested by: alc


191276 19-Apr-2009 rwatson

For each architecture, define CACHE_LINE_SHIFT and a derived
CACHE_LINE_SIZE constant. These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.

Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).

MFC after: 2 weeks
Discussed on: arch@


191261 19-Apr-2009 nwhitehorn

Fix a typo in the SRR1 comparison for program exceptions. While here,
replace magic numbers with constants to keep this from happening again.

Without this fix, some programs would occasionally get SIGTRAP instead
of SIGILL on an illegal instruction. This affected Altivec detection
in pixman, and possibly other software.

Reported by: Andreas Tobler
MFC after: 1 week


191039 14-Apr-2009 nwhitehorn

Changing the overflow trap to use bla to branch to dbtrap in r190946 was
bogus. Revert to a branch that does not set LR. It's been a long week...


190953 12-Apr-2009 nwhitehorn

Rework the way we get the cacheline size. Instead of having a table of
CPUs known to use 128 byte cache lines and defaulting to 32, use the dcbz
instruction to measure it. Also make dcbz behave the way you would
expect on PPC 970.


190946 11-Apr-2009 nwhitehorn

Fix recognition of kernel-mode traps that pass through the KDB trap handler
but do not actually invoke KDB. This includes recoverable machine checks
encountered in kernel mode.

This patch causes machines with Grackle host-PCI bridges to be able to
correctly enumerate them again.

MFC after: 3 days


190750 05-Apr-2009 nwhitehorn

Fix the build when KDB is disabled. The second instance of rfi in
trap_subr.S that is patched at runtime to rfid on 64-bit systems
is inside KDB-specific code, so don't patch it without KDB.


190747 05-Apr-2009 nwhitehorn

Add an Open Firmware access module for real-mode OF accesses to the PowerPC
build. This is required for the IBM Mambo simulator, as well as a variety
of non-Apple PowerPC hardware.


190708 05-Apr-2009 dchagin

Fix KBI breakage by r190520 which affects older linux.ko binaries:

1) Move the new field (brand_note) to the end of the Brandinfo structure.
2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer
is valid.
3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old
modules won't have the flag set, so the new field brand_note would be
ignored.

Suggested by: jhb
Reviewed by: jhb
Approved by: kib (mentor)
MFC after: 6 days


190704 04-Apr-2009 marcel

Perform a dummy stwcx. when we switch contexts. The context
being switched out may hold a reservation. The stwcx. will
clear the reservation. This is architecturally recommended.

The scenario this addresses is as follows:
1. Thread 1 performs a lwarx and as such holds a reservation.
2. Thread 1 gets switched out (before doing the matching
stwcx.) and thread 2 is switched in.
3. Thread 2 performs a stwcx. to the same reservation granule.
This will succeed because the processor has the reservation
even though thread 2 didn't do the lwarx.

Note that on some processors the address given the stwcx. is
not checked. On these processors the mere condition of having
a reservation would cause the stwcx. to succeed, irrespective
of whether the addresses are the same. The dummy stwcx. is
especially important for those processors.


190703 04-Apr-2009 marcel

Add sysarch.h. It's included by drm(4).


190702 04-Apr-2009 marcel

First round of cleanups. There's a lot of NetBSDism in this header.


190701 04-Apr-2009 marcel

Implement kernel core dump support for Book-E processors.
Both raw physical memory dumps and virtual minidumps are
supported. The default being minidumps.

Obtained from: Juniper Networks


190684 04-Apr-2009 marcel

PowerPC, meet kernel core dumps. The support is based
on a generic dumper that creates an ELF core file and
uses PMAP functions to scan and iterate over memory
chunks, as well as handle memory mappings used during
dumping.
the PMAP layer can choose to return physical memory
chunks or virtual memory chunks. For minidumps, the
chunks should be virtual.

The default MMU I/F implementation for the scan_md()
method returns NULL. Thus, when a PMAP implementation
does not implement the required methods, an empty
core file is created. Here, empty means having an ELF
header only.

Obtained from: Juniper Networks


190681 04-Apr-2009 nwhitehorn

Add support for 64-bit PowerPC CPUs operating in the 64-bit bridge mode
provided, for example, on the PowerPC 970 (G5), as well as on related CPUs
like the POWER3 and POWER4.

This also adds support for various built-in hardware found on Apple G5
hardware (e.g. the IBM CPC925 northbridge).

Reviewed by: grehan


190403 25-Mar-2009 nwhitehorn

Disable ATA DMA for ATAPI devices for now. Apparently, certain revisions
of this controller, in combination with certain ATAPI devices and phases
of the moon, will cause DMA operations for ATAPI to fail.


190100 19-Mar-2009 thompsa

Remove the uscanner(4) driver, this follows the removal of the kernel scanner
driver in Linux 2.6. uscanner was just a simple wrapper around a fifo and
contained no logic, the default interface is now libusb (supported by sane).

Reviewed by: HPS


189926 17-Mar-2009 kib

Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer
to the full path of the image that is being executed.
Increase AT_COUNT.

Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.

Reviewed by: kan


189771 13-Mar-2009 dchagin

Implement new way of branding ELF binaries by looking to a
".note.ABI-tag" section.

The search order of a brand is changed, now first of all the
".note.ABI-tag" is looked through.

Move code which fetch osreldate for ELF binary to check_note() handler.

PR: 118473
Approved by: kib (mentor)


189757 13-Mar-2009 raj

Make MPC85xx LAW handling and reset routines aware of the MPC8548 variant.

Inspired by discussion with Alexey V Fedorov on freebsd-powerpc@.


189675 11-Mar-2009 nwhitehorn

Change the PVO zone for fictitious pages to the unmanaged PVO zone, to match
the unmanaged flag set in the PVO attributes. Without doing this,
pmap_remove() could try to remove fictitious pages (like those created
by mmap of physical memory) from the wrong UMA zone, causing a panic.

Reported by: Justin Hibbits
MFC after: 1 week


189170 28-Feb-2009 ed

Add memmove() to the kernel, making the kernel compile with Clang.

When copying big structures, LLVM generates calls to memmove(), because
it may not be able to figure out whether structures overlap. This caused
linker errors to occur. memmove() is now implemented using bcopy().
Ideally it would be the other way around, but that can be solved in the
future. On ARM we don't do add anything, because it already has
memmove().

Discussed on: arch@
Reviewed by: rdivacky


189101 27-Feb-2009 raj

Prefer register usage style to be more consistent with the rest of the
trap_subr.S code.


189100 27-Feb-2009 raj

Make Book-E debug register state part of the PCB context.

Previously, DBCR0 flags were set "globally", but this leads to problems
because Book-E fine grained debug settings work only in conjuction with the
debug master enable bit in MSR: in scenarios when the DBCR0 was set with
intention to debug one process, but another one with MSR[DE] set got
scheduled, the latter would immediately cause debug exceptions to occur upon
execution of its own code instructions (and not the one intended for
debugging).

To avoid such problems and properly handle debugging context, DBCR0 state
should be managed individually per process.

Submitted by: Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by: marcel


188951 23-Feb-2009 nwhitehorn

Fix comment: we write the trap vector to SPRG3, not SPRG0.


188944 23-Feb-2009 thompsa

Change over the usb kernel options to the new stack (retaining existing
naming). The old usb stack can be compiled in my prefixing the name with 'o'.


188860 20-Feb-2009 nwhitehorn

Add Altivec support for supported CPUs. This is derived from the FPU support
code, and also reducing the size of trapcode to fit inside a 32 byte handler
slot.

Reviewed by: grehan
MFC after: 2 weeks


188711 17-Feb-2009 raj

Additional features for the tsec(4) Ethernet driver.

- interrupt coalescing
- polling
- jumbo frames
- multicast
- VLAN tagging

The enhanced version of the chip (eTSEC) can also take advantage of:

- TCP/IP checksum calculation h/w offloading

Obtained from: Freescale, Semihalf


188665 15-Feb-2009 thompsa

Add uslcom to the build too.

Reminded by: Michael Butler


188660 15-Feb-2009 thompsa

Switch over GENERIC kernels to USB2 by default.

Tested by: make universe


187692 25-Jan-2009 nwhitehorn

Add support for the I2S and davbus audio controllers found in Apple PowerPC
hardware.

Submitted by: Marco Trillo


187691 25-Jan-2009 nwhitehorn

Fix a race condition where interrupts set up after boot could be enabled in
the PIC before the interrupt handler was set. If the interrupt triggered in
that window, then the interrupt vector would be disabled.

Reported by: Marco Trillo


187473 20-Jan-2009 nwhitehorn

Fix a race condition in kiic(4) made possible by the way the device's STOP
condition is sent. We used to put the bus in the STOP state, but returned
without waiting for that to actually occur.

Submitted by: Marco Trillo


187455 19-Jan-2009 nwhitehorn

Provide a device description for macio-attached ATA cells.


187262 15-Jan-2009 nwhitehorn

Driver for Apple Keywest I2C controllers found in MacIO ASICs. Used for
power and thermal control, as well as GPIOs on Xserves and controlling
sound codecs for Apple built-in audio.

Submitted by: Marco Trillo
Obtained from: NetBSD


187153 13-Jan-2009 raj

Clean up BookE low-level exceptions code.

Improve comments, fix style(9) and typos, unify separators.

Obtained from: Freescale, Semihalf


187151 13-Jan-2009 raj

Clean up BookE pmap.

Improve comments, eliminate redundant debug output, fix style(9) and other
minor tweaks for code readability.

Obtained from: Freescale, Semihalf


187149 13-Jan-2009 raj

Rework BookE pmap towards multi-core support.

o Eliminate tlb0[] (a s/w copy of TLB0)
- The table contents cannot be maintained reliably in multiple MMU
environments, where asynchronous events (invalidations from other cores)
can change our local TLB0 contents underneath.
- Simplify and optimize TLB flushing: system wide invalidations are
performed using tlbivax instruction (propagates to other cores), for
local MMU invalidations a new optimized routine (assembly) is introduced.

o Improve and simplify TID allocation and management.
- Let each core keep track of its TID allocations.
- Simplify TID recycling, eliminate dead code.
- Drop the now unused powerpc/booke/support.S file.

o Improve page tables management logic.

o Simplify TLB1 manipulation routines.

o Other improvements and polishing.

Obtained from: Freescale, Semihalf


187071 12-Jan-2009 nwhitehorn

Some early Macintosh GPIO controllers don't provide reg properties for
interrupt-only GPIOs. Honor this, and allow interrupt attachment, but not
read/write access for such devices.

Reported by: Niels Eliasen


186805 06-Jan-2009 nwhitehorn

Add a new quirk type so that the MacIO driver will assign memory resources
belonging to a devices children, in analogy to the way we handle interrupts
for SCC serial devices. This is required to counteract overly deep nesting
on onboard audio devices.

Submitted by: Marco Trillo


186728 03-Jan-2009 nwhitehorn

Fix the OFW interrupt map parser to use its own idea of the number of interrupt
cells in the map, instead of using a value passed to it and then panicing if it
disagrees. This fixes interrupt map parsing for PCI bridges on some Apple
Uninorth PCI controllers.

Reported by: marcel
Tested on: G4 iBook, Sun Ultra 5


186347 20-Dec-2008 nwhitehorn

Modularize the Open Firmware client interface to allow run-time switching
of OFW access semantics, in order to allow future support for real-mode
OF access and flattened device frees. OF client interface modules are
implemented using KOBJ, in a similar way to the PPC PMAP modules.

Because we need Open Firmware to be available before mutexes can be used on
sparc64, changes are also included to allow KOBJ to be used very early in
the boot process by only using the mutex once we know it has been initialized.

Reviewed by: marius, grehan


186289 18-Dec-2008 raj

Minor spelling fix in E500 locore.


186288 18-Dec-2008 raj

Extend and improve MPC85XX Local Bus management.

- Make LBC resources management self-contained: introduce explicit LBC
resources definition (much like the OCP), provide dedicated rman for LB mem
space.

- Full configuration of an LB chip select device: program LAW and BR/OR, map
into KVA, handle all LB attributes (bus width, machine select, ecc,
write protect etc).

- Factor out LAW manipulation routines into shared code, adjust OCP area
accordingly.

- Other LBC fixes and clean-ups.

Obtained from: Semihalf


186230 17-Dec-2008 raj

Fix E500 cache invalidation routines.

When invalidating the i/d-cache we need to wait until the core complex is
really finished with the operation.

Obtained from: Semihalf


186229 17-Dec-2008 raj

Rework E500 locore.

- split bootstrap code into more modular routines, which will also be used for
the non-booting cores
- clean up registers usage
- improve comments to better reflect reality
- eliminate dead or redundant code
- other minor fixes

This refactoring is a preliminary step before importing dual-core (MPC8572)
support.

Obtained from: Freescale, Semihalf


186228 17-Dec-2008 raj

Minor clean up of BookE/MPC85XX: iprove naming and style(9).


186227 17-Dec-2008 raj

Improve MPC85XX helper routines.

- Move CCSR accessors to the shared MPC85XX area
- Simplify SVR version subfield handling
- Adjust OCP


186212 17-Dec-2008 imp

AT_DEBUG and AT_BRK were OBE like 10 years ago, so retire them.

Reviewed by: peter


186128 15-Dec-2008 nwhitehorn

Adapt parts of the sparc64 Open Firmware bus enumeration code (in particular,
the code for parsing interrupt maps) to PowerPC and reflect their new MI
status by moving them to the shared dev/ofw directory.

This commit also modifies the OFW PCI enumeration procedure on PowerPC to
allow the bus to find non-firmware-enumerated devices that Apple likes to add,
and adds some useful Open Firmware properties (compat and name) to the pnpinfo
string of children on OFW SBus, EBus, PCI, and MacIO links. Because of the
change to PCI enumeration on PowerPC, X has started working again on PPC
machines with Grackle hostbridges.

Reviewed by: marius
Obtained from: sparc64


186055 13-Dec-2008 nwhitehorn

Allow OFW syscons to restore itself when the X server exits or there is a VT switch
by redoing the Open Firmware card initialization calls in ofwfb_set_mode(). This
uses the same trick (setting V_ADP_MODECHANGE) to arrange this as machfb(4) and
creatorfb(4).


186050 13-Dec-2008 nwhitehorn

Add support for a console mouse pointer on Open Firmware syscons.

MFC after: 7.1-RELEASE


186046 13-Dec-2008 nwhitehorn

Use a static free packet queue instead of using malloc() to allocate new ADB packets.
This fixes some locking problems.


185782 09-Dec-2008 nwhitehorn

Add the ability to control the sleep LED with led(4). Adding this fairly
useless feature gives us a reasonably complete PMU implementation.


185757 08-Dec-2008 nwhitehorn

Clean up the mac GPIO interface a little. Also remove bogus copyright
and 3rd license clause.

Submitted by: Marco Trillo


185755 08-Dec-2008 nwhitehorn

Accidentally left ADB out of the PowerPC NOTES file during initial import.


185754 08-Dec-2008 nwhitehorn

Add facilities to pmu(4) to interrogate battery status on Apple PowerPC
laptops. This includes battery presence detection, charging status, current
and voltage readouts, and charge level indication. The sysctl interface
is somewhat ACPI-like.


185727 07-Dec-2008 nwhitehorn

Add support for automated reboot after power failure on Apple Core99 machines
(G3 laptops, all G4 machines, early G5s, G5 Xserves). The relevant sysctl
is named dev.pmu.0.server_mode for mental compatibility with Linux.


185724 06-Dec-2008 nwhitehorn

Fix some nasty race conditions in the VIA-CUDA driver that ended up preventing
my right mouse button and keyboard LEDs from working due to mangled
configuration packets. Fixed several other races and associated problems in the
main ADB stack that were exposed while fixing this.


185567 02-Dec-2008 ed

Remove "[KEEP THIS!]" from COMPAT_43TTY. It's not really that important.

Sgtty is a programming interface that has been replaced by termios over
the years. In June we already removed <sgtty.h>, which exposes the
ioctl()'s that are implemented by this interface. The importance of this
flag is overrated right now.


185189 22-Nov-2008 marcel

Unbreak previous commit.


185169 22-Nov-2008 kib

Add sv_flags field to struct sysentvec with intention to provide description
of the ABI of the currently executing image. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures to determine ABI features.

Discussed with: dchagin, imp, jhb, peter


185162 22-Nov-2008 kmacy

- bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.


185005 16-Nov-2008 marcel

Define LDBL_EPSILON, LDBL_MAX and LDBL_MIN as long double constants.

Submitted by: Andreas Tobler <andreast-list@fgznet.ch>
Reviewed by: das@


184486 30-Oct-2008 sobomax

Fix compilation in the case when kernel doesn't have KDB ebabled.
subr_kdb.c still references breakpoint() in this case.


184473 30-Oct-2008 nwhitehorn

Fix some possible infinite loops in the ADB code, and remove some hacks
that were inserted in desperation during bring-up. In addition, move ADB bus
enumeration and child attachment to when interrupts are available.


184460 30-Oct-2008 marcel

Add support for little-endian compilations to this file.


184429 28-Oct-2008 nwhitehorn

DBDMA can transfer a maximum of 64K - 1 bytes per descriptor, as the byte
count field is 16 bits. Inform ATA of this fact.

Reported by: Marco Trillo


184382 27-Oct-2008 nwhitehorn

Clean up some magic numbers in the DBDMA code by replacing them with
appropriately defined constants.

Suggested by: gnn


184319 27-Oct-2008 marcel

Add support for kernel profiling for both AIM and BookE.

Obtained from: Juniper Networks, Inc (BookE support).


184318 27-Oct-2008 marcel

Remove unused declarations (interrupt_vector_{base|top}).


184316 27-Oct-2008 marcel

Declare btext and etext. Needed by sys/kern/subr_prof.c for
for kernel profiling.


184314 27-Oct-2008 nwhitehorn

Bring Kauai ATA driver in line with Macio ATA by reading the PIO config reg
to set the initial PIO mode instead of assuming PIO4. There are still a few
nagging issues:

- There are some problems with 64 K DMA transfers waiting on lower level
changes.

- ATAPI DMA is broken on Marcel's Mac Mini because we need an ATA SELECT hook
propagated up to individual drivers for hardware without timing registers for
each ATA channel.


184299 26-Oct-2008 nwhitehorn

Add ADB support. This provides support for the external ADB bus on the PowerMac
G3 as well as the internal ADB keyboard and mice in PowerBooks and iBooks. This
also brings in Mac GPIO support, for which we should eventually have a better
interface.

Obtained from: NetBSD (CUDA and PMU drivers)


184252 25-Oct-2008 marcel

Enable the cfi(4) driver.


184250 25-Oct-2008 marcel

Add a driver for the Local Bus Controller.

Obtained from: Juniper Networks, Inc.


184249 25-Oct-2008 marcel

Assign 0xff800000-0xffffffff to the LBC controller. That's where
the NOR flash lives by default.


184244 25-Oct-2008 marcel

In mmu_booke_mapdev(), handle mappings that cannot be represented
by a single TLB entry. The boot ROM on the MPC85555CDS is 8MB, for
example, and in order to map that we need 2 4MB TLB entries.


183901 15-Oct-2008 nwhitehorn

Prevent the OF syscons module from trying to attach to real devices on the
nexus by only attaching to a device with no OF node.


183882 14-Oct-2008 nwhitehorn

Convert PowerPC AIM PCI and nexus busses to standard OFW bus interface. This
simplifies certain device attachments (Kauai ATA, for instance), and makes
possible others on new hardware.

On G5 systems, there are several otherwise standard PCI devices
(Serverworks SATA) that will not allow their interrupt properties to be
written, so this information must be supplied directly from Open Firmware.

Obtained from: sparc64


183439 28-Sep-2008 marius

Remove ipi_all() and ipi_self() as the former hasn't been used at
all to date and the latter also is only used in ia64 and powerpc
code which no longer serves a real purpose after bring-up and just
can be removed as well. Note that architectures like sun4u also
provide no means of implementing IPI'ing a CPU itself natively
in the first place.

Suggested by: jhb
Reviewed by: arch, grehan, jhb


183437 28-Sep-2008 nwhitehorn

Unbreak support for G4s without an L3 cache. L3 cache support was introduced
with, and limited to, the Motorola/Freescale 745x family.

Reported by: Marco Trillo


183411 27-Sep-2008 nwhitehorn

Expand the DBDMA API to allow setting device-dependent control bits. While
here, clean up and document this a little.

Submitted by: Marco Trillo
MFC after: 1 week


183409 27-Sep-2008 nwhitehorn

Add DMA support for Apple built-in ATA controllers.

Tested by: grehan, marcotrillo@gmail.com
MFC after: 1 month


183397 27-Sep-2008 ed

Replace all calls to minor() with dev2unit().

After I removed all the unit2minor()/minor2unit() calls from the kernel
yesterday, I realised calling minor() everywhere is quite confusing.
Character devices now only have the ability to store a unit number, not
a minor number. Remove the confusion by using dev2unit() everywhere.

This commit could also be considered as a bug fix. A lot of drivers call
minor(), while they should actually be calling dev2unit(). In -CURRENT
this isn't a problem, but it turns out we never had any problem reports
related to that issue in the past. I suspect not many people connect
more than 256 pieces of the same hardware.

Reviewed by: kib


183322 24-Sep-2008 kib

Change the static struct sysentvec and struct Elf_Brandinfo initializers
to the C99 style. At least, it is easier to read sysent definitions
that way, and search for the actual instances of sigcode etc.

Explicitely initialize sysentvec.sv_maxssiz that was missed in most
sysvecs.

No objection from: jhb
MFC after: 1 month


183319 24-Sep-2008 nwhitehorn

Allow the cacheline size on PowerPC to be set at runtime. This is essential for
supporting 64-bit CPUs, which often have 128-byte cache lines instead of the
standard 32.


183317 23-Sep-2008 sobomax

Improve rev 183168, so that if /chosen/stdout is connected to the serial
port by OF the syscons won't take over console. Only attach syscons to "screen"
if /chosen/stdout is not connected, which could be the case when loader(8)
is booted directly from the OF. This fixes Marcel's Xserver.

Reported by: marcel


183290 23-Sep-2008 nwhitehorn

In preparation for PowerPC G5 support, allow PVO objects to contain page
table entries for both the 32-bit and 64-bit AIM MMUs.


183288 23-Sep-2008 nwhitehorn

Change the DBDMA API to allow DBDMA registers in a subregion of a resource. This is necessary to allow future support of DMA for the various Apple on-board ATA controllers.

MFC after: 1 week


183262 22-Sep-2008 nwhitehorn

Unbreak G3 support. G3 processors don't have an L3 cache, so we shouldn't try to program it.

Approved by: marcel (mentor)


183168 19-Sep-2008 sobomax

When attaching framebuffer to "/chosen/stdout" node fails, try attaching
to "screen" node directly. The problem is that by default OF on some (all?)
Macs either doesn't provide "/chosen/stdout" or redirects it somewhere,
unless you boot in manual mode via CMD-ALT-O-F. It's nice to see normal
FreeBSD boot output instead of blank gray screen.


183094 16-Sep-2008 marcel

o When not making a translation cache-inhibit and guarded (PTE_I|PTE_G)
make it memory-coherency enforced (PTE_M). This is required for SMP
to work.
o Serialize tlbie operations and implement the tlbie operation in a
function called tlbie(). Hardware can end up in a live-lock if
between the tlbsync and subsequent sync on one processor another
processor executes a tlbie or tlbsync.
o Eliminate the following defines:
TLBIE, TLBSYNC, SYNC and EIEIO
Use either inline assembly statements or inline functions defined
in <machine/cpufunc.h>


183090 16-Sep-2008 marcel

Rewrite cpudep_ap_bootstrap(). We now enable L3, L2, L1D and L1I
caches if not yet enabed. This is required for coherency and
atomic operations to work, not to mention performance. We use the
L2 and L3 cache settings of the BSP to configure the APs caches.
Can't be bad.

Program NAP and not DOZE. DOZE is present only on earlier CPUs
and the bit is reserved on the MPC7441 & MPC7451. NAP will do
bus snooping to keep caches coherent.

Program the PIR with the cpuid. This may not be necessary...


183089 16-Sep-2008 marcel

o In decr_get_timecount() only read the low timebase register.
We're only returning a 32-bit counter.
o In decr_intr(), manually perform LICM, so that we don't test
a loop invariant condition inside a loop.
o Include <machine/smp.h>


183088 16-Sep-2008 marcel

Set pcpup->pc_curthread and pcpup->pc_curpcb before calling
pmap_activate. While pmap_activate doesn't need either, we
do need a valid curthread if we enable KTR_PMAP.


183083 16-Sep-2008 marcel

o Synchronize the APs timebase and decrementer values with the BSP.
o Don't set/get the PIR register. It's CPU dependent.
o Also initialize pcpup->pc_curpcb, in case it's dereferenced.


183081 16-Sep-2008 marcel

In powerpc_get_pcpup(), make the inline assembly statement
volatile so that the compiler won't perform CSE. For SMP,
this may result in us accessing the wrong PCPU and as such
results in a bogus curthread value.

Note that getting curthread is not quite MP-safe in the sense
that it requires two instructions that aren't performed
atomically. The first instruction gets the address of the PCPU
structure and the second instruction dereferences that pointer
to get curthread. If a thread is switched-out in between these
instructions and switched-in on a different CPU, we still get
the wrong curthread.


183060 16-Sep-2008 marcel

Remove the tracing from the AP startup. The AP is known
to start and the tracing can interfere with AP startup.
Instead, use the available space in the reset vector
for the initial stack.


183031 15-Sep-2008 marcel

o Remove SPR_TSR & SPR_TCR for AIM.
o Remove SPR_HID2.
o Add more SPR_L3CR bit definitions.


183030 15-Sep-2008 marcel

Dont worry about PSL_RI (restartable interrupt indicator) in
common PowerPC code when all we want to achieve is to enable
external interrupts. We can set PSL_RI at any time before we
allow interrupts and/or exceptions, so move it to the AIM
specific initialization and do it when we also set PSL_ME
(machine check enable).


183029 15-Sep-2008 marcel

Rename cpu_config_l2cr() to cpu_print_cacheinfo(). We're not
configuring the L2 cache on the BSP. Nor the L3 cache. We
merely print the settings.

Save the L2 and L3 cache configuration in global values so
that we know how to configure the cache on APs.


183028 14-Sep-2008 marcel

Remove debugging code.


182580 31-Aug-2008 marcel

Trace interrupts with KTR_INTR.


182573 31-Aug-2008 marcel

Remove redundant KTR statements.


182571 31-Aug-2008 marcel

Trace all PMAP calls using KTR_PMAP.


182503 31-Aug-2008 marcel

Remove restore_intr(). We have intr_restore()...


182487 30-Aug-2008 marcel

In db_show_mdpcpu(), print MD fields.


182486 30-Aug-2008 marcel

Whitespace fixes.


182485 30-Aug-2008 marcel

Call powerpc_sync() instead of using an asm statement.


182484 30-Aug-2008 marcel

Add powerpc_sync() as an inline function.


182483 30-Aug-2008 marcel

Don't clear PSL_RI. Disabling external interrupts
doesn't make exceptions unrecoverable.


182362 28-Aug-2008 raj

Move initialization of tlb0, ptbl_bufs and kernel_pdir regions after we are
100% sure that TLB1 mapping covers for them; previously we could lock the CPU
with an untranslated references.

Obtained from: Semihalf


182198 26-Aug-2008 raj

Improve kernel stack handling on e500.

- Allocate thread0.td_kstack in pmap_bootstrap(), provide guard page
- Switch to thread0.td_kstack as soon as possible i.e. right after return
from e500_init() and before mi_startup() happens
- Clean up temp stack area
- Other minor cosmetics in machdep.c

Obtained from: Semihalf


181905 20-Aug-2008 ed

Integrate the new MPSAFE TTY layer to the FreeBSD operating system.

The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.

If a PPP implementation would be built on top of the new TTY layer
(still needs a hooks layer, though), it would allow the PPP
implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

With the old TTY layer, it isn't entirely safe to destroy TTY's from
the system. This implementation has a two-step destructing design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc).

The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is both
used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan


181875 19-Aug-2008 jhb

Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports
already define _KERNEL to get to this and I'm about to add hooks to
libkvm to access per-CPU data.

MFC after: 1 week


181233 03-Aug-2008 ed

Disconnect drivers that haven't been ported to MPSAFE TTY yet.

As clearly mentioned on the mailing lists, there is a list of drivers
that have not been ported to the MPSAFE TTY layer yet. Remove them from
the kernel configuration files. This means people can now still use
these drivers if they explicitly put them in their kernel configuration
file, which is good.

People should keep in mind that after August 10, these drivers will not
work anymore. Even though owners of the hardware are capable of getting
these drivers working again, I will see if I can at least get them to a
compilable state (if time permits).


180359 07-Jul-2008 delphij

Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out
of the box.


180209 03-Jul-2008 peter

Exclude .cvsignore files from $FreeBSD$ checking


179991 25-Jun-2008 ed

Remove the unused M_MEMDEV from the kernel.

The M_MEMDEV memory allocation pool does not seem to be used. We can
live without it.

Approved by: philip (mentor)


179990 25-Jun-2008 ed

Remove the unused major/minor numbers from iodev and memdev.

Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.

Approved by: philip (mentor)


179746 12-Jun-2008 kevlo

Return an error code rather than ENXIO when both rman_init() and
rman_manage_region() failed.

Reviewed by: marcel


179729 11-Jun-2008 wkoszek

Fix a typo in a comment.


179646 08-Jun-2008 marcel

Move bm(4) from the sys/conf/NOTES to sys/powerpc/conf/NOTES.
The driver applies to PowerPC only.


179645 07-Jun-2008 marcel

Add support for the Apple Big Mac (BMAC) Ethernet controller,
found on various Apple G3 models.

Submitted by: Nathan Whitehorn


179644 07-Jun-2008 marcel

Add support for Apple's Descriptor-Based DMA (DBDMA) engine. The DMA
engine is usful to various existing drivers, such as ata(4) and scc(4),
and is used bhy the soon to be added bm(4).

Submitted by: Nathan Whitehorn


179533 04-Jun-2008 grehan

Add link register to fatal trap printout to better diagnose NULL
function pointer derefs.


179254 23-May-2008 marcel

Invalidate the TLB in pmap_cpu_bootstrap(), so that it also happens
on the APs.


179229 23-May-2008 alc

The VM system no longer uses setPQL2(). Remove it and its helpers.


179164 21-May-2008 obrien

Use the "options " spelling (vs. "options<TAB>") so that commented lines
line up nicely.


179081 18-May-2008 alc

Retire pmap_addr_hint(). It is no longer used.


179049 16-May-2008 attilio

Removed unused assembly offsets for structures digging.


178893 09-May-2008 alc

Add a stub for pmap_align_superpage() on machines that don't (yet)
implement pmap-level support for superpages.


178629 28-Apr-2008 marcel

The first argment of mtdbatu or mtibatu is part of the encoding.
It needs to be constant, so eliminate the loop and "hand-unroll".


178628 27-Apr-2008 marcel

MFp4: SMP support


178626 27-Apr-2008 marcel

Eliminate track_modified_needed(), better known as pmap_track_modified()
on other platforms. We no longer need it because we do not create managed
mappings within the clean submap.

Pointed out by: alc


178622 27-Apr-2008 marcel

MFp4: SMP support


178621 27-Apr-2008 marcel

Make sure tmpstk is aligned and make it 8KB in size -- not 8KB+16.


178618 27-Apr-2008 marcel

Remove mfsvr():
o The function is defined unconditionally but depends on SPR_SVR,
which is defined conditionally.
o spr.h defines mfspr() and mtspr(), which is no worse to use.


178599 26-Apr-2008 marcel

Take into account the size of the interrupt cell. It's determined
by the parent for interrupt resources. This corrects parsing of
the interrupts property.

With parsing of the property fixed, add all interrupts to the
resource list. Bump the max. number of interrupts from 5 to 6
as scc(4) attached to macio(4) has 6 interrupts (3 per channel).

Submitted by: Nathan Whitehorn <nathanw@uchicago.edu>


178597 26-Apr-2008 raj

Use RSTCR for resetting the MPC8572 (the old way does not apply).

Obtained from: Freescale, Semihalf


178596 26-Apr-2008 raj

Introduce a dedicated file for MPC85xx-specific routines. Move cpu_reset()
there, as it's not relevant to Book-E specification, but is an implementation
detail, directly dependent on the given SoC version.


178595 26-Apr-2008 raj

Improve handling of Local Access Windows on MPC85xx systems:

- detect number of LAWs in run time and initalize accordingly
- introduce decode windows target IDs used in MPC8572
- other minor updates

Obtained from: Freescale, Semihalf


178593 26-Apr-2008 raj

Move System Revision defines to a bit better place, add MPC8572 systems IDs.


178591 26-Apr-2008 raj

Enable NFSLOCKD for MPC85XX kernel to comply with recent NFS rework.


178471 25-Apr-2008 jeff

- Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
- Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
suspended in cpu specific states. This function can fail and cause the
scheduler to fall back to another mechanism (ipi).
- Implement support for mwait in cpu_idle() on i386/amd64 machines that
support it. mwait is a higher performance way to synchronize cpus
as compared to hlt & ipis.
- Allow selecting the idle routine by name via sysctl machdep.idle. This
replaces machdep.cpu_idle_hlt. Only idle routines supported by the
current machine are permitted.

Sponsored by: Nokia


178429 22-Apr-2008 phk

Now that all platforms use genclock, shuffle things around slightly
for better structure.

Much of this is related to <sys/clock.h>, which should really have
been called <sys/calendar.h>, but unless and until we need the name,
the repocopy can wait.

In general the kernel does not know about minutes, hours, days,
timezones, daylight savings time, leap-years and such. All that
is theoretically a matter for userland only.

Parts of kernel code does however care: badly designed filesystems
store timestamps in local time and RTC chips almost universally
track time in a YY-MM-DD HH:MM:SS format, and sometimes in local
timezone instead of UTC. For this we have <sys/clock.h>

<sys/time.h> on the other hand, deals with time_t, timeval, timespec
and so on. These know only seconds and fractions thereof.

Move inittodr() and resettodr() prototypes to <sys/time.h>.
Retain the names as it is one of the few surviving PDP/VAX references.

Move startrtclock() to <machine/clock.h> on relevant platforms, it
is a MD call between machdep.c/clock.c. Remove references to it
elsewhere.

Remove a lot of unnecessary <sys/clock.h> includes.

Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs.
XXX: should be kern.disable_rtc_set really, it's not MD.


178372 21-Apr-2008 phk

Make genclock standard on all platforms.

Thanks to: grehan & marcel for platform support on ia64 and ppc.


178367 21-Apr-2008 marcel

Switch to using genclock. Have nexus double as clock device for
now. While here, add a proper attach() method to nexus.

Requested by: phk


178265 17-Apr-2008 marcel

Simplify the pmap_zero_page family of functions by making use of
the fact that we have a 1:1 mapping by virtue of the BATs.
Eliminate the now unused moea_rkva_alloc(), moea_pa_map() and
moea_pa_unmap() functions.

Pointed out by: grehan.


178261 16-Apr-2008 marcel

Allocate a stack (with optional guard pages) for thread0 and
switch to it before calling mi_startup().


178182 13-Apr-2008 phk

Get rid of an empty RTC implementation and hook up genclock instead.


178092 11-Apr-2008 jeff

- Add the interrupt vector number to intr_event_create so MI code can
lookup hard interrupt events by number. Ignore the irq# for soft intrs.
- Add support to cpuset for binding hardware interrupts. This has the
side effect of binding any ithread associated with the hard interrupt.
As per restrictions imposed by MD code we can only bind interrupts to
a single cpu presently. Interrupts can be 'unbound' by binding them
to all cpus.

Reviewed by: jhb
Sponsored by: Nokia


178057 10-Apr-2008 marcel

Fix copy-n-paste typos in free text.


178030 09-Apr-2008 grehan

Include <sys/types.h> before <sys/systm.h> to get typedefs required
by new atomic.h. Fixes tinderbox LINT build.


178027 09-Apr-2008 marcel

Reimplement atomic_add, atomic_clear, atomic_set and atomic_subtract
so that all implemented variants have proper prototypes. The 8-bit,
16-bit and 64-bit variants are not implemented.

This really fixes the current build breakages caused by type casting
and struct aliasing rules.


178010 08-Apr-2008 marcel

Quick fix for the kernel build breakage in netgraph and the
aliasing warning in libthr. A more elaborate fix is in the
works that makes sure that all variants have proper inline
functions with proper types.


177940 05-Apr-2008 jhb

Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
Also, add a comment describe what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
Also, don't bother unmasking the interrupt in the post_filter case if
we never masked it. The INTR_FILTER case had been doing this by having
arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
They were already masked in the INTR_FILTER case.
- On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for
both the 'post_filter' and 'post_ithread' hooks to match what the
non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
designed to do even in 5.x).

Glanced at by: piso
Reviewed by: marius
Requested by: marius [1], [5]
Tested on: amd64, i386, arm, sparc64


177885 03-Apr-2008 marcel

Align functions to 16-byte boundaries due to profiling granularity.


177884 03-Apr-2008 marcel

Set sc_psim so that the openpic core can correct the off-by-one
error in the number of IRQs that PSIM gives us.


177662 27-Mar-2008 dfr

Add kernel module support for nfslockd and krpc. Use the module system
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.


177661 27-Mar-2008 jb

When building a kernel module, define MAXCPU the same as SMP so
that modules work with and without SMP.


177642 26-Mar-2008 phk

The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
timer_spkr_acquire()
timer_spkr_release()
and
timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all. In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that. It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].


177325 17-Mar-2008 jhb

Simplify the interrupt code a bit:
- Always include the ie_disable and ie_eoi methods in 'struct intr_event'
and collapse down to one intr_event_create() routine. The disable and
eoi hooks simply aren't used currently in the !INTR_FILTER case.
- Expand 'disab' to 'disable' in a few places.
- Use function casts for arm and i386:intr_eoi_src() instead of wrapper
routines since to trim one extra indirection.

Compiled on: {arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER}
Tested on: {amd64,i386} x {FILTER, !FILTER}


177288 17-Mar-2008 marcel

Make remote GDB work for AIM processors. For BookE, the kernel
will have a special section, named .PPC.EMB.apuinfo, which will
tell GDB that a BookE processor is targeted and which will
result in GDB using a different register definition. In order
to support remote GDB for BookE, we need the GDB stub in the
kernel look for that section and use the BookE definitions.


177276 16-Mar-2008 pjd

Implement atomic_fetchadd_long() for all architectures and document it.

Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)


177253 16-Mar-2008 rwatson

In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation. This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after: 1 month
Discussed with: imp, rink


177181 14-Mar-2008 jhb

Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
code wishes to bind an interrupt source to an individual CPU. The MD
code may reject the binding with an error. If an assign_cpu function
is not provided, then the kernel assumes the platform does not support
binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
event is bound to a CPU. Only shared ithreads are bound. We currently
leave private ithreads for drivers using filters + ithreads in the
INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
an interrupt source and binds its interrupt event to the specified CPU.
MI code can currently (ab)use this by doing:

intr_bind(rman_get_start(irq_res), cpu);

however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
where the implementation in the x86 nexus(4) driver would end up calling
intr_bind() internally.

Requested by: kmacy, gallatin, jeff
Tested on: {amd64, i386} x {regular, INTR_FILTER}


177110 12-Mar-2008 raj

Obtain TSEC h/w address from the parent bus (OCP) and not rely blindly on what
might be currently programmed into the registers.

Underlying firmware (U-Boot) would typically program MAC address into the
first unit only, and others are left uninitialized. It is now possible to
retrieve and program MAC address for all units properly, provided they were
passed on in the bootinfo metadata.

Reviewed by: imp, marcel
Approved by: cognet (mentor)


177091 12-Mar-2008 jeff

Remove kernel support for M:N threading.

While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential. Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.


177068 11-Mar-2008 marcel

In intr_lookup(), when adding an IRQ to powerpc_intrs[], also
set a default name. If the IRQ is added as a consequence of
configurating the IRQ without there ever being a handler
assigned to it, we will not have a name. This breaks the
fragile intrcnt/intrnames logic.


176964 09-Mar-2008 marcel

Don't use in32() and out32() when writing to the CCSRBAR. The
in*() and out*() primitives should not be used, other than by
ISA drivers. In this case they were used for memory-mapped I/O
and were not even used in the spirit of the primitives.


176928 08-Mar-2008 marcel

Enable the D-cache and I-cache when not already enabled.
It so happens that U-Boot disables the D-cache when booting
an ELF image, so this change makes sure we run with the
D-cache enabled from now on. It shows too...

While here, remove the duplicate definition of the hw.model
sysctl.


176919 07-Mar-2008 marcel

For AIM, have cpu_idle() set MSR_POW when the powerpc_pow_enabled
variable is set. On my Mac Mini this puts the CPU in NAP mode when
the kernel is idle and, any technical or environmental reasons
aside, avoids that I have to listen to the fan all day :-)


176918 07-Mar-2008 marcel

Add support for the BUS_CONFIG_INTR() method to the platform and to
openpic(4). Make use of it in ocpbus(4). On the MPC85xxCDS, IRQ0:4
are active-low.


176877 06-Mar-2008 marcel

Add a catch-all for PCPU_MD_FIELDS. While we expect this to be
used in the kernel only (by virtue of checking for _KERNEL),
ports like lsof (part of gtop) cheat. It sets _KERNEL, but does
not set either AIM or E500. As such, PCPU_MD_FIELDS didn't get
defined and the build broke.
The catch-all is to define PCPU_MD_FIELDS with a dummy integer
when at the end of line we ended up without a definition for it.


176836 05-Mar-2008 marcel

o We don't have to keep track of the PIC, nor do we have to make sure
it's probed first. The PowerPC platform code deals with everything.
As such, probe devices in order of their location in the memory map.
o Refactor the ocpbus_alloc_resource for readability and make sure we
set the RID in the resource as per the new convention.


176832 05-Mar-2008 marcel

o Various fixes related to PCI Express:
- Even for the PCI Express host controller we need to use bus 0
for configuration space accesses to devices directly on the
host controller's bus.
- Pass the maximum number of slots to pci_ocp_init() because the
caller knows how many slots the bus has. Previously a PCI or
PCI-X bus underneath a PCI Express host controller would not
be enumerated properly.
o Pull the interrupt routing logic out of pci_ocp_init() and into
its own function. The logic is not quite right and is expected
to be a bit more complex.
o Fix/add support for PCI domains. The PCI domain is the unit
number as per other PCI host controller drivers. As such, we
can use logical bus numbers again and don't have to guarantee
globally unique bus numbers. Remove pci_ocp_busnr. Return the
highest bus number ito the caller of pci_ocp_init() now that
we don't have a global variable anymore.
o BAR programming fixes:
- Non-type0 headers have at most 1 BAR, not 0.
- First write ~0 to the BAR in question and then read back its
size.

Obtained from: Juniper Networks (mostly)


176782 04-Mar-2008 marcel

Also comment-out options MPC85XX. We don't define CCSRBAR_* without E500.


176780 04-Mar-2008 marcel

Comment-out cpu E500. We can't yet build it with AIM at the same time.


176779 04-Mar-2008 marcel

Add the pic_ipi method. While here, eliminate the unused openpic_ocpbus_softc
struct.


176777 03-Mar-2008 raj

Import the omitted gdb_machdep.c for PowerPC kernel.

Approved by: cognet (mentor)
MFp4: e500


176776 03-Mar-2008 raj

Connect MPC85XX to the PowerPC build.

The kernel config file is KERNCONF=MPC85XX, so the usual procedure applies:

1. make buildworld TARGET_ARCH=powerpc
2. make buildkernel TARGET_ARCH=powerpc TARGET_CPUTYPE=e500 KERNCONF=MPC85XX

This default config uses kernel-level FPU emulation. For the soft-float world
approach:

1. make buildworld TARGET_ARCH=powerpc TARGET_CPUTYPE=e500
2. disable FPU_EMU option in sys/powerpc/conf/MPC85XX
3. make buildkernel TARGET_ARCH=powerpc TARGET_CPUTYPE=e500 KERNCONF=MPC85XX

Approved by: cognet (mentor)
MFp4: e500


176771 03-Mar-2008 raj

Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family.

The PQ3 is a high performance integrated communications processing system
based on the e500 core, which is an embedded RISC processor that implements
the 32-bit Book E definition of the PowerPC architecture. For details refer
to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E

This port was tested and successfully run on the following members of the PQ3
family: MPC8533, MPC8541, MPC8548, MPC8555.

The following major integrated peripherals are supported:

* On-chip peripherals bus
* OpenPIC interrupt controller
* UART
* Ethernet (TSEC)
* Host/PCI bridge
* QUICC engine (SCC functionality)

This commit brings the main functionality and will be followed by individual
drivers that are logically separate from this base.

Approved by: cognet (mentor)
Obtained from: Juniper, Semihalf
MFp4: e500


176770 03-Mar-2008 raj

Rework and extend PowerPC headers definitons towards Book-E/e500 CPUs support.

Approved by: cognet (mentor)
Obtained from: Juniper, Semihalf
MFp4: e500


176742 02-Mar-2008 raj

Unify and generalize PowerPC headers, adjust AIM code accordingly.

Rework of this area is a pre-requirement for importing e500 support (and
other PowerPC core variations in the future). Mainly the following
headers are refactored so that we can cover for low-level differences between
various machines within PowerPC architecture:

<machine/pcpu.h>
<machine/pcb.h>
<machine/kdb.h>
<machine/hid.h>
<machine/frame.h>

Areas which use the above are adjusted and cleaned up.

Credits for this rework go to marcel@

Approved by: cognet (mentor)
MFp4: e500


176734 02-Mar-2008 jeff

- Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
properties.
- Provide several convenience functions for creating one and two level
cpu trees as well as a default flat topology. The system now always
has some topology.
- On i386 and amd64 create a seperate level in the hierarchy for HTT
and multi-core cpus. This will allow the scheduler to intelligently
load balance non-uniform cores. Presently we don't detect what level
of the cache hierarchy is shared at each level in the topology.
- Add a mechanism for testing common topologies that have more information
than the MD code is able to provide via the kern.smp.topology tunable.
This should be considered a debugging tool only and not a stable api.

Sponsored by: Nokia


176617 27-Feb-2008 marcel

Avoid hardcoding the kernel link address in the linker script.
Use KERNBASE instead. While here, move the text sections
forward to the beginning of the text segment.


176534 25-Feb-2008 raj

Teach PowerPC CPU identification routines to recognize e500 cores. Fix style
issues in this area.

Approved by: cognet (mentor)
MFp4: e500


176530 24-Feb-2008 raj

Let PowerPC world optionally build with -msoft-float. For FPU-less PowerPC
variations (e500 currently), this provides a gcc-level FPU emulation and is an
alternative approach to the recently introduced kernel-level emulation
(FPU_EMU).

Approved by: cognet (mentor)
MFp4: e500


176524 24-Feb-2008 marcel

Don't define DEBUG. No debugging required.

Pointy hat: marcel


176501 24-Feb-2008 marcel

Resolve warnings exposed by LINT.
o Put prototypes in a single header only.
o Fix printf format specifiers.


176495 23-Feb-2008 marcel

Add FPU_EMU.


176491 23-Feb-2008 marcel

Add a floating-point emulator so that a single userland or single ABI
can run on processors that don't have a FPU. This is typically the
case for Book E processors. While a tuned system will probably want
to use soft-float (or use a processor that has a FPU if the usage is
FP intensive enough), allowing hard-float on FPU-less systems gives
great portability and flexibility.

Obtained from: NetBSD


176483 23-Feb-2008 marcel

Define the bootinfo structure for FreeBSD. It is not used on
AIM, but it's used for BookE.


176345 16-Feb-2008 marcel

Enable option WITNESS_SKIPSPIN by default.


176222 12-Feb-2008 marcel

Remove SMP left-overs from NetBSD.


176214 12-Feb-2008 marcel

There's no need to suppress option GDB.


176208 12-Feb-2008 marcel

Add PIC support for IPIs. When registering an interrupt handler,
the PIC also informs the platform at which IRQ level it can start
assigning IPIs, since this can depend on the number of IRQs
supported for external interrupts.


175668 26-Jan-2008 julian

One of my powerbooks has this chip in it..
Confirmed by looking at netbsd.. they have also added this.
checked by grehen
MFC After: 3 days


175147 07-Jan-2008 jhb

Add COMPAT_FREEBSD7 and enable it in configs that have COMPAT_FREEBSD6.


175067 03-Jan-2008 alc

Add an access type parameter to pmap_enter(). It will be used to implement
superpage promotion.

Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is
a Boolean.


174938 27-Dec-2007 alc

Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.


174898 25-Dec-2007 rwatson

Add a new 'why' argument to kdb_enter(), and a set of constants to use
for that argument. This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.

Assign approximate why values to all current consumers of the
kdb_enter() interface.


174822 21-Dec-2007 marcel

Apply missing s/rv/res/g in previous commit.


174820 20-Dec-2007 jhb

MFamd64/ia64/i386: Only set the rman bus tags and handles in
bus_activate_resource() methods instead of splitting it up between
bus_alloc_resource() and bus_activate_resource().

Glanced at by: marcel


174782 19-Dec-2007 marcel

Redefine bus_space_tag_t on PowerPC from a 32-bit integral to
a pointer to struct bus_space. The structure contains function
pointers that do the actual bus space access.

The reason for this change is that previously all bus space
accesses were little endian (i.e. had an explicit byte-swap
for multi-byte accesses), because all busses on Macs are little
endian.
The upcoming support for Book E, and in particular the E500
core, requires support for big-endian busses because all
embedded peripherals are in the native byte-order.

With this change, there's no distinction between I/O port
space and memory mapped I/O. PowerPC doesn't have I/O port
space. Busses assign tags based on the byte-order only.
For that purpose, two global structures exist (bs_be_tag and
bs_le_tag), of which the address can be taken to get a valid
tag.

Obtained from: Juniper, Semihalf


174632 16-Dec-2007 marcel

Rename OEA to AIM. The former means nothing as it applies to all
processors (it's the PowerPC Operating Environment Architecture).
AIM designates the processors made by the Apple-IBM-Motorola
alliance and those we typically support.

While here, remove the NetBSD option IPKDB. It's not an option
used by us. Also, PPC_HAVE_FPU is not used by us either. Remove
that too.

Obtained from: Juniper, Semihalf


174601 14-Dec-2007 marcel

This file was repocopied to src/sys/powerpc/aim, where it will
live on -- an afterlife.


174599 14-Dec-2007 marcel

Forced commit to record that this file was repocopied from
src/sys/powerpc/powerpc and modified for its new location.


174593 14-Dec-2007 marcel

Remove unused file.


174405 07-Dec-2007 jkoshy

Add stubs to unbreak LINT.


174195 02-Dec-2007 rwatson

Break out stack(9) from ddb(4):

- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
defined, or also if "options DDB" is defined to provide compatibility
with existing users of stack(9).

Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to. It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.

Update stack(9) man page.

Build tested: amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested: amd64 (rwatson), arm (cognet), i386 (rwatson)


173970 27-Nov-2007 jasone

Define atomic_readandclear_ptr.


173928 26-Nov-2007 jb

Implement the _long functions using u_long rather than trying to
cast as uint32_t which is defined as unsigned int. gcc doesn't want to
consider that there might not be much difference between an int and
a long on a 32 bit architecture.


173799 21-Nov-2007 scottl

Extend critical section coverage in the low-level interrupt handlers to
include the ithread scheduling step. Without this, a preemption might
occur in between the interrupt getting masked and the ithread getting
scheduled. Since the interrupt handler runs in the context of curthread,
the scheudler might see it as having a such a low priority on a busy system
that it doesn't get to run for a _long_ time, leaving the interrupt stranded
in a disabled state. The only way that the preemption can happen is by
a fast/filter handler triggering a schduling event earlier in the handler,
so this problem can only happen for cases where an interrupt is being
shared by both a fast/filter handler and an ithread handler. Unfortunately,
it seems to be common for this sharing to happen with network and USB
devices, for example. This fixes many of the mysterious TCP session
timeouts and NIC watchdogs that were being reported. Many thanks to Sam
Lefler for getting to the bottom of this problem.

Reviewed by: jhb, jeff, silby


173742 19-Nov-2007 jb

Define atomic_cmpset_acq_long and atomic_cmpset_rel_long so that
they use casts rather than just assuming that the compiler will DTRT
without complaining.


173708 17-Nov-2007 alc

Prevent the leakage of wired pages in the following circumstances:
First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated.
Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the
pages beyond the EOF are unmapped and freed. However, when the file is
mlock(2)ed, the pages beyond the EOF are unmapped but not freed because
they have a non-zero wire count. This can be a mistake. Specifically,
it is a mistake if the sole reason why the pages are wired is because of
wired, managed mappings. Previously, unmapping the pages destroys these
wired, managed mappings, but does not reduce the pages' wire count.
Consequently, when the file is unmapped, the pages are not unwired
because the wired mapping has been destroyed. Moreover, when the vm
object is finally destroyed, the pages are leaked because they are still
wired. The fix is to reduce the pages' wired count by the number of
wired, managed mappings destroyed. To do this, I introduce a new pmap
function pmap_page_wired_mappings() that returns the number of managed
mappings to the given physical page that are wired, and I use this
function in vm_object_page_remove().

Reviewed by: tegge
MFC after: 6 weeks


173615 14-Nov-2007 marcel

o Rename cpu_thread_setup() to cpu_thread_alloc() to better
communicate that it relates to (is called by) thread_alloc()
o Add cpu_thread_free() which is called from thread_free()
to counter-act cpu_thread_alloc().

i386: Have cpu_thread_free() call cpu_thread_clean() to
preserve behaviour.
ia64: Have cpu_thread_free() call mtx_destroy() for the
mutex initialized in cpu_thread_alloc().

PR: ia64/118024


173601 14-Nov-2007 julian

A bunch more files that should probably print out a thread name
instead of a process name.


173600 14-Nov-2007 julian

generally we are interested in what thread did something as
opposed to what process. Since threads by default have teh name of the
process unless over-written with more useful information, just print the
thread name instead.


173584 13-Nov-2007 grehan

Split decr_init() into two, with the section that reads the timebase
frequency from OpenFirmware moved out and into a routine that is called
from cpu_startup().

This allows correct reporting of the CPU clockspeed when printing out
CPU information at boot time.

Reported by: numerous
Reviewed by: marcel
MFC after: 1 day


173361 05-Nov-2007 kib

Fix for the panic("vm_thread_new: kstack allocation failed") and
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return error instead of panicing or dereferencing NULL.

As consequence, vmspace_exec() and vmspace_unshare() returns the errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.

The kernel stack for the thread is now set up in the thread_alloc(),
that itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).

In collaboration with: Peter Holm
Reviewed by: jhb


172887 23-Oct-2007 grehan

Cut over to ULE on PowerPC

kern/sched_ule.c - Add __powerpc__ to the list of supported architectures

powerpc/conf/GENERIC - Swap SCHED_4BSD with SCHED_ULE

powerpc/powerpc/genassym.c - Export TD_LOCK field of thread struct

powerpc/powerpc/swtch.S - Handle new 3rd parameter to cpu_switch() by
updating the old thread's lock. Note: uniprocessor-only, will require
modification for MP support.

powerpc/powerpc/vm_machdep.c - Set 3rd param of cpu_switch to mutex of
old thread's lock, making the call a no-op.

Reviewed by: marcel, jeffr (slightly older version)


172394 30-Sep-2007 marius

Make the PCI code aware of PCI domains (aka PCI segments) so we can
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.

Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)


172334 26-Sep-2007 marius

o Revert the part of if_gem.c rev. 1.35 which added a call to gem_stop()
to gem_attach() as the former access softc members not yet initialized
at that time and gem_reset() actually is enough to stop the chip. [1]
o Revise the use of gem_bitwait(); add bus_barrier() calls before calling
gem_bitwait() to ensure the respective bit has been written before we
starting polling on it and poll for the right bits to change, f.e. even
though we only reset RX we have to actually wait for both GEM_RESET_RX
and GEM_RESET_TX to clear. Add some additional gem_bitwait() calls in
places we've been missing them according to the GEM documentation.
Along with this some excessive DELAYs, which probably only were added
because of bugs in gem_bitwait() and its use in the first place, as
well as as have of an gem_bitwait() reimplementation in gem_reset_tx()
were removed.
o Add gem_reset_rxdma() and use it to deal with GEM_MAC_RX_OVERFLOW errors
more gracefully as unlike gem_init_locked() it resets the RX DMA engine
only, causing no link loss and the FIFOs not to be cleared. Also use it
deal with GEM_INTR_RX_TAG_ERR errors, with previously were unhandled.
This was based on information obtained from the Linux GEM and OpenSolaris
ERI drivers.
o Turn on workarounds for silicon bugs in the Apple GMAC variants.
This was based on information obtained from the Darwin GMAC and Linux GEM
drivers.
o Turn on "infinite" (i.e. maximum 31 * 64 bytes in length) DMA bursts.
This greatly improves especially RX performance.
o Optimize the RX path, this consists of:
- kicking the receiver as soon as we've a spare descriptor in gem_rint()
again instead of just once after all the ready ones have been handled;
- kicking the receiver the right way, i.e. as outlined in the GEM
documentation in batches of 4 and by pointing it to the descriptor
after the last valid one;
- calling gem_rint() before gem_tint() in gem_intr() as gem_tint() may
take quite a while;
- doubling the size of the RX ring to 256 descriptors.
Overall the RX performance of a GEM in a 1GHz Sun Fire V210 was improved
from ~100Mbit/s to ~850Mbit/s.
o In gem_add_rxbuf() don't assign the newly allocated mbuf to rxs_mbuf
before calling bus_dmamap_load_mbuf_sg(), if bus_dmamap_load_mbuf_sg()
fails we'll free the newly allocated mbuf, unable to recycle the
previous one but a NULL pointer dereference instead.
o In gem_init_locked() honor the return value of gem_meminit().
o Simplify gem_ringsize() and dont' return garbage in the default case.
Based on OpenBSD.
o Don't turn on MAC control, MIF and PCS interrupts unless GEM_DEBUG is
defined as we don't need/use these interrupts for operation.
o In gem_start_locked() sync the DMA maps of the descriptor rings before
every kick of the transmitter and not just once after enqueuing all
packets as the NIC might instantly start transmitting after we kicked
it the first time.
o Keep state of the link state and use it to enable or disable the MAC
in gem_mii_statchg() accordingly as well as to return early from
gem_start_locked() in case the link is down. [3]
o Initialize the maximum frame size to a sane value.
o In gem_mii_statchg() enable carrier extension if appropriate.
o Increment if_ierrors in case of an GEM_MAC_RX_OVERFLOW error and in
gem_eint(). [3]
o Handle IFF_ALLMULTI correctly; don't set it if we've turned promiscuous
group mode on and don't clear the flag if we've disabled promiscuous
group mode (these were mostly NOPs though). [2]
o Let gem_eint() also report GEM_INTR_PERR errors.
o Move setting sc_variant from gem_pci_probe() to gem_pci_attach() as
device probe methods are not supposed to touch the softc.
o Collapse sc_inited and sc_pci into bits for sc_flags.
o Add CTASSERTs ensuring that GEM_NRXDESC and GEM_NTXDESC are set to
legal values.
o Correctly set up for 802.3x flow control, though #ifdef out the code
that actually enables it as this needs more testing and mainly a proper
framework to support it.
o Correct and add some conversions from hard-coded functions names to
__func__ which were borked or forgotten in if_gem.c rev. 1.42.
o Use PCIR_BAR instead of a homegrown macro.
o Replace sc_enaddr[6] with sc_enaddr[ETHER_ADDR_LEN].
o In gem_pci_attach() in case attaching fails release the resources in
the opposite order they were allocated.
o Make gem_reset() static to if_gem.c as it's not needed outside that
module.
o Remove the GEM_GIGABIT flag and the associated code; GEM_GIGABIT was
never set and the associated code was in the wrong place.
o Remove sc_mif_config; it was only used to cache the contents of the
respective register within gem_attach().
o Remove the #ifdef'ed out NetBSD/OpenBSD code for establishing a suspend
hook as it will never be used on FreeBSD.
o Also probe Apple Intrepid 2 GMAC and Apple Shasta GMAC, add support for
Apple K2 GMAC. Based on OpenBSD.
o Add support for Sun GBE/P cards, or in other words actually add support
for cards based on GEM to gem(4). This mainly consists of adding support
for the TBI of these chips. Along with this the PHY selection code was
rewritten to hardcode the PHY number for certain configurations as for
example the PHY of the on-board ERI of Blade 1000 shows up twice causing
no link as the second incarnation is isolated.
These changes were ported from OpenBSD with some additional improvements
and modulo some bugs.
o Add code to if_gem_pci.c allowing to read the MAC-address from the VPD on
systems without Open Firmware.
This is an improved version of my variant of the respective code in
if_hme_pci.c
o Now that gem(4) is MI enable it for all archs.

Pointed out by: yongari [1]
Suggested by: rwatson [2], yongari [3]
Tested on: i386 (GEM), powerpc (GMACs by marcel and yongari),
sparc64 (ERI and GEM)
Reviewed by: yongari
Approved by: re (kensmith)


172332 26-Sep-2007 brueffer

Use the correct expanded name for SCTP.

PR: 116496
Submitted by: koitsu
Reviewed by: rrs
Approved by: re (kensmith)


172317 25-Sep-2007 alc

Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq. Instead, they are kept in a separate per-object
splay tree of cached pages. However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock. Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held. Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case. Cached pages
are reclaimed far, far more often than they are reactivated. Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated. Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page. Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)


172189 15-Sep-2007 alc

It has been observed on the mailing lists that the different categories
of pages don't sum to anywhere near the total number of pages on amd64.
This is for the most part because uma_small_alloc() pages have never been
counted as wired pages, like their kmem_malloc() brethren. They should
be. This changes fixes that.

It is no longer necessary for the page queues lock to be held to free
pages allocated by uma_small_alloc(). I removed the acquisition and
release of the page queues lock from uma_small_free() on amd64 and ia64
weeks ago. This patch updates the other architectures that have
uma_small_alloc() and uma_small_free().

Approved by: re (kensmith)


171805 11-Aug-2007 marcel

Revamp the interrupt handling in support of INTR_FILTER. This includes:
o Revamp the PIC I/F to only abstract the PIC hardware. The
resource handling has been moved to nexus, where it belongs.
o Include EOI and MASK+EOI methods to the PIC I/F in support of
INTR_FILTER.
o With the allocation of interrupt resources and setup of
interrupt handlers in the common platform code we can delay
talking to the PIC hardware after enumeration of all devices.
Introduce a call to powerpc_intr_enable() in configure_final()
to achieve that and have powerpc_setup_intr() only program the
PIC when !cold.
o As a consequence of the above, remove all early_attach() glue
from the OpenPIC and Heathrow PIC drivers and have them
register themselves when they're found during enumeration.
o Decouple the interrupt vector from the interrupt request line.
Allocate vectors increasingly so that they can be used for
the intrcnt index as well. Extend the Heathrow PIC driver to
translate between IRQ and vector. The OpenPIC driver already
has the support for vectors in hardware.

Approved by: re (blanket)


171787 08-Aug-2007 marcel

Re-enable external interrupts for faults, traps and syscalls.

Approved by: re (blanket)


171785 07-Aug-2007 marcel

Eliminate <machine/interruptvar.h> as it has only a single
prototype. In the future that prototype will not be needed
at all anyway, but for now it's moved to intr_machdep.h.

Approved by: re (blanket)


171783 07-Aug-2007 marcel

Remove redundant prototype.

Approved by: re (blanket)


171782 07-Aug-2007 marcel

Add prototype for trap().

Approved by: re (blanket)


171670 31-Jul-2007 marcel

Fix backward compatibility of the "old" (i.e. FreeBSD6) lseek
syscall. It was broken when a new lseek syscall was introduced.
The problem is that we need to swap the 32-bit td_retval values
for the __syscall indirect syscall when the actual syscall has
a 32-bit return value. Hence, we need to exclude lseek(2). And
this means the "old" lseek(2) as well -- which we didn't.

Based on a patch from: grehan@
Approved by: re (rwatson)


171334 10-Jul-2007 marcel

Cast the arguments to atomic_*_ptr() when mapping it to atomic_*_32()
This is a minimal fix.

Approved by: re (kensmith)


170979 22-Jun-2007 yongari

Reimplement bus_dmamap_load with bus_dmamap_load_buffer.
Previously it didn't honor parent dma tag's restrictions such that
an invalid dma segment could be passed to device. The driver for the
device may panic in sanity check routine for the dma segment or may
produce unexpected results. I have no idea how it could ever have
worked before.

Reviewed by: grehan
Tested by: gad
Approved by: re (hrs)


170978 22-Jun-2007 yongari

Honor maxsegsz of less than a page size in a DMA tag. Previously it
used to return PAGE_SIZE without respect to restrictions of a DMA tag.
This affected all of the busdma load functions that use
_bus_dmamap_loader_buffer() as their back-end.

Reviewed by: scottl (long a ago)
Approved by: re (hrs)


170816 16-Jun-2007 alc

Enable the new physical memory allocator.

This allocator uses a binary buddy system with a twist. First and
foremost, this allocator is required to support the implementation of
superpages. As a side effect, it enables a more robust implementation
of contigmalloc(9). Moreover, this reimplementation of
contigmalloc(9) eliminates the acquisition of Giant by
contigmalloc(..., M_NOWAIT, ...).

The twist is that this allocator tries to reduce the number of TLB
misses incurred by accesses through a direct map to small, UMA-managed
objects and page table pages. Roughly speaking, the physical pages
that are allocated for such purposes are clustered together in the
physical address space. The performance benefits vary. In the most
extreme case, a uniprocessor kernel running on an Opteron, I measured
an 18% reduction in system time during a buildworld.

This allocator does not implement page coloring. The reason is that
superpages have much the same effect. The contiguous physical memory
allocation necessary for a superpage is inherently colored.

Finally, the one caveat is that this allocator does not effectively
support prezeroed pages. I hope this is temporary. On i386, this is
a slight pessimization. However, on amd64, the beneficial effects of
the direct-map optimization outweigh the ill effects. I speculate
that this is true in general of machines with a direct map.

Approved by: re


170731 14-Jun-2007 delphij

Enable SCTP by default for GENERIC kernels in order to give it
more exposure. The current state of SCTP implementation is
considered to be ready for 32-bit platforms, but still need some
work/testing on 64-bit platforms.

Approved by: re (kensmith)
Discussed with: rrs


170652 13-Jun-2007 marcel

Enable GEOM_PART_MBR by default. On ia64 this replaces GEOM_MBR.


170473 09-Jun-2007 marcel

Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.


170440 08-Jun-2007 rwatson

Enable AUDIT by default in the GENERIC kernel, allowing security event
auditing to be turned on without a kernel recompile, just an rc.conf
option.

Approved by: re (kensmith)
Obtained from: TrustedBSD Project


170421 08-Jun-2007 marcel

Sync with other platforms: add kluge to use contigmalloc when the
alignment is larger than the size and print a diagnostic when we
didn't satisfy the alignment.


170363 06-Jun-2007 grehan

Fix the compile. Band-aid until it is worked out how to use the context
switch api on ppc.


170305 04-Jun-2007 jeff

- Change comments and asserts to reflect the removal of the global
scheduler lock.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


170291 04-Jun-2007 attilio

Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)


170170 31-May-2007 attilio

Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)


170162 31-May-2007 piso

In some particular cases (like in pccard and pccbb), the real device
handler is wrapped in a couple of functions - a filter wrapper and an
ithread wrapper. In this case (and just in this case), the filter
wrapper could ask the system to schedule the ithread and mask the
interrupt source if the wrapped handler is composed of just an ithread
handler: modify the "old" interrupt code to make it support
this situation, while the "new" interrupt code is already ok.

Discussed with: jhb


170072 28-May-2007 alc

Eliminate some unused definitions that came from NetBSD.


170036 27-May-2007 marcel

Don't initialize the decrementer before initclocks() is called.
Use cpu_initclocks() for that as it assures that relevant locks
have been initialized.


170033 27-May-2007 alc

Eliminate an unused definition.


169846 22-May-2007 kan

Allow FreeBSD's native ELF image activators to execute shared libraries the
same way it was enabled for Linux binares in linuxulator.

This allows binaries built with -pie. Many ports auto-detect -fPIE support
in GCC 4.2 and build binaries FreeBSD was unable to run.


169667 18-May-2007 jeff

- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts. This can be used to abstract away pcpu details but also changes
to use atomics for all counters now. This means sched lock is no longer
responsible for protecting counts in the switch routines.

Contributed by: Attilio Rao <attilio@FreeBSD.org>


169291 05-May-2007 alc

Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory. The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used. Previously,
only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR: 112194


168885 20-Apr-2007 grehan

Add ofw bus methods to the ppc nexus driver. This will be used in future
EFIKA platform support.

PR: 111522
Submitted by: Andrew Turner, andrew at fubar geek nz


168603 10-Apr-2007 pjd

Remove trailing '.' for consistency!


168594 10-Apr-2007 pjd

Add UFS_GJOURNAL options to the GENERIC kernel.

Approved by: re (kensmith)


168205 01-Apr-2007 rwatson

If nooption SMP on powerpc, also nooption ADAPTIVE_SX, which depends on
SMP and is now in the global NOTES.


168201 01-Apr-2007 marcel

Add bge(4).
Fix a white-space nit while I'm here.


168200 01-Apr-2007 marcel

When writing to PCI configuration registers, don't immediately
read the same register back. It can cause hangs or machine
checks in certain cases. One particular case is with bge(4)
when a reset is initiated for the controller.

MFC after: 1 month


168193 01-Apr-2007 marcel

Remove unused file.


167429 11-Mar-2007 alc

Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb


167352 09-Mar-2007 mohans

Over NFS, an open() call could result in multiple over-the-wire
GETATTRs being generated - one from lookup()/namei() and the other
from nfs_open() (for cto consistency). This change eliminates the
GETATTR in nfs_open() if an otw GETATTR was done from the namei()
path. Instead of extending the vop interface, we timestamp each attr
load, and use this to detect whether a GETATTR was done from namei()
for this syscall. Introduces a thread-local variable that counts the
syscalls made by the thread and uses <pid, tid, thread syscalls> as
the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on
thread state that could be used as the timestamp with minimal overhead.


167289 07-Mar-2007 piso

Update openpic to support the new bus_setup_intr() syntax.

Reviewed by: marcel


167170 02-Mar-2007 piso

Make pswitch_intr() returns interrupt handling status.


166978 25-Feb-2007 piso

Catch up with bus_setup_intr() modification and garbage collect a
reference to INTR_FAST.


166901 23-Feb-2007 piso

o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@


166812 18-Feb-2007 marcel

The table of known CPU models ends with an entry that has a version
of 0, not with an entry that has an empty CPU name.

Submitted by: Andrew Turner (andrew@fubar.geek.nz)


166659 12-Feb-2007 kevlo

Remove the cast to caddr_t for sfp, they're not needed.

Reviewed by: marcel


166604 09-Feb-2007 brooks

Include GEOM_LABEL in GENERIC. It's very useful and not well publicized
enough.

Approved by: pjd


166551 07-Feb-2007 marcel

Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.


166249 26-Jan-2007 marcel

Remove stale header.

MFC after: 3 days


166011 14-Jan-2007 marcel

Propagate the CPU model to the hw.model sysctl.


165967 12-Jan-2007 imp

Remove 3rd clause, renumber, ok per email


165926 10-Jan-2007 marius

Add missing SC_NO_MODE_CHANGE option. Disable it in the powerpc
NOTES though, as ofw_syscons(4) doesn't properly interface with
syscons(4) regarding loading the font specified with SC_DFLT_FONT,
causing a kernel with both options SC_OFWFB and SC_NO_MODE_CHANGE
to not link.


165608 28-Dec-2006 marcel

In cpu_reset(), call OF_reboot() instead of OF_exit(). The latter
doesn't do a reboot and has been observed to reset the NVRAM to its
default values.


165362 20-Dec-2006 grehan

Remove bogus increment of re-hashed PTEG index. This snuck in with r1.12 of
pmap.c, and is potentially the cause of hangs reported on machines with a
small amount of memory. On machines with sufficient RAM, and without a lot
of processes running, this situation would probably never occur.

Testing is still incomplete, but it is obviously wrong so remove the
offending code now.

The issue of what to do when both the primary and secondary hash overflow
is still open.

Reported by: Dan Kresja at windriver dot com, via alc


165151 13-Dec-2006 marcel

Implement OF_decode_addr(). This makes uart(4) work as a serial
console on a Xserve G4.


165147 13-Dec-2006 marcel

Implement bus_space_map().


164987 07-Dec-2006 marcel

Add support for multiple FAST handlers.

Reviewed & tested by: grehan@


164936 06-Dec-2006 julian

Threading cleanup.. part 2 of several.

Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs. Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.


164895 05-Dec-2006 grehan

Fix gdb issue where the i-cache was not being updated when a breakpoint
was written into a user's address space. The fix is to modify uiomove_fromphys
to sync the icache when an executable user-space page is written into.

Alan Cox suggested that there should probably be a higher-level interface
to this in the ptrace code, but agreed that this is an OK short-term solution.

Files changed:

pmap.h - declaration of pmap_page_executable()
pmap_dispatch.c - pass through the page_executable call to the mmu object
mmu_oea.c - implement the page_executable method by examining the PTE_EXEC
field in the vm_page_t
uio_machdep.c - in uiomove_fromphys(), if the op was a UIO_WRITE to user-space,
and if the page is executable, sync the icache since this is at the least
a breakpoint-write from gdb.

Reported by: marcel
Tested by: marcel, grehan on g3+g4
Discussed with: alc
MFC after: 2 weeks


164765 30-Nov-2006 grehan

Don't use vm_page_flag_set() if installing bootstrap page-table entries
since the vm page mutex's aren't yet initialized. Fixes boot-time panic.

Reported by: Dario Freni saturnero at freesbie dot org


164760 30-Nov-2006 jb

Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.


164229 12-Nov-2006 alc

Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller. (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)


164198 11-Nov-2006 alc

Eliminate unused global variables.


163987 04-Nov-2006 jb

Remove the KDTRACE option again because of the complaints about having
it as a default.

For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.

By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.

Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.

I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.


163972 04-Nov-2006 jb

Build in kernel support for loading DTrace modules by default. This
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.

Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.

I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.


163858 01-Nov-2006 jb

Add a cnputs() function to write a string to the console with
a lock to prevent interspersed strings written from different CPUs
at the same time.

To avoid putting a buffer on the stack or having to malloc one,
space is incorporated in the per-cpu structure. The buffer
size if 128 bytes; chosen because it's the next power of 2 size
up from 80 characters.

String writes to the console are buffered up the end of the line
or until the buffer fills. Then the buffer is flushed to all
console devices.

Existing low level console output via cnputc() is unaffected by
this change. ithread calls to log() are also unaffected to avoid
blocking those threads.

A minor change to the behaviour in a panic situation is that
console output will still be buffered, but won't be written to
a tty as before. This should prevent interspersed panic output
as a number of CPUs panic before we end up single threaded
running ddb.

Reviewed by: scottl, jhb
MFC after: 2 weeks


163711 26-Oct-2006 jb

Remove the KSE option now that it's in DEFAULTS on these arches/machines.

The 'nooption' kernel config entry has to be used to turn KSE off now.
This isn't my preferred way of dealing with this, but I'll defer to
scottl's experience with the io/mem kernel option change and the grief
experienced over that.

Submitted by: scottl@


163710 26-Oct-2006 jb

Add 'options KSE' to the kernel config DEFAULTS on all arches/machines
except sun4v.

This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.

Submitted by: scottl@


163709 26-Oct-2006 jb

Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by: davidxu@


163630 23-Oct-2006 ru

Move "device splash" back to MI NOTES and "files", it's MI.


163626 23-Oct-2006 ru

Mechanically kill redundant nodevice/nooption/nomakeoption, i.e.,
those that do not exist in MI NOTES or switched on/off in the MD
NOTES.


163488 18-Oct-2006 grehan

Fix remaining compile error.


163472 18-Oct-2006 davidxu

Attempt to fix compiling problem.

Noticed by: tinderbox


163449 17-Oct-2006 davidxu

o Add keyword volatile for user mutex owner field.
o Fix type consistent problem by using type long for old
umtx and wait channel.
o Rename casuptr to casuword.


163192 10-Oct-2006 bde

The powerpc and sparc64 MD `reboot' commands should never have existed
since they just duplicated the MI `reset' command. Instead of removing
them, make `reboot' an MI alias for `reboot' since this gives a better
way of killing the `r' alias for `reset'. Remove the `registers' command
that was used to kill the alias.

Turn the powerpc and sparc64 MD `halt' command into an MI command.

A copy of sparc64/db_interface.c grew in sun4v just after I found the
extra reboot commands. It has not been changed, and is now not
identical. Duplicated commands come out duplicated in ddb's online
help, but cause large problems when used (e.g., on i386's with 2 halt's
and an hwatch, typing h doesn' give the expected message about an
ambiguous command, but hangs like the halt command or a looping parseri
would).


163041 05-Oct-2006 simon

- Remove SCHED_ULE from GENERIC to better avoid foot-shooting by
unsuspecting users.
- Add a comment in NOTES about experimental status of SCHED_ULE.
- Make warning about experimental status in sched_ule(4) a bit
stronger.

Suggested and reviewed by: dougb
Discussed on: developers
MFC after: 3 days


163021 05-Oct-2006 grehan

Catch up with recent clock modifications:
- include <sys/clock.h> for inittodr prototype
- remove now-conflicting SECDAY definition that is in <sys/clock.h>


163016 04-Oct-2006 jb

PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
Security:
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.

Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.


162961 02-Oct-2006 phk

remove orphaned sysctl_machdep_adjkerntz()


162958 02-Oct-2006 phk

Second part of a little cleanup in the calendar/timezone/RTC handling.

Split subr_clock.c in two parts (by repo-copy):
subr_clock.c contains generic RTC and calendaric stuff. etc.
subr_rtc.c contains the newbus'ified RTC interface.

Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock}
sysctls and associated variables into subr_clock.c. They are
not machine dependent and we have generic code that relies on being
present so they are not even optional.


162954 02-Oct-2006 phk

First part of a little cleanup in the calendar/timezone/RTC handling.

Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.


162658 26-Sep-2006 ru

Added COMPAT_FREEBSD6 option.


162487 21-Sep-2006 kan

Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes
the former and __builtin_va_start was present in all GCC version 3.1 and
later.


162361 16-Sep-2006 rwatson

Add audit hooks for ppc, ia64 system call paths.

Reviewed by: marcel (ia64)
Obtained from: TrustedBSD Project
MFC after: 3 days


161797 01-Sep-2006 marcel

In cpu_set_user_tls(), properly set the thread pointer. It is 0x7000
bytes after the end of the TCB, which is itself 8 bytes.


161675 28-Aug-2006 davidxu

Implement casuword32, compare and set user integer, thank Marcel Moolenarr
who wrote the IA64 version of casuword32.


161628 25-Aug-2006 alc

Eliminate unused definitions. (They came from NetBSD.)

Discussed with: cognet, grehan, marcel


161588 24-Aug-2006 marcel

Add skeletal support for GDB. In particular gdb_cpu_getreg() needs
implementing to make GDB support usable.


160959 03-Aug-2006 sobomax

Use proper trap code for the EXC_ALI traps. This fixes SIGBUS during
unaligned 64-bits load/stores.

MFC after: 2 weeks


160932 02-Aug-2006 jhb

Don't ignore errors from intr_event_add_handler().

CID: 1516
Found by: Coverity Prevent (tm)


160892 01-Aug-2006 sobomax

Add device to access and modify Open Firmware NVRAM settings in
PowerPC-based Apple's machines and small utility to do it from
userland modelled after the similar utility in Darwin/OSX.

Only tested on 1.25GHz G4 Mac Mini.

MFC after: 1 month


160889 01-Aug-2006 alc

Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write(). However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)


160813 29-Jul-2006 marcel

Remove sio(4) and related options from MI files to amd64, i386
and pc98 MD files. Remove nodevice and nooption lines specific
to sio(4) from ia64, powerpc and sparc64 NOTES. There were no
such lines for arm yet.
sio(4) is usable on less than half the platforms, not counting
a future mips platform. Its presence in MI files is therefore
increasingly becoming a burden.


160801 28-Jul-2006 jhb

Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.


160798 28-Jul-2006 jhb

Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.


160773 27-Jul-2006 jhb

Unify the checking for lock misbehavior in the various syscall()
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
section.
- Add a new explicit check before userret() to make sure we don't return
with any locks held. The advantage here is that we can include the
syscall number and name in syscall() whereas that info is not available
in userret().
- Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by
the more general checks just added.

MFC after: 2 weeks


160764 27-Jul-2006 jhb

Add missing ptrace(2) system-call stops to various syscall()
implementations.

MFC after: 1 week


160722 26-Jul-2006 marcel

Turn this into an uart(4) bus attachment.


160721 26-Jul-2006 marcel

Repocopy from: src/sys/powerpc/psim/sio_iobus.c
to: src/sys/powerpc/psim/uart_iobus.c

Meister: simon@


160719 26-Jul-2006 marcel

o Remove device zs
o Remove nodevice uart
o Reorder


160718 26-Jul-2006 marcel

o Enable -Werror
o Remove commented-out sio(4)
o Remove zs(4)
o Add scc(4)
o Add uart(4)


160714 26-Jul-2006 marcel

o Move the prototype of mem_valid() from ofw_machdep.h to md_var.h.
This avoids that mem.c has to include ofw_machdep.h, including
all OFW related headers.
o Provide a stub for OF_decode_addr(), which is used by low-level
console drivers to obtain a tag and handle given a OFW phandle.
This is different from sparc64, where a fake bus tag needs to be
created explicitly.


160713 26-Jul-2006 marcel

Include needed clock.h.


160712 26-Jul-2006 marcel

Forward declare struct trapframe.


160312 12-Jul-2006 jhb

Simplify the pager support in DDB. Allowing different db commands to
install custom pager functions didn't actually happen in practice (they
all just used the simple pager and passed in a local quit pointer). So,
just hardcode the simple pager as the only pager and make it set a global
db_pager_quit flag that db commands can check when the user hits 'q' (or a
suitable variant) at the pager prompt. Also, now that it's easy to do so,
enable paging by default for all ddb commands. Any command that wishes to
honor the quit flag can do so by checking db_pager_quit. Note that the
pager can also be effectively disabled by setting $lines to 0.

Other fixes:
- 'show idt' on i386 and pc98 now actually checks the quit flag and
terminates early.
- 'show intr' now actually checks the quit flag and terminates early.


160235 10-Jul-2006 alc

Add synchronization to moea_zero_page() and moea_zero_page_area().

Remove the acquisition and release of Giant from moea_zero_page_idle().

Tested by: grehan@


160068 01-Jul-2006 alc

Eliminate the acquisition and release of Giant from moea_extract_and_hold()
and moea_protect().

Tested by: grehan@ and rink@


159964 26-Jun-2006 babkin

Backed out the change by request from rwatson.

PR: kern/14584


159928 25-Jun-2006 alc

Synchronize accesses to the PTEG table.

Add many lock assertions.

Tested by: grehan@


159927 25-Jun-2006 babkin

The common UID/GID space implementation. It has been discussed on -arch
in 1999, and there are changes to the sysctl names compared to PR,
according to that discussion. The description is in sys/conf/NOTES.
Lines in the GENERIC files are added in commented-out form.
I'll attach the test script I've used to PR.

PR: kern/14584
Submitted by: babkin


159705 17-Jun-2006 rink

Prevent 'mutex not owned' panic on boot if INVARIANTS is in the kernel. This
makes the GENERIC kernel boot on ppc.

Reviewed by: grehan
Approved by: imp (mentor)
MFC after: 1 week

dCVS: ----------------------------------------------------------------------


159651 15-Jun-2006 netchild

Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's
an explicit comment that it's needed for the linuxolator. This is not the
case anymore. For all other architectures there was only a "KEEP THIS".
I'm (and other people too) running a COMPAT_43-less kernel since it's not
necessary anymore for the linuxolator. Roman is running such a kernel for a
for longer time. No problems so far. And I doubt other (newer than ia32
or alpha) architectures really depend on it.

This may result in a small performance increase for some workloads.

If the removal of COMPAT_43 results in a not working program, please
recompile it and all dependencies and try again before reporting a
problem.

The only place where COMPAT_43 is needed (as in: does not compile without
it) is in the (outdated/not usable since too old) svr4 code.

Note: this does not remove the COMPAT_43TTY option.

Nagging by: rdivacky


159627 15-Jun-2006 ups

Remove mpte optimization from pmap_enter_quick().
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.

Reviewed by: alc,jhb,peter,ps


159547 12-Jun-2006 marcel

Some machines have an ESCC. Make sure we build uart(4) with support
for the z8530.


159537 12-Jun-2006 imp

Add the ability to subset the devices that UART pulls in. This allows
the arm to compile without all the extras that don't appear, at least
not in the flavors of ARM I deal with. This helps us save about 100k.

If I've botched the available devices on a platform, please let me
know and I'll correct ASAP.


159324 06-Jun-2006 alc

Correct a typo in the previous revision.


159323 06-Jun-2006 alc

Add a stub for pmap_enter_object().


159303 05-Jun-2006 alc

Introduce the function pmap_enter_object(). It maps a sequence of resident
pages from the same object. Use it in vm_map_pmap_enter() to reduce the
locking overhead of premapping objects.

Reviewed by: tegge@


158651 16-May-2006 phk

Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.


158449 11-May-2006 phk

Remove straggling reference to CPU_ macros


158445 11-May-2006 phk

Clean out sysctl machdep.* related defines.

The cmos clock related stuff should really be in MI code.


157895 20-Apr-2006 imp

Set the rid for any resource obtained from rman_resource_reserve.


157583 07-Apr-2006 marcel

Add kbdmux(4). This avoids having to use the hint.pcib.1.skipslot=26
trick on a PowerBook G4 and friends to get the USB keyboard as ukbd0.


157443 03-Apr-2006 peter

Remove the unused sva and eva arguments from pmap_remove_pages().


157316 31-Mar-2006 marcel

Add a dummy implementation of bus_space_map().


155455 08-Feb-2006 phk

Simplify system time accounting for profiling.

Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly. Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret(). Use the absolute value of
td->pticks in userret() and eliminate third argument.


154170 10-Jan-2006 phk

Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)
to COMPAT_43TTY.

Add COMPAT_43TTY to NOTES and */conf/GENERIC

Compile tty_compat.c only under the new option.

Spit out
#warning "Old BSD tty API used, please upgrade."
if ioctl_compat.h gets #included from userland.


154091 07-Jan-2006 grehan

Set the siginfo si_addr field, and also the mysterious 3rd parameter
to old-style signals, to be the DAR register for DSI miss exceptions.
This gives the address of the access rather than the instruction
address. The behaviour is now the same as on i386.

Found by: libsigsegv tests


154079 06-Jan-2006 jhb

- Make pcib_devclass private to sys/dev/pci/pci_pci.c and change all the
various pcib drivers to use their own private devclass_t variables for
their modules.
- Use the DEFINE_CLASS_0() macro to declare drivers for the various pcib
drivers while I'm here.


153940 31-Dec-2005 netchild

MI changes:
- provide an interface (macros) to the page coloring part of the VM system,
this allows to try different coloring algorithms without the need to
touch every file [1]
- make the page queue tuning values readable: sysctl vm.stats.pagequeue
- autotuning of the page coloring values based upon the cache size instead
of options in the kernel config (disabling of the page coloring as a
kernel option is still possible)

MD changes:
- detection of the cache size: only IA32 and AMD64 (untested) contains
cache size detection code, every other arch just comes with a dummy
function (this results in the use of default values like it was the
case without the autotuning of the page coloring)
- print some more info on Intel CPU's (like we do on AMD and Transmeta
CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by: Chad David <davidc@acns.ab.ca> [1]
Reviewed by: alc, arch (in 2004)
Discussed with: alc, Chad David, arch (in 2004)


153890 30-Dec-2005 ru

Remove duplicate options (originals in sys/conf/NOTES).

Reported by: fresh config(8)


153813 29-Dec-2005 grehan

Add user-space profiling support. Kernel profiling still todo.

Obtained from: NetBSD


153741 26-Dec-2005 sobomax

Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure
with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually
allow executing elf dynamic binaries (aka shared libraries). When it is
requested to execute ET_DYN elf image check if this flag is on after we
know the elf brand allowing execution if so.

PR: kern/87615
Submitted by: Marcin Koziej <creep@desk.pl>


153701 24-Dec-2005 grehan

Forward-declare struct trapframe to allow the aic module to compile.


153685 23-Dec-2005 grehan

Mark the return address of the call to ast() in the generic trap
handling code so the stack trace unwinders don't start trying to
go into user-space.

Found by trying to create core dumps with a KTR_COMPILE/KTR_GEOM
kernel, which results in a stack_save() call in the ast() coredump
path - this created a panic, and then calling 'trace' in ddb resulted
in the black screen of death after printing out most of the backtrace.


153666 22-Dec-2005 jhb

Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly. In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures. Basically,
all the archs were taking a trapframe and converting it into a clockframe
one way or another. Now they can just extract the PC and usermode values
directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
timecounter, call hardclock() directly. This removes an extra
conditional check from every clock interrupt on Alpha on the BSP.
There is probably room for even further pruning here by changing Alpha
to use the simplified timecounter we use on x86 with the lapic timer
since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
is false, so add a KASSERT() to that affect and remove a condition
to slightly optimize the non-lapic case.
- Change prototypeof arm_handler_execute() so that it's first arg is a
trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on: alpha, amd64, arm, i386, ia64, sparc64
Reviewed by: bde (mostly)


153592 21-Dec-2005 sam

add LINT build

Discussed with: grehan
MFC after: 1 week


153488 16-Dec-2005 jhb

GC some unused frame types.

Approved by: grehan


153179 06-Dec-2005 jhb

- Cleanup whitespace and extra ()s in vtophys() macros.
- Move vtophys() macros next to vtopte() where vtopte() exists to match
comments above vtopte().
- Remove references to the alternate address space in the comment above
vtopte(). amd64 never had the alternate address space, and i386 lost it
prior to PAE support being added.
- s/entires/entries/ in comments.

Reviewed by: alc


153168 06-Dec-2005 ru

Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with
MACHINE_ARCH and MACHINE). Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.


153050 03-Dec-2005 marius

Convert to use the recently introduced set of ofw_bus_gen_get_*() for
providing the ofw_bus KOBJ interface.

Tested by: grehan


152865 27-Nov-2005 ru

- Allow duplicate "machine" directives with the same arguments.
- Move existing "machine" directives to DEFAULTS.


152661 21-Nov-2005 jhb

Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move
'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and
amd64. Additionally, on ia64 enable ACPI by default since ia64 requires
acpi.


152646 21-Nov-2005 arun

Create a device node in /dev when a USB keyboard is plugged in.

Reviewed by: grehan


152630 20-Nov-2005 alc

Eliminate pmap_init2(). It's no longer used.


152310 11-Nov-2005 grehan

Add definitions for 64-bit PTEs


152305 11-Nov-2005 grehan

ata_generic_hw takes a dev as a parameter, not a channel.


152304 11-Nov-2005 grehan

Fix compile warning: pmap_bootstrap is now declared extern in pmap.h,
remove redundant declaration.


152233 09-Nov-2005 grehan

No longer needed: replaced by mmu_if.m/pmap_dispatch.c/mmu_oea.c


152228 09-Nov-2005 grehan

Apply r1.103 to correct place.

pmap.c on PowerPC is now a combo of mmu_if.m, pmap_dispatch.c
and mmu_oea.c

(I forgot to delete pmap.c from CVS in last jumbo commit)


152224 09-Nov-2005 alc

Reimplement the reclamation of PV entries. Specifically, perform
reclamation synchronously from get_pv_entry() instead of
asynchronously as part of the page daemon. Additionally, limit the
reclamation to inactive pages unless allocation from the PV entry zone
or reclamation from the inactive queue fails. Previously, reclamation
destroyed mappings to both inactive and active pages. get_pv_entry()
still, however, wakes up the page daemon when reclamation occurs. The
reason being that the page daemon may move some pages from the active
queue to the inactive queue, making some new pages available to future
reclamations.

Print the "reclaiming PV entries" message at most once per minute, but
don't stop printing it after the fifth time. This way, we do not give
the impression that the problem has gone away.

Reviewed by: tegge


152180 08-Nov-2005 grehan

Name change from pmap_* to moea_* to fit into the new order of
mmu implementation.

This code handles the 32-bit 'OEA' MMU found on G2/G3/G4 PPC cores.


152179 08-Nov-2005 grehan

Insert a layer of indirection to the pmap code, using a kobj for
the interface. This allows run-time selection of MMU code, based
on CPU-type detection, or tunable-overrides when testing new code.

Pre-requisite for G5 support.

conf/files.powerpc
- remove pmap.c
- add mmu_if.h, mmu_oea.c, pmap_dispatch.c

powerpc/include/mmuvar.h
- definitions for MMU implementations

powerpc/include/pmap.h
- remove pmap_pte_spill declaration
- add pmap_mmu_install declaration
- size the phys_avail array
- pmap_bootstrapped is now global-scope

powerpc/powerpc/machdep.c
- call kobj_machdep_init early in the boot sequence to allow
kobj usage prior to SI_SUB_LOCK
- install the OEA pmap code. This will be moved to CPU-specific
init code in the future.

powerpc/powerpc/mmu_if.m
- Kobj MMU interface definitions

powerpc/powerpc/pmap_dispatch.c
- central dispatch for pmap calls
- contains the global mmu kobj and the routine to locate the
the mmu implementation and init the kobj


152147 07-Nov-2005 grehan

Finally (!?) get to the bottom of the mysterious G3 boot-time panics.
After a number of tests using nop's to change the alignment, it was
confirmed that the mtibat instructions should be cache-aligned.
FreeScale app note AN2540 indicates that the isync before and after
the mtdbat is the right thing to do, but sync/isync isn't required
before the mtibat so it has been removed.

Fix by using a ".balign 32" to pull the code in question to the correct
alignment.

MFC after: 3 days


151891 30-Oct-2005 grehan

Copy SPRG0-3 registers at boot-time and restore when calling into
OpenFirmware. FreeBSD/ppc uses SPRG0 as the per-cpu data area pointer,
and SPRG1-3 as temporary registers during exception handling. There
have been a few instances where OpenFirmware does require these to
be part of it's context, such as cd-booting an eMac.

reported by: many
MFC after: 3 days


151877 30-Oct-2005 grehan

In stack_save, stop when a trap-frame is encountered. This prevents
trying to access user-space stack addresses when a user fault
is encountered, as occurs when GEOM KTR code is handling a page fault
and is using stack_save() to capture a trace for debug purposes.

It may be possible to walk beyond the trap-frame if it is a kernel fault,
as db_backtrace() does, but I don't think that complexity is needed in
this routine.

MFC after: 3 days


151658 25-Oct-2005 jhb

Reorganize the interrupt handling code a bit to make a few things cleaner
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces. struct intr_event holds the list
of interrupt handlers associated with interrupt sources.
struct intr_thread contains the data relative to an interrupt thread.
Currently we still provide a 1:1 relationship of events to threads
with the exception that events only have an associated thread if there
is at least one threaded interrupt handler attached to the event. This
means that on x86 we no longer have 4 bazillion interrupt threads with
no handlers. It also means that interrupt events with only INTR_FAST
handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
intr_foo naming convention. This did require renaming the powerpc
MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
powerpc. This means that multiple INTR_FAST handlers can attach to the
same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
to the same interrupt. Sharing INTR_FAST handlers may not always be
desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
either. Drivers can always still use INTR_EXCL to ask for an interrupt
exclusively. The way this sharing works is that when an interrupt
comes in, all the INTR_FAST handlers are executed first, and if any
threaded handlers exist, the interrupt thread is scheduled afterwards.
This type of layout also makes it possible to investigate using interrupt
filters ala OS X where the filter determines whether or not its companion
threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
dumping their state. It also has a '/v' verbose switch which dumps
info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
handler is removed because it would suck for things like ppbus(8)'s
braindead behavior. The code is present, though, it is just under
#if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
event into a separate function so that ithread_loop() becomes more
readable. Previously this code was all in the middle of ithread_loop()
and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
curthread's proc against idlethread's proc. (Not really related to intr
changes)

Tested on: alpha, amd64, i386, sparc64
Tested on: arm, ia64 (older version of patch by cognet and marcel)


151316 14-Oct-2005 davidxu

1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
sendsig use discrete parameters, now they uses member fields of
ksiginfo_t structure. For sendsig, this change allows us to pass
POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
an extra siginfo list holds additional siginfo_t data for signals.
kernel code uses psignal() still behavior as before, it won't be failed
even under memory pressure, only exception is when deleting a signal,
we should call sigqueue_delete to remove signal from sigqueue but
not SIGDELSET. Current there is no kernel code will deliver a signal
with additional data, so kernel should be as stable as before,
a ksiginfo can carry more information, for example, allow signal to
be delivered but throw away siginfo data if memory is not enough.
SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
not be caught or masked.
The sigqueue() syscall allows user code to queue a signal to target
process, if resource is unavailable, EAGAIN will be returned as
specification said.
Just before thread exits, signal queue memory will be freed by
sigqueue_flush.
Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64


150686 28-Sep-2005 marius

Add a font width argument to vi_load_font_t, vi_save_font_t and vi_putm_t
and do some preparations for handling 12x22 fonts (currently lots of code
implies and/or hardcodes a font width of 8 pixels). This will be required
on sparc64 which uses a default font size of 12x22 in order to add font
loading and saving support as well as to use a syscons(4)-supplied mouse
pointer image.
This API breakage is committed now so it can be MFC'ed in time for 6.0
and later on upcoming framebuffer drivers destined for use on sparc64
and which are expected to rely on using font loading internally and on
a syscons(4)-supplied mouse pointer image can be easily MFC'ed to
RELENG_6 rather than requiring a backport.

Tested on: i386, sparc64, make universe
MFC after: 1 week


150627 27-Sep-2005 jhb

Add a new atomic_fetchadd() primitive that atomically adds a value to a
variable and returns the previous value of the variable.

Tested on: i386, alpha, sparc64, arm (cognet)
Reviewed by: arch@
Submitted by: cognet (arm)
MFC after: 1 week


150270 18-Sep-2005 csjp

Introduce a kernel config for the Mandatory Access Control framework.
This kernel config briefly describes some of the major MAC policies
available on FreeBSD. The hope is that this will raise the awareness
about MAC and get more people interested.

Discussed with: scottl


150182 15-Sep-2005 jhb

Stop using the '+' constraint modifier with inline assembly. The '+'
constraint is actually only allowed for register operands. Instead, use
separate input and output memory constraints.

Education from: alc
Reviewed by: alc
Tested on: i386, alpha
MFC after: 1 week


149958 10-Sep-2005 grehan

Fix boot-time hang/panic on G3 systems when modifying IBAT0 in
pmap_bootstrap by using the sync;isync big hammer to make sure
all prior operations have completed.

Reported by: Nathan Whitehorn <nathan at uchicago edu>
MFC after: 2 days


149925 10-Sep-2005 marcel

Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint()
and db_md_list_watchpoints() to ddb/ddb.h.


149768 03-Sep-2005 alc

Pass a value of type vm_prot_t to pmap_enter_quick() so that it determine
whether the mapping should permit execute access.


149337 20-Aug-2005 stefanf

Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename
it to __MINSIGSTKSZ. Define MINSIGSTKSZ in <sys/signal.h>.

This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN
in <pthread.h> (soon <limits.h>) without having to include the whole
<sys/signal.h> header.

Discussed with: bde


149116 16-Aug-2005 grehan

Remove unnecessary and alarming printf.

MFC after: 1 day


148666 03-Aug-2005 jeff

- Add support for saving stack traces and displaying them via printf(9)
and KTR.

Contributed by: Antoine Brodin <antoine.brodin@laposte.net>
Concept code from: Neal Fachan <neal@isilon.com>


148568 30-Jul-2005 grehan

Temporary band-aid to fix hang when a process exec's Altivec instructions.

trap_subr.S: declare a stub for the a-unavailable trap
that does an absolute jump to the vector-assist trap.
This is due to the fact that the vec-unavail trap
doesn't start at a 256-byte boundary, so the trick of
masking the bottom 8 bits of the link register to identify
the interrupt doesn't work, so let the vec-assist
case handle Altivec-disabled for the time being.

Note that this will be fixed in the future with a much
smaller vector code-stub (< 16 bytes) that will allow
use of strange vector offsets that are also present in
4xx processors, and also allow smaller differences in
vector codepaths on the G5.

trap.c: Treat altivec-unavailable/assist process traps as SIGILL.
Not quite correct, since altivec-assist should really be a panic,
but it is fine for the moment due to the above measure.

machdep.c Install the stub code for the altivec-unavailable trap, and
the standard trap code at the altivec-assist.

Reported by: Andreas Tobler <toa at pop agri ch>
MFC after: 3 days


148067 15-Jul-2005 jhb

Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables. This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after: 3 days
Tested on: i386, alpha, sparc64
Compiled on: ia64, powerpc, amd64
Kernel toolchain busted on: arm


147991 14-Jul-2005 kensmith

Add recently invented COMPAT_FREEBSD5 option.

MFC after: 3 days


147889 10-Jul-2005 davidxu

Validate if the value written into {FS,GS}.base is a canonical
address, writting non-canonical address can cause kernel a panic,
by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring
only canonical values get written to the registers.

Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com >
Approved by: re (scottl)


147851 09-Jul-2005 grehan

The nsegs parameter to bus_dmamap_load_mbuf_sg is not required to
be set to 0 on input. This caused a panic in an an MP test version
of the GEM driver from Marius, and from inspection of other PCI
drivers, the same problem would happen there.
Fix by explicitly setting to 0.

Approved by: re


147504 20-Jun-2005 obrien

Add .cvsignore files just like in sys/<arch>/compiled, this keeps CVS from
questing kernel config files not in CVS.

Approved by: re(kensmith)


147298 11-Jun-2005 jkoshy

Unbreak the PowerPC GENERIC build.

Reviewed by: delphij


147217 10-Jun-2005 alc

Introduce a procedure, pmap_page_init(), that initializes the
vm_page's machine-dependent fields. Use this function in
vm_pageq_add_new_page() so that the vm_page's machine-dependent and
machine-independent fields are initialized at the same time.

Remove code from pmap_init() for initializing the vm_page's
machine-dependent fields.

Remove stale comments from pmap_init().

Eliminate the Boolean variable pmap_initialized from the alpha, amd64,
i386, and ia64 pmap implementations. Its use is no longer required
because of the above changes and earlier changes that result in physical
memory that is being mapped at initialization time being mapped without
pv entries.

Tested by: cognet, kensmith, marcel


147191 09-Jun-2005 jkoshy

MFP4:

- Implement sampling modes and logging support in hwpmc(4).

- Separate MI and MD parts of hwpmc(4) and allow sharing of
PMC implementations across different architectures.
Add support for P4 (EMT64) style PMCs to the amd64 code.

- New pmcstat(8) options: -E (exit time counts) -W (counts
every context switch), -R (print log file).

- pmc(3) API changes, improve our ability to keep ABI compatibility
in the future. Add more 'alias' names for commonly used events.

- bug fixes & documentation.


146794 29-May-2005 marcel

Create nexus in configure_first() instead of in configure(). This
makes sure that sysinit tasks that run after configure_first(),
but before configure() have a nexus to hang devices off.


146792 29-May-2005 marcel

Call cninit_finish() from configure_final().


146737 29-May-2005 grehan

The end values passed to rman_manage_region() for PCI i/o and mem
spaces were 1 too large. This resulted in the rman list not being
sorted correctly, and USB ports not being discovered on older
TiBooks.

Detective work by: Andreas Tobler <toa at pop dot agri dot ch>


146734 29-May-2005 nyan

Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64. The optimization is a trivial on recent machines.

Reviewed by: -arch (imp, marcel, dfr)


146462 21-May-2005 grehan

Quick hack-o-rama to allow the Xorg Radeon driver to start up. It
tries to mmap memory outside of the available BARs, so allow the
range checks to be relaxed with a sysctl.


146198 14-May-2005 grehan

Remove incorrect configuration setting that limited the Kauai ATA controller
to be master-only. The slave ATAPI drive on the Mac-Mini is now recognised.


145827 03-May-2005 grehan

- move to SCHED_4BSD per jeffr's comments on SCHED_ULE's state
- enable MSDOSFS
- ehci is stable on the powerbook
- modules have been working for a long time.


145772 01-May-2005 grehan

Catch up with latest ATA newbus commits.


145433 23-Apr-2005 davidxu

Change cpu_set_kse_upcall to more generic style, so we can reuse it
in other codes. Add cpu_set_user_tls, use it to tweak user register
and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall,
but since cpu_set_kse_upcall is also used by M:N threads which may
not need this feature, so I wrote a separated cpu_set_user_tls.


145343 20-Apr-2005 ps

Don't enter the debugger if KDB_UNATTENDED is set or if
debug.debugger_on_panic=0.

MFC after: 2 weeks


145332 20-Apr-2005 marcel

Add empty header (except of the multiple-inclusion protection) to
get hwpmc(4) to compile on this platform.


145311 20-Apr-2005 grehan

Get order right when initializing task file bus resources. ATA drives are
now recognised when booting from the drive, as opposed to net-booting which
the previous botched commit was tested with.


145253 18-Apr-2005 imp

Break out the definition of bus_space_{tag,handle}_t and a few other types
into _bus.h to help with name space polution from including all of bus.h.
In a few days, I'll commit changes to the MI code to take advantage of thse
sepration (after I've made sure that these changes don't break anything in
the main tree, I've tested in my trees, but you never know...).

Suggested by: bde (in 2002 or 2003 I think)
Reviewed in principle by: jhb


145221 18-Apr-2005 grehan

Catch up with ATA mkIII definitions for registers that have different
functions for read vs. write.


144971 12-Apr-2005 jhb

Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.


144956 12-Apr-2005 ssouhlal

Unbreak the powerpc build by fixing some ATA constants that were renamed.

Approved by: grehan (mentor)


144807 08-Apr-2005 jhb

Change an instance of md_savecrit to md_saved_msr that I missed.


144637 04-Apr-2005 jhb

Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions. They no longer have any affect on
interrupts. This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit(). This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock. For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections. Note that I've also taken this
opportunity to push a few things into MD code rather than MI. For example,
critical_fork_exit() no longer exists. Instead, MD code ensures that new
threads have the correct state when they are created. Also, we no longer
try to fixup the idlethreads for APs in MI code. Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by: grehan, cognet, arch@, others
Tested on: i386, alpha, sparc64, powerpc, arm, possibly more


144457 01-Apr-2005 grehan

Introduce channel-level setmode newbus method.

Thanks to sos for the code re-org that allowed this.


144359 31-Mar-2005 grehan

Catch up with ATA-mkIII


143985 22-Mar-2005 sobomax

Add USB Communication Device Class Ethernet driver. Originally written for
FreeBSD based on aue(4) it was picked by OpenBSD, then from OpenBSD ported
to NetBSD and finally NetBSD version merged with original one goes into
FreeBSD.

Obtained from: http://www.gank.org/freebsd/cdce/
NetBSD
OpenBSD


143830 19-Mar-2005 grehan

Optimize putc routine to write 2 ints instead of 8 chars to uncached
framebuffer memory. Speeds up system time of large ascii file cat
by 4x.

Inspired by: Linux scrolling so damned fast.


143809 18-Mar-2005 murray

Add a comment to note that pseudo-device bpf is required for DHCP.
This is mentioned in the Handbook but it is not as obvious to new
users why bpf is needed compared to the other largely self-explanatory
items in GENERIC.

PR: conf/40855
MFC after: 1 week


143784 18-Mar-2005 grehan

Split configure into 3 steps ala sparc64

Obtained from: iedowse, sparc64


143634 15-Mar-2005 grehan

Prepend underscore to bus_dmamap_{unload|sync} in line with
recent busdma changes.


143633 15-Mar-2005 grehan

Include <sys/signalvar.h> for trapsignal prototype.


143632 15-Mar-2005 grehan

Long overdue sync-up with ATA code


143598 14-Mar-2005 scottl

Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch. This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date. Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API. sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.


143234 07-Mar-2005 grehan

Replaced previous hw.physmem extraction with des's mods to
getenv_ulong() - much simpler.

Pointed out by: des


143201 07-Mar-2005 grehan

physmem is a much better indicator for 'real' memory on PPC than Maxmem
since there are often significant holes in the memory map due to the
kernel, loader and OFW data structures not being included: Maxmem is
the highest available, so can be misleading.


143200 07-Mar-2005 grehan

Allow user to undersize memory with hw.physmem loader variable.

Obtained from: i386/machdep.c:getmemsize()


143063 02-Mar-2005 joerg

netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild. Extension to other compilers is supposed
to be possible, of course.

Submitted by: netchild
Reviewed by: various developers on arch@, some time ago


142882 01-Mar-2005 grehan

Catch up with "physical memory" sysctl change.
(MFi386: rev 1.608)


142771 28-Feb-2005 grehan

Catch the case where the idle loop is entered with interrupts disabled,
causing a hard hang.


142764 28-Feb-2005 grehan

- switch pcpu to a struct declaration ala amd64. It may be more efficient to
cache-align this struct, but that's a topic for a far-in-the-future
commit.
- eliminate commented-out reference to a non-existent pcpu field.


142756 28-Feb-2005 grehan

Correctly set kernelname for kern.bootfile sysctl

Noticed by: gad
Code stolen from: sparc64


142416 25-Feb-2005 grehan

Add PVO_FAKE flag to pvo entries for PG_FICTITIOUS mappings, to
avoid trying to reverse-map a device physical address to the
vm_page array and walking into non-existent vm weeds.

found by: Xorg server exiting


142414 25-Feb-2005 grehan

Mods for Xorg server:

- store assigned PCI addresses at cninit time for later mmap range
check
- implement set_border to scrub X remnants when switching back to VTYs
- implement mmap, only allowing addresses within the range of the
console adapter.


142107 19-Feb-2005 ru

Use a common multi-inclusion protection, and add such a
protection to alpha/include/exec.h.


141378 06-Feb-2005 njl

Finish the job of sorting all includes and fix the build by including
malloc.h before proc.h on sparc64. Noticed by das@

Compiled on: alpha, amd64, i386, pc98, sparc64


141249 04-Feb-2005 njl

Sort includes a little so that bus.h comes before cpu.h (for device_t).


141237 04-Feb-2005 njl

Add an implementation of cpu_est_clockrate(9). This function estimates the
current clock frequency for the given CPU id in units of Hz.


141229 04-Feb-2005 grehan

- recognize 7447A/7448 CPUs (used in miniMacs)
- enable 745x branch caches. Already enabled by OpenFirmware
on Macs, but reduces NetBSD diffs and usable by embedded folk.

Obtained from: NetBSD


141227 04-Feb-2005 grehan

- add wall_cmos_clock and adjkerntz variables, required by msdosfs
- support adjkerntz sysctl to silence NTP, though it's a null
implementation at the moment.


141226 04-Feb-2005 grehan

Convert bus_space_barrier() into a null inline function rather than an
empty macro to avoid many compile warnings in the USB code.


141225 04-Feb-2005 grehan

- add definitions for MPC7447A/7448 (i.e. miniMac)
- expand MPC745X_P macro to include these

Obtained from: NetBSD


141224 04-Feb-2005 grehan

HID0 updates:
- updated relevant models for High BAT enable bit
- fixed bug in BHTCLR/XAEN constants
- added LRSTK and FOLD bits


141106 01-Feb-2005 grehan

- change all u_int_XX to uint_XX
- cast param for atomic_subtract_long, since Netgraph uses it.


140538 21-Jan-2005 grehan

Fix (accidental?) lock order reversal in pmap_remove. Found when
a process that has mmap'd device mem exits.


140314 15-Jan-2005 scottl

Add bus_dmamap_load_mbuf_sg() to powerpc.


140256 14-Jan-2005 jhb

- Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer
the last action of kern_exit(). Instead, it is a MD callout to cleanup
per-process state during exit.
- Add notes of concern to Alpha and ia64 about the possible need to drop
fp state in cpu_thread_exit() rather than in cpu_exit() since it is
per-thread state rather than per-process.


140049 11-Jan-2005 grehan

- allow a device hint to disable probing a slot on a Uninorth PCI bus.
e.g. at the loader:

set hint.pcib.1.skipslot=26

This allows undocumented and problematic hardware on some systems
to be ignored, for instance, the USB keyboard/mouse that shows up
on a 12" albook that doesn't exist nor do anything other than eat up
the syscons keyboard. Another one is the unused USB cell in the old
366MHz iBook that locks up the machine when probed.

In a way this is temporary, since there are better fixes for the
above problems, but will be useful in the meantime by allowing
a keyboard to be used to help debug said fixes :)

- while here remove some trailing white space


139825 07-Jan-2005 imp

/* -> /*- for license, minor formatting changes


139819 07-Jan-2005 grehan

Return correct value in the lock routine.


139401 29-Dec-2004 grehan

Correctly initialise the 2nd kernel segment, and don't
forget to actually install it in the segment register.
This may fix some of the weird panics seen when kernel VM
is heavily used.


139241 23-Dec-2004 alc

Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.


138897 15-Dec-2004 alc

In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead. To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep. Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock. (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)


138220 30-Nov-2004 grehan

Create a new definition, PSL_KERNSET, which is used for setting the
MSR in kernel mode. Redefine PSL_USERSET in terms of this by or'ing
in PSL_PR.


138129 27-Nov-2004 das

Don't include sys/user.h merely for its side-effect of recursively
including other headers.


137914 20-Nov-2004 das

Remove UAREA_PAGES.

Reviewed by: arch@


137912 20-Nov-2004 das

U areas are going away, so don't allocate one for process 0.

Reviewed by: arch@


137906 20-Nov-2004 das

user.h is included only to get pcb.h, so use the latter directly instead.


137137 02-Nov-2004 andre

Reduce annoying SCSI probing delay from 15 to 5 seconds in all GENRIC kernels.

Discussed on: -current


137122 02-Nov-2004 ssouhlal

Implement TLS relocations for powerpc.

Approved by: grehan (mentor)


137119 02-Nov-2004 ssouhlal

Stay up to date with the latest ATA developments, where
ata_channel.locking now returns an int.

Approved by: grehan (mentor)


137118 02-Nov-2004 ssouhlal

Uncomment options _KPOSIX_PRIORITY_SCHEDULING as it is enabled in the
other architectures, and does not give any problems.

Approved by: grehan (mentor)


137117 01-Nov-2004 jhb

- Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant. The variable
can be examined and changed in ddb as '$lines'. Setting the variable to
0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
newlines and carriage returns so that one can rub out content on the
current line via '\r \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
the routine exits.
- Add some aliases to the simple pager to make it more compatible with
more(1): 'e' and 'j' do a single line. 'd' does half a page, and
'f' does a full page.

MFC after: 1 month
Inspired by: kris


135861 27-Sep-2004 gallatin

Add sc_iostart to softc and unbreak the build.
This was forgotten in my previous commit to add i/o port to uninorth.c

Pointy-hat to: me


135800 26-Sep-2004 gallatin

Add support for i/o-ports. This was cut and pasted from grackle.c


135529 20-Sep-2004 jhb

- Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
every 18 lines. If you hit Enter, it will print another line and prompt
again. If you hit space it will output another page and then prompt.
If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
to print. This is useful if your trace happens to walk back onto
0xdeadc0de and gets stuck in an endless loop.

MFC after: 1 month
Tested on: i386, alpha, sparc64


135172 13-Sep-2004 alc

Lock the kernel pmap in pmap_kenter().

Tested by: gallatin@


134934 08-Sep-2004 scottl

Fix a problem with tag->boundary inheritence that has existed since day one
and was propagated to nearly every platform. The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum. However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change. The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)


134791 05-Sep-2004 julian

Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by: scottl, peter
MFC after: 1 week


134571 31-Aug-2004 julian

Remove an unneeded argument..
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.

MFC after: 3 days


134568 31-Aug-2004 julian

Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after: 2 days


134535 30-Aug-2004 alc

- Introduce a lock for synchronizing access to the pvo and pteg tables.
- In pmap_enter(), only the acquisition and release of the page queues
lock needs to check the bootstrap flag.

Tested by: gallatin@


134453 28-Aug-2004 alc

Eliminate unnecessary indirection.


134398 27-Aug-2004 marcel

Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
instruction of a functions implementation. It holds the address
of a function descriptor. Hence the user(), btrap(), eintr() and
bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
be layed-out contiguously. This can not be achieved on ia64 and is
generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386


134383 27-Aug-2004 andre

Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over. This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.


134329 26-Aug-2004 alc

Add pmap locking to many of the functions.

Many thanks to Andrew Gallatin for resolving a powerpc-specific
initialization problem in my original patch.

Tested by: gallatin@


133862 16-Aug-2004 marius

Instead of "OpenFirmware", "openfirmware", etc. use the official spelling
"Open Firmware" from IEEE 1275 and OpenFirmware.org (no pun intended).

Ok'ed by: tmm


133855 16-Aug-2004 ssouhlal

Add /dev/mem and /dev/kmem to powerpc.

Approved by: grehan (mentor)


133809 16-Aug-2004 grehan

Advertise that color is supported so that syscons doesn't come up
in monochrome mode when run as init.


133589 12-Aug-2004 marius

- Introduce an ofw_bus kobj-interface for retrieving the OFW node and a
subset ("compatible", "device_type", "model" and "name") of the standard
properties in drivers for devices on Open Firmware supported busses. The
standard properties "reg", "interrupts" und "address" are not covered by
this interface because they are only of interest in the respective bridge
code. There's a remaining standard property "status" which is unclear how
to support properly but which also isn't used in FreeBSD at present.
This ofw_bus kobj-interface allows to replace the various (ebus_get_node(),
ofw_pci_get_node(), etc.) and partially inconsistent (central_get_type()
vs. sbus_get_device_type(), etc.) existing IVAR ones with a common one.
This in turn allows to simplify and remove code-duplication in drivers for
devices that can hang off of more than one OFW supported bus.
- Convert the sparc64 Central, EBus, FHC, PCI and SBus bus drivers and the
drivers for their children to use the ofw_bus kobj-interface. The IVAR-
interfaces of the Central, EBus and FHC are entirely replaced by this. The
PCI bus driver used its own kobj-interface and now also uses the ofw_bus
one. The IVARs special to the SBus, e.g. for retrieving the burst size,
remain.
Beware: this causes an ABI-breakage for modules of drivers which used the
IVAR-interfaces, i.e. esp(4), hme(4), isp(4) and uart(4), which need to be
recompiled.
The style-inconsistencies introduced in some of the bus drivers will be
fixed by tmm@ in a generic clean-up of the respective drivers later (he
requested to add the changes in the "new" style).
- Convert the powerpc MacIO bus driver and the drivers for its children to
use the ofw_bus kobj-interface. This invloves removing the IVARs related
to the "reg" property which were unused and a leftover from the NetBSD
origini of the code. There's no ABI-breakage caused by this because none
of these driver are currently built as modules.
There are other powerpc bus drivers which can be converted to the ofw_bus
kobj-interface, e.g. the PCI bus driver, which should be done together
with converting powerpc to use the OFW PCI code from sparc64.
- Make the SBus and FHC front-end of zs(4) and the sparc64 eeprom(4) take
advantage of the ofw_bus kobj-interface and simplify them a bit.

Reviewed by: grehan, tmm
Approved by: re (scottl)
Discussed with: tmm
Tested with: Sun AX1105, AXe, Ultra 2, Ultra 60; PPC cross-build on i386


133521 11-Aug-2004 marius

- Use the rman_get_* functions instead of reaching into struct resource.
- Remove __RMAN_RESORUCE_VISIBLE again. It's no longer required either
because of the above change or because struct rman is no longer hidden.

Reviewed by: grehan
Tested by: cross-compile on i386


133464 11-Aug-2004 marcel

Add __elfN(dump_thread). This function is called from __elfN(coredump)
to allow dumping per-thread machine specific notes. On ia64 we use this
function to flush the dirty registers onto the backingstore before we
write out the PRSTATUS notes.

Tested on: alpha, amd64, i386, ia64 & sparc64
Not tested on: arm, powerpc


133239 07-Aug-2004 grehan

Always isync after a mtmsr. While perhaps not strictly necessary for PSL_EE
bit banging according to the OEA, it's better to be conservative than
having to continually audit uses of this inline.


133166 05-Aug-2004 grehan

In pmap_page_protect, clear the vm page's PG_WRITEABLE flag if
downgrading to read-only. Found by triggering the KASSERT in
vm_pageout_flush().


133143 04-Aug-2004 alc

- Push down the acquisition and release of Giant into pmap_enter_quick()
on those architectures without pmap locking.
- Eliminate the acquisition and release of Giant in vm_map_pmap_enter().


133087 03-Aug-2004 markm

Making a loadable null.ko for /dev/(null|zero) proved rather
unpopular, so remove this (mis)feature.

Encouragement provided by: jhb (and others)


133084 03-Aug-2004 mux

Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait(). It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by: jhb
Reviewed by: jhb


133050 03-Aug-2004 grehan

Remove race condition between reading of MSR, setting md_savecrit,
and setting MSR. This was most evident with the idle proc running
with interrupts disabled and causing a lockup. Switch over to the
i386 style which does things in the right order.

debug assisted by: gallatin, and the invaluable KTR option.


133007 02-Aug-2004 ssouhlal

Remove 'device mem' from GENERIC, which markm@ mistakingly added.
We don't have mem/kmem yet.

Approved by: grehan (mentor)


132994 02-Aug-2004 grehan

Kernel traps were not being passed to trap_fatal in some
circumstances.

Spotted by: gallatin


132956 01-Aug-2004 markm

Break out the MI part of the /dev/[k]mem and /dev/io drivers into
their own directory and module, leaving the MD parts in the MD
area (the MD parts _are_ part of the modules). /dev/mem and /dev/io
are now loadable modules, thus taking us one step further towards
a kernel created entirely out of modules. Of course, there is nothing
preventing the kernel from having these statically compiled.


132899 30-Jul-2004 alc

- Push down the acquisition and release of Giant into pmap_protect() on
those architectures without pmap locking.
- Eliminate the acquisition and release of Giant from vm_map_protect().

(Translation: mprotect(2) runs to completion without touching Giant on
alpha, amd64, i386 and ia64.)


132838 29-Jul-2004 ssouhlal

Add comment explaining struct reg and struct fpreg must match the trapframe.

Approved by: grehan (mentor)


132836 29-Jul-2004 ssouhlal

Implement MD parts of ptrace.

Approved by: grehan (mentor)


132700 27-Jul-2004 rwatson

Pass a thread argument into cpu_critical_{enter,exit}() rather than
dereference curthread. It is called only from critical_{enter,exit}(),
which already dereferences curthread. This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.

Head nodding: jhb, bmilekic


132689 27-Jul-2004 grehan

Properly implement kdb_cpu_{set|clear}_singlestep to allow DDB to
continue from breakpoints.


132688 27-Jul-2004 grehan

Make sure icache is sync'd whenever memory is touched. It may
be more optimal to override the BKPT_WRITE macro, but DDB performance
isn't really a goal at this stage...


132683 27-Jul-2004 grehan

Save DAR/DSISR in DDB regsave area when stack overflow detected. It's
hard to work out where the problem was without these.


132681 27-Jul-2004 grehan

Improve boot-time debugging with DDB by extracting the ksym start/end
values from the loader.


132666 26-Jul-2004 alc

Implement the protection check required by the pmap_extract_and_hold()
specification.

Reviewed and tested by: grehan@


132580 23-Jul-2004 gallatin

Let ddb know powerpc is big endian so as to make ddb output
human readable.

Obtained from: sparc64/include/db_machdep.h


132571 23-Jul-2004 grehan

Detect kernel stack excursion into guard pages. Drop into KDB
with a wired stack if this is found.

Mostly obtained from: NetBSD


132570 23-Jul-2004 grehan

Bring KDB stack size into line with thread stack size (4 pages).


132569 23-Jul-2004 grehan

Allow DSI exceptions to invoke DDB.


132562 23-Jul-2004 grehan

The ADDR16 relocations were assuming that non-local symbols had an
addend of 0. This isn't correct, and was quite easy to break by
referring to the address of an element within a structure.

However, fixing this exposed the fact that symbol lookups for
local variables were returning the base of the section they
were contained in. This case is detected by comparing the return
value from elf_lookup() to the relocbase+addend value: if it is
lesser, but greater than relocbase, then relocbase+addend is
taken to be the authoritative value.

bug reported by: gallatin


132520 22-Jul-2004 grehan

Update the callframe structure to leave space for the frame pointer
and saved link register as per the ABI call sequence. Update code
that uses this (fork_trampoline etc) to use the correct genassym'd
offsets.

This fixes the 'invalid LR' message when backtracing kernel
threads in DDB.


132519 22-Jul-2004 gallatin

Make this compile: add sys/module.h and KDBify.


132482 21-Jul-2004 marcel

Unify db_stack_trace_cmd(). All it did was look up the thread given
the thread ID and call db_trace_thread().
Since arm has all the logic in db_stack_trace_cmd(), rename the
new DB_COMMAND function to db_stack_trace to avoid conflicts on
arm.
While here, have db_stack_trace parse its own arguments so that
we can use a more natural radix for IDs. If the ID is not a thread
ID, or more precisely when no thread exists with the ID, try if
there's a process with that ID and return the first thread in it.
This makes it easier to print stack traces from the ps output.

requested by: rwatson@
tested on: amd64, i386, ia64


132428 20-Jul-2004 grehan

elf_cpu_load_file no longer has an __unused variable. Also, don't
bother syncing the icache for the special case of the kernel (id == 1),
since the loader has already done this.

__unused use reported by: gallatin


132426 20-Jul-2004 grehan

Properly obey PPC context synchronization rules when modifying
the address translation bits of the MSR. This fixes the boot-time
panic reported by Drew Gallatin.


132421 19-Jul-2004 gallatin

Fix printing of long doubles to match the size that
gcc is using. This fixes devstat consumers (like vmstat, iostat,
systat) so they don't print crazy zillion digit numbers for
disk transfers and bandwidth.

According to gcc, long doubles are 64-bits, rather than 128 bits
like the SVR4 ABI spec wants them to be.. Note that MacOSX also treats
long doubles as 64-bits, and not 128 bits, so we are in good company.

Reviewed by: das
Approved by: grehan


132383 19-Jul-2004 das

Make FLT_ROUNDS correctly reflect the dynamic rounding mode.


132380 19-Jul-2004 grehan

Use the version field to identify the partial context used by
KSE process-scope threads.


132375 19-Jul-2004 grehan

Empty GENERIC.hints file needed by make release.

Noticed by: Suleiman Souhlal <refugee@segfaulted.com>


132345 18-Jul-2004 maxim

In -CURRENT pseudo devices are not statically assigned at compile time,
remove a stale comment.

PR: kern/62285


132282 17-Jul-2004 grehan

Resurrect kld support. Support ADDR16_HA/LA relocations, and sync
the icache on module load. Requires "-mlongcall" support, in gcc >= 3.3
but needs a bugfix to support gcc arith builtins.


132220 15-Jul-2004 alc

Push down the acquisition and release of the page queues lock into
pmap_protect() and pmap_remove(). In general, they require the lock in
order to modify a page's pv list or flags. In some cases, however,
pmap_protect() can avoid acquiring the lock.


132088 13-Jul-2004 davidxu

Add ptrace_clear_single_step(), alpha already has it for years, the function
will be used by ptrace to clear a thread's single step state.


132075 12-Jul-2004 grehan

Rename low-level code ddb -> db. Use KDB instead of DDB.
Fix bug in setup of stack frame where 8 bytes wasn't being
saved for the callee's frame pointer and saved LR.


132074 12-Jul-2004 grehan

Bring into KDB new order.


132073 12-Jul-2004 grehan

- DDB -> KDB, with kdb routines
- ddb -> db for low-level trapcode
- implement makectx. I think it only matters that the stack is setup
correctly.
- bring over ddb_trap_glue and rename to db_trap_glue


132072 12-Jul-2004 grehan

No need for ddb option. Never a need for ipkdb option.


132071 12-Jul-2004 grehan

Catch up with gratuitous ddb -> db renaming


132070 12-Jul-2004 grehan

Bring into line with KDB. Bring in NetBSD updates for backtrace routine,
although it really needs a decent re-work.


132069 12-Jul-2004 grehan

Remove unused NetBSD code. Bring mem r/w routines into here in line
with sparc64, although keep the size deref checks: they are useful
when accessing PCI space where some devices may not implement byte
access.


132068 12-Jul-2004 grehan

Gratuitous namechange to avoid low-level association with ddb.


132067 12-Jul-2004 grehan

Add prototype for KDB's makectx routine


132066 12-Jul-2004 grehan

Remove old NetBSD-derived unused code and stuff that is now obsolete
due to KDB.


132065 12-Jul-2004 grehan

DDB -> KDB, and rename low-level trap handler to avoid name conflict.


132064 12-Jul-2004 grehan

kdb.h for PowerPC. Stubs for now.


132063 12-Jul-2004 grehan

Add new KDB option, and also drop in long-held fxp/dc eth drivers.


132011 12-Jul-2004 alc

pmap_remove_pages() must not remove wired mappings. Since
pmap_remove_pages() is an optimization, its implementation is optional.

Discussed with: grehan


131867 09-Jul-2004 grehan

- correctly set the return value for the copyin/out fault buffer to 1
so setfault would return correctly when a page fault was invalid
(e.g. a syscall with a bad parameter).

This caused an endless DSI loop, seen when running sendmail which
does a setlogin() call with a NULL pointer.

- introduce KTR_SYSC tracing. expose the syscallnames[] array to
make the tracing more readable.


131808 08-Jul-2004 grehan

G4 requires isync after 256Mb ibat/dbat update, G3 requires
isync after each bat update. Otherwise, pmap_bootstrap causes
an ISI exception. A fall-out of loader BAT removal.


131698 06-Jul-2004 grehan

- trailing white-space cleanup
- add call to thread_user_enter for P_SA processes before
trap processing ala all other arches


131687 06-Jul-2004 obrien

In the spirit of amd64/include/stdarg.h rev 1.6; add __va_copy
(but keep it conditional on __ISO_C_VISIBLE >= 1999.

Why? Our out /usr/src/contrib assumes it, and more than a few ports have
an autoconf that looks for __va_copy because it is available on glibc.
It is critical that we use it on PowerPC. It generally isn't a problem
for i386 and its ilk because those platforms can get away with cheating
the C standard, using a plain assignment.


131671 06-Jul-2004 grehan

Add 32-bit framebuffer support. Tested on PearPC at lo-res, too painful
to watch at hi-res.


131658 05-Jul-2004 alc

Correct pmap_extract()'s return type. It should be vm_paddr_t, not
vm_offset_t.


131481 02-Jul-2004 jhb

Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
determine if a thread about to be added to a run queue should be
preempted to directly. If it is not safe to preempt or if the new
thread does not have a high enough priority, then the function returns
false and sched_add() adds the thread to the run queue. If the thread
should be preempted to but the current thread is in a nested critical
section, then the flag TDF_OWEPREEMPT is set and the thread is added
to the run queue. Otherwise, mi_switch() is called immediately and the
thread is never added to the run queue since it is switch to directly.
When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
setrunqueue() now does all the correct work. This also removes the
do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
chance to run if the architecture supports native preemption since
the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by: scottl (with his re@ hat)


131401 01-Jul-2004 grehan

Modify loop test when cycling through phys_avail array. It's possible
for an OpenFirmware implementation to have a single memory region
(hello PearPC).


131400 01-Jul-2004 grehan

Catch up with __RMAN_RESOURCE_VISIBLE change


131399 01-Jul-2004 grehan

Move soft structs back to C files to avoid exposing rman fields
to clients now that it's protected with __RMAN_RESOURCE_VISIBLE


131102 25-Jun-2004 grehan

Catchup to now-required <sys/module.h> for PowerPC


130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


130028 03-Jun-2004 tjr

Remove checks for curthread == NULL - it can't happen.


130023 03-Jun-2004 tjr

Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by: jhb


129750 26-May-2004 tmm

Retire cpu_sched_exit(); it is not used any more.


129444 19-May-2004 bde

Moved most of the "MI" definitions and declarations from <machine/profile.h>
to <sys/gmon.h>. Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.


129416 19-May-2004 grehan

trap_pfault() shouldn't be acquiring Giant. Found to blow up
with MUTEX_PROFILING.

Submitted by: Suleiman Souhlal <refugee@segfaulted.com>


129393 18-May-2004 stefanf

<stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is
defined. Otherwise first including <wchar.h> and then <stdint.h> leads to no
WINT_M{AX,IN} at all.

PR: 64956
Approved by: das (mentor)


129282 16-May-2004 peter

Make a small revision to the api between the elf linker core and the
elf_reloc() backends for two reasons. First, to support the possibility
of there being two elf linkers in the kernel (eg: amd64), and second, to
pass the relocbase explicitly (for relocating .o format kld files).


128845 02-May-2004 marcel

Add option GEOM_GPT. This brings the ability to have a large number of
partitions on a single disk.


128838 02-May-2004 obrien

Spell Ethernet correctly.


128629 25-Apr-2004 das

Hide FLT_EVAL_METHOD and DECIMAL_DIG in pre-C99 compilation
environments.

PR: 63935
Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>


128595 23-Apr-2004 grehan

- Catch up with recent ATA changes.
- Remove trailing space in ata_macio.c


128540 21-Apr-2004 grehan

Include <machine/pte.h> since it has been removed from <machine/param.h>.


128539 21-Apr-2004 grehan

<machine/pte.h> has no business being here. Finally exposed by F77 build
failure.


128395 18-Apr-2004 alc

MFamd64
Simplify the sf_buf implementation. In short, make it a veneer
over the direct virtual-to-physical mapping.


128103 11-Apr-2004 alc

Remove avail_end. It is not used.


127977 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


127875 05-Apr-2004 alc

Remove avail_start on those platforms that no longer use it. (Only amd64
does anything with it beyond simple initialization.)


127869 05-Apr-2004 alc

Remove unused arguments from pmap_init().


127788 03-Apr-2004 alc

In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it
should not. Add a new parameter so that the caller can specify which is
the case.

Reported by: dillon


127703 01-Apr-2004 grehan

Match the specific MPC106 host bridge PCI ID rather than all
generic host bridges: this avoids a race with the UniNorth
generic match.


127659 31-Mar-2004 grehan

The end argument to bus_alloc_resource() should have been ~0 and
not ~1, but the call has been switched over to bus_alloc_resource_any()
which has the same effect.

Submitted by: Suleiman Souhlal <refugee@segfaulted.com>


127618 30-Mar-2004 benno

Replace td2 with td on the assumption that this was a typo. This should at
least unbreak the build.

Pointy hat to: peter
Not tested either by: benno


127582 29-Mar-2004 peter

Finish tidying up a couple of leftovers from the KSTACK_PAGES stuff. Some
files still #included the opt_ file. powerpc hadn't been updated yet.


127333 23-Mar-2004 alc

Add an implementation of uiomove_fromphys() for PowerPC. This
implementation uses the direct virtual-to-physical mapping.

Discussed with: grehan


127239 20-Mar-2004 marcel

Introduce the cpumask_t type. The purpose of the type is to create a
level of abstraction for any and all CPU mask and CPU bitmap variables
so that platforms have the ability to break free from the hard limit
of 32 CPUs, simply because we don't have more bits in an u_int. Note
that the type is not supposed to solve massive parallelism, where
the number of CPUs can be larger than the width of the widest integral
type. As such, cpumask_t is not supposed to be a compound type. If
such would be necessary in the future, we can deal with the issues
then and there. For now, it can be assumed that the type is integral
and unsigned.

With this commit, all MD definitions start off as u_int. This allows
us to phase-in cpumask_t at our leasure without breaking anything.
Once cpumask_t is used consistently, platforms can switch to wider
(or smaller) types if such would be beneficial (or not; whatever :-)

Compile-tested on: i386


127135 17-Mar-2004 njl

Convert callers to the new bus_alloc_resource_any(9) API.

Submitted by: Mark Santcroos <marks@ripe.net>
Reviewed by: imp, dfr, bde


127086 16-Mar-2004 alc

Refactor the existing machine-dependent sf_buf_free() into a machine-
dependent function by the same name and a machine-independent function,
sf_buf_mext(). Aside from the virtue of making more of the code machine-
independent, this change also makes the interface more logical. Before,
sf_buf_free() did more than simply undo an sf_buf_alloc(); it also
unwired and if necessary freed the page. That is now the purpose of
sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used
as a general-purpose emphemeral map cache.


126919 13-Mar-2004 scottl

Now that contigfree() does not require Giant, don't grab it in busdma.


126728 07-Mar-2004 alc

Retire pmap_pinit2(). Alpha was the last platform that used it. However,
ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps,
there has been no reason for having this function on Alpha. Briefly,
when pmap_growkernel() relied upon the list of all processes to find and
update the various pmaps to reflect a growth in the kernel's valid
address space, pmap_init2() served to avoid a race between pmap
initialization and pmap_growkernel(). Specifically, pmap_pinit2() was
responsible for initializing the kernel portions of the pmap and
pmap_pinit2() was called after the process structure contained a pointer
to the new pmap for use by pmap_growkernel(). Thus, an update to the
kernel's address space might be applied to the new pmap unnecessarily,
but an update would never be lost.


126649 05-Mar-2004 le

Fix syntax errors and wrong function prototypes in several MD header
files when using non-GNUC compilers.

PR: kern/58515
Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>
Approved by: grog (mentor), obrien


126478 02-Mar-2004 grehan

Increase kernel VA from 256Mb to 512Mb by shifting the segment used
for user copyinout down to 12, and keeping segments 13/14 for
kernel VA.

It would be nice to have more available, but segments lower than
this are reserved for either memory or 1:1 mapped device i/o,
and seg 15 is OpenFirmware ROM. Also, the effort to keep OpenFirmware
available for callbacks limits the use of VA-mapped segments.
Fortunately UMA_MD_SMALL_ALLOC takes away a lot of VM pressure.

Obtained from: NetBSD


126474 02-Mar-2004 grehan

Kernel changes for libthr (and probably libpthread).

include/ucontext.h
- remove trapframe and switch over to 'generic' description of machine
state. Include version field to help with future modifications.
Include floating point and altivec state, and hopefully align
correctly

powerpc/copyinout.c
- fill out casuptr() sync primitive, required by kern_umtx.c

powerpc/machdep.c
- shifted proc0/thread0/pcpu setup to before cninit, since
syscons -> make_dev -> devlock requires a valid curthread
- implemented get_mcontext/set_mcontext
- recast sendsig/sigreturn to use get/set_mcontext and new
ucontext struct. floating point now saved
- TODO: save/restore altivec state

powerpc/vm_machdep.c
- implemented cpu_thread_setup/cpu_set_upcall/cpu_set_upcall_kse
- eliminated trailing whitespace

Submitted by: Suleiman Souhlal <refugee@segfaulted.com>, ucontext by grehan


126394 29-Feb-2004 grehan

Bring to working PIO state.
- use correct rid when allocating PCI mem resource
- ATA taskfile registers are indeed spaced 0x10 apart just like
the Macio ATA cell. Adjust offsets in ATA channel struct.

Tested by: Suleiman Souhlal <ssouhlal@vt.edu>


125735 12-Feb-2004 grehan

Work-in-progress for the 'Kauai' ATA device in Mac notebooks. The
device seems to be the macio ATA cell with a PCI front-end, and
has no relation to PIIX-style ATA/PCI devices.


125734 12-Feb-2004 grehan

Add sys file required for IEEE fp functions.

Submitted by: Suleiman Souhlal <refugee@segfaulted.com>


125708 11-Feb-2004 grehan

Interrupt statistics, vmstat -i now works.

Submitted by: Suleiman Souhlal <refugee@segfaulted.com>
Slightly modified by: grehan
Derived from: i386


125705 11-Feb-2004 grehan

Clean up header files, which fixes compile warning.


125702 11-Feb-2004 grehan

- constify devinfo strings to eliminate compile warning
- remove trailing whitespace


125691 11-Feb-2004 grehan

- fix compile warnings
- removed obsolete NetBSD-derived ADB conditionals


125690 11-Feb-2004 grehan

- fixed trailing whitespace and indentation
- removed unused variable to fix compile warning


125689 11-Feb-2004 grehan

Fix compile warning


125688 11-Feb-2004 grehan

- remove trailing whitespace
- fix compile warnings. badaddr() will go to a header file soon.


125687 11-Feb-2004 grehan

Cleaned up param.h:

- culled long-dead #define's
- segment register defs moved to sr.h
- NPMAPS moved to pmap.h
- KERNBASE moved to vmparam.h
- removed include of <machine/cpu.h> and fixed src files that
relied on this.

Modifying segment register code no longer causes gcc rebuilds :-)


125686 11-Feb-2004 grehan

Invalide pcb's fpu cpu # by setting it to INT_MAX, not NULL.


125682 11-Feb-2004 grehan

Add sysctl hw.uma_mdpages to track how many pages have been allocated
by UMA_MD_SMALL_ALLOC


125677 10-Feb-2004 grehan

Correctly create interrupt key for PCI, which is the OpenFirmware
pci-hi/med/lo + node 'interrupts' property. This worked by
accident until recent notebooks required correct operation.

Tested by: Suleiman Souhlal <refugee@segfaulted.com>


125617 09-Feb-2004 grehan

Disable branch-target instruction cache on MPC7457 as outlined
in Motorola processor errata.

Submitted by: Suleiman Souhlal <refugee@segfaulted.com>


125615 09-Feb-2004 grehan

Recognize MPC7547 (aka G4+)


125614 09-Feb-2004 grehan

Definitions for MPC7457 CPU type and HID0 bits


125443 04-Feb-2004 grehan

- add a description of what .gdbinit should contain.
- add an option for the output device in the hope that this can
be made non-blocking at some stage.
- define an alias for the disk device, required by dev/ofw/ofw_disk.c
- shift iobus to 0x9000000 so as not to clash with the OpenFirmware
entry point of 0x8000400 when address decoding.
- down-tone comments about the disk dev config :-)


125442 04-Feb-2004 grehan

Remove pmap_pvo_allocf zone alloc function. It was a way of
using the direct-mapping of physmem to force PTE data structures
to be physically addressable so the interrupt-time real-mode
DSI trap handler could perform PTE spills. However, the memory
may have been > 256Mb, which would have caused a BAT spill and
double-interrupt.

The new trap code no longer handles PTE spills, so the requirement
that these pages be direct-mapped no longer applies. The irony is
UMA_MD_SMALL_ALLOC will return direct mappings for these structs :-)


125441 04-Feb-2004 grehan

Major overhaul of common trap code
- remove unused 601 and tlb exception code
- remove interrupt-time PTE spill code. The pmap code
will now take care of pinning kernel PTEs, and there
are no longer issues about physical mapping of PTE
data structures
- All segment registers are switched on kernel entry/exit,
allowing the kernel to have more virtual space and for
user virtual space to extend to 4G.
- The temporary register save area has been shifted from
unused exception vector space to the per-cpu data area.
This allows interrupts to be delivered to multiple CPUs
- ISI traps no longer spill to BAT tables. It is assumed
that all of kernel instruction memory is pinned.
- shift from 'ldmw/stmw' instructions to individual register
loads/stores when saving context. All PPC manuals indicate
this should be much faster.
- use '%r' for register names throughout.

TODO: need to test if DSI traps were the result of kernel stack
guard-page hits.

Reworked from: NetBSD


125440 04-Feb-2004 grehan

- remove unused trap definitions
- ISI traps are now handled by the generic trap routine
- direct diagnostic traps to DDB if defined
- remove unused asngen pcpu init


125439 04-Feb-2004 grehan

- Lots more symbols required by the new trap_subr code


125438 04-Feb-2004 grehan

- Add definition for GET_CPUINFO, required by new trap_subr code
- garbage-collect unused defs


125437 04-Feb-2004 grehan

Move temporary register save area from exception-vector memory to
per-CPU memory. This allows for interrupt handling on multiple CPUs.

Obtained from: NetBSD


125434 04-Feb-2004 grehan

Allow child devices to set the OpenFirmware device node ivar


125414 04-Feb-2004 grehan

- removed debug printf that was a false positive on non-OpenPIC systems
- white space nits


125378 03-Feb-2004 grehan

Use device alias "mpic" to locate the macio OpenPIC. This works
on the new 12/15/17" PowerBooks that don't have the "interrupt-controller"
property underneath "/chosen", which was the previous way of
searching.


125185 29-Jan-2004 grehan

When UMA_MD_SMALL_ALLOC is defined, pmap_kextract will be called
for direct-mapped addresses. Assume that any address less than KVA
is one of these and return it. Also assert that an address is KVA
does have a valid mapping - callers of pmap_kextract don't check
the return value, since they assume that they have a valid virtual
address.


125184 29-Jan-2004 grehan

Implement UMA_MD_SMALL_ALLOC, since the BAT registers allow direct
addressing of memory. Makes a substantial improvement for apps that
stress the limited amount of KVM on PPC (e.g. untarring the ports tree).

uma_machdep.c stolen from amd64/ia64.


124935 24-Jan-2004 jeff

- Recruit some new ULE users by making it the default scheduler in GENERIC.
ULE will be in a probationary period to determine whether it will be left
as the default in 5.3 which would likely mean the rest of the 5.x series.


124919 24-Jan-2004 nectar

Add PFIL_HOOKS to the GENERIC kernel configuration, primarily so
that one can load the IPFilter module (which requires PFIL_HOOKS).

Requested by: Many, for over a year


124775 21-Jan-2004 grehan

Add syscons options and enable USB, since there is no conflict between
the OpenFirmware console and the syscons console when using a USB
keyboard.


124772 21-Jan-2004 grehan

- Catch up with panic __LINE__/__FILE__ changes by moving panic calls
out of asm.
- remove some long-dead code from machdep.c


124771 21-Jan-2004 grehan

A syscons implementation using the 8-bit framebuffer set up by
OpenFirmware. Not at all optimized, but provides a PC-style
user-experience.

Tested on revA imac, B&W G3, 2k iBook, and G4 eMac.


124768 21-Jan-2004 grehan

Update 128-bit long double constants to match what is expected
by libc


124581 15-Jan-2004 grehan

Catch up with ATA UMA changes


124470 13-Jan-2004 grehan

Use a device identify entry point to attach to nexus, since the
nexus code no longer searches for interrupt controllers.


124469 13-Jan-2004 grehan

Make the OpenPic driver bus-independent, with attachments for
the MacIO chip and PSIM's IOBus. Bus-specific drivers should
use the identify method to attach themselves to nexus so
interrupt can be allocated before the h/w is probed. The
'early attach' routine in openpic is used for this stage
of boot. When h/w is probed, the openpic can be attached
properly. It will enable interrupts allocated prior to
this.


124468 13-Jan-2004 grehan

Remove hard-coded knowledge of specific OFW devices. Use bus_generic_probe
and add_child entry point to allow devices to use the identify
method to add themselves if need be (e.g. openpic, syscons).
Export interrupt-controller-add routine for extern int cntlr drivers.
Eliminate recursive OFW device-tree walk and only iterate the
top-level ala sparc64. Allow child devices to set the device
type with write_ivars.

Step 1 of many in removing the hard-dependency on OpenFirmware.


124466 13-Jan-2004 grehan

Catch up with ATA changes by including <sys/sema.h>


124092 03-Jan-2004 davidxu

Make sigaltstack as per-threaded, because per-process sigaltstack state
is useless for threaded programs, multiple threads can not share same
stack.
The alternative signal stack is private for thread, no lock is needed,
the orignal P_ALTSTACK is now moved into td_pflags and renamed to
TDP_ALTSTACK.
For single thread or Linux clone() based threaded program, there is no
semantic changed, because those programs only have one kernel thread
in every process.

Reviewed by: deischen, dfr


123929 28-Dec-2003 silby

Track three new sendfile-related statistics:
- The number of times sendfile had to do disk I/O
- The number of times sfbuf allocation failed
- The number of times sfbuf allocation had to wait


123920 28-Dec-2003 silby

Move the declaration of sfbufspeak and sfbufsused to mbuf.h,
and use imax instead of max, as sfbufspeak and sfbufsused
are signed.

Submitted by: bde


123884 27-Dec-2003 silby

Track current and peak sfbuf usage, export the values via sysctl.


123791 24-Dec-2003 peter

GC the unused <machine/kse.h> file.


123742 23-Dec-2003 peter

Add an additional field to the elf brandinfo structure to support
quicker exec-time replacement of the elf interpreter on an emulation
environment where an entire /compat/* tree isn't really warranted.


123560 16-Dec-2003 grehan

Disable the per-vm_page PTE cache. This was not being invalidated
correctly, resulting in the dreaded "vm_pageout_flush: partially
invalid page" panic. The caching issue will be revisited in the
future, but opt for safety over performance in the meantime.

Tested by: gallatin


123371 10-Dec-2003 grehan

imac revA-D and beige G3 OpenFirmware uses the "ide" string for
ATA drives.


123370 10-Dec-2003 grehan

- removed obsolete ppc_exit/ppc_boot functions
- OpenFirmware returns overlapping memory regions. Use a simple
brute force algorithm to merge these into non-overlapping
regions. This fixes bugs in reporting of available memory
and also prevents pages from being added twice in the VM system.


123357 09-Dec-2003 gallatin

Remove redundant declaration of ddb_trap


123354 09-Dec-2003 gallatin

pmap_query_bit() should return false if the bit is not set.

Reviewed by: grehan


123353 09-Dec-2003 gallatin

Use the "shut-down" and "reset-all" Forth procedures to halt and
reboot, as calling OF_exit() just hangs a mac.

FreeBSD on my G4 800Mhz mac behaves identically to OSX for halt
and reboot now.

Reviewed by: grehan (who also supplied the concept and sample code)


123352 09-Dec-2003 gallatin

Make breakpoint() actually break into ddb.

Reviewed by: grehan


122947 21-Nov-2003 jhb

- Split cpu_mp_probe() into two parts. cpu_mp_setmaxid() is still called
very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid.
cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is
actually present and sets mp_ncpus and all_cpus. Splitting these up
allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just
setting mp_maxid to MAXCPU in cpu_mp_setmaxid(). This could allow the
CPU probing code to live in a module, for example, since modules
sysinit's in modules cannot be invoked prior to SI_SUB_KLD. This is
needed to re-enable the ACPI module on i386.
- For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating
its contents in a few places. Also, add a smp_cpu_enabled() function
to avoid duplicating some code. There is room for further code
reduction later since much of this code is also present in cpu_mp_start().
- All archs besides i386 still set mp_maxid to the same values they set it
to before this change. i386 now sets mp_maxid to MAXCPU.

Tested on: alpha, amd64, i386, ia64, sparc64
Approved by: re (scottl)


122841 17-Nov-2003 peter

Widen the enable/disable helper function's argument in line with the
ithread_create() changes etc. This should be mostly a NOP.


122821 16-Nov-2003 alc

- Remove unnecessary synchronization from sf_buf_init(). (There is only
one active CPU when sf_buf_init() is performed.)


122780 16-Nov-2003 alc

- Modify alpha's sf_buf implementation to use the direct virtual-to-
physical mapping.
- Move the sf_buf API to its own header file; make struct sf_buf's
definition machine dependent. In this commit, we remove an
unnecessary field from struct sf_buf on the alpha, amd64, and ia64.
Ultimately, we may eliminate struct sf_buf on those architecures
except as an opaque pointer that references a vm page.


122364 09-Nov-2003 marcel

Change the clear_ret argument of get_mcontext() to be a flags argument.
Since all callers either passed 0 or 1 for clear_ret, define bit 0 in
the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI
code for possible (but unlikely) future use. The remaining bits are for
use by MD code.

This change is triggered by a need on ia64 to have another knob for
get_mcontext().


121237 19-Oct-2003 peter

Add a stub cpu_idle() function for sparc64, alpha, powerpc. This is a
MI declared function so it should be everywhere.


120831 06-Oct-2003 bms

Move pmap_resident_count() from the MD pmap.h to the MI pmap.h.
Add a definition of pmap_wired_count().
Add a definition of vmspace_wired_count().

Reviewed by: truckman
Discussed with: peter


120722 03-Oct-2003 alc

Migrate pmap_prefault() into the machine-independent virtual memory layer.

A small helper function pmap_is_prefaultable() is added. This function
encapsulate the few lines of pmap_prefault() that actually vary from
machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have
much in common. Going forward, it's worth considering their merger.


120460 26-Sep-2003 grehan

DELAY must be a routine, not a macro definition.


120422 25-Sep-2003 peter

Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function. Export the clip/default values to
sysctl under the compat.ia32 heirarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable. This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'. And that causes exec etc to fail since it can no
longer find space to mmap things.


120395 24-Sep-2003 grehan

_MACHINE/_MACHINE_ARCH shouldn't be quoted. Found by trying to
compile the isp driver.


120336 22-Sep-2003 grehan

Soften assert in pmap_remove_all.
Introduct pmap_extract_and_hold.

Stolen from: sparc64


120335 22-Sep-2003 grehan

ATAng requires <sys/taskqueue.h>


119628 01-Sep-2003 kan

Standardize idempotentcy ifdefs. Consistently use _MACHINE_VARARGS_H_
symbol.


119563 29-Aug-2003 alc

Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy
sockets into machine-dependent files. The rationale for this
migration is illustrated by the modified amd64 allocator. It uses the
amd64's direct map to avoid emphemeral mappings in the kernel's
address space. On an SMP, the emphemeral mappings result in an IPI
for TLB shootdown for each transmitted page. Yuck.

Maintainers of other 64-bit platforms with direct maps should be able
to use the amd64 allocator as a reference implementation.


119291 22-Aug-2003 imp

Prefer new location of pci include files (which have only been in the
tree for two or more years now), except in a few places where there's
code to be compatible with older versions of FreeBSD.


119015 17-Aug-2003 gordon

Fixup the ELF branding information to point to the new home of rtld.


119004 16-Aug-2003 marcel

In vm_thread_swap{in|out}(), remove the alpha specific conditional
compilation and replace it with a call to cpu_thread_swap{in|out}().
This allows us to add similar code on ia64 without cluttering the
code even more.


118990 16-Aug-2003 marcel

Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.

ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).

alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a seperate commit.

powerpc: MD prototypes left in cpu.h. Comment added.

Suggested by: bde
Tested with: make universe (pc98 incomplete)


118893 14-Aug-2003 grehan

Update powerpc to use the (old thread,new thread) calling convention
for cpu_throw() and cpu_switch().


118848 12-Aug-2003 imp

Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's
copyrighted files.

Approved by: Matt Dillon


118443 04-Aug-2003 jhb

- Since td_critnest is now initialized in MI code, it doesn't have to be
set in cpu_critical_fork_exit() anymore.
- As far as I can tell, cpu_thread_link() has never been used, not even
when it was originally added, so remove it.


118383 03-Aug-2003 obrien

Deal with GCC annoyingly defining _BIG_ENDIAN.


118365 02-Aug-2003 alc

Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in pmap_mapdev().
See revision 1.140 of kern/sys_pipe.c for a detailed rationale.

Submitted by: tegge


118244 31-Jul-2003 bmilekic

Make sure that when the PV ENTRY zone is created in pmap, that it's
created not only with UMA_ZONE_VM but also with UMA_ZONE_NOFREE. In
the i386 case in particular, the pmap code would hook a special
page allocation routine that allocated from kernel_map and not kmem_map,
and so when/if the pageout daemon drained the zones, it could actually
push out slabs from the PV ENTRY zone but call UMA's default page_free,
which resulted in pages allocated from kernel_map being freed to
kmem_map; bad. kmem_free() ignores the return value of the
vm_map_delete and just returns. I'm not sure what the exact
repercussions could be, but it doesn't look good.

In the PAE case on i386, we also set-up a zone in pmap, so be
conservative for now and make that zone also ZONE_NOFREE and
ZONE_VM. Do this for the pmap zones for the other archs too,
although in some cases it may not be entirely necessarily. We'd
rather be safe than sorry at this point.

Perhaps all UMA_ZONE_VM zones should by default be also
UMA_ZONE_NOFREE?

May fix some of silby's crashes on the PV ENTRY zone.


118239 31-Jul-2003 peter

Deal with 'options KSTACK_PAGES' being a global option.


118100 27-Jul-2003 alc

Make pmap_pvo_allocf() callable without Giant.


118081 27-Jul-2003 mux

- Introduce a new busdma flag BUS_DMA_ZERO to request for zero'ed
memory in bus_dmamem_alloc(). This is possible now that
contigmalloc() supports the M_ZERO flag.
- Remove the locking of Giant around calls to contigmalloc() since
contigmalloc() now grabs Giant itself.


117600 15-Jul-2003 davidxu

Rename thread_siginfo to cpu_thread_siginfo.

Suggested by: jhb


117206 03-Jul-2003 alc

Background: pmap_object_init_pt() premaps the pages of a object in
order to avoid the overhead of later page faults. In general, it
implements two cases: one for vnode-backed objects and one for
device-backed objects. Only the device-backed case is really
machine-dependent, belonging in the pmap.

This commit moves the vnode-backed case into the (relatively) new
function vm_map_pmap_enter(). On amd64 and i386, this commit only
amounts to code rearrangement. On alpha and ia64, the new machine
independent (MI) implementation of the vnode case is smaller and more
efficient than their pmap-based implementations. (The MI
implementation takes advantage of the fact that objects in -CURRENT
are ordered collections of pages.) On sparc64, pmap_object_init_pt()
hadn't (yet) been implemented.


117126 01-Jul-2003 scottl

Mega busdma API commit.

Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.

sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.

If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.

Reviewed by: tmm, gibbs


117045 29-Jun-2003 alc

- Export pmap_enter_quick() to the MI VM. This will permit the
implementation of a largely MI pmap_object_init_pt() for vnode-backed
objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64
and powerpc.
- Correct a mismatch between pmap_object_init_pt()'s prototype and its
various implementations. (I plan to keep pmap_object_init_pt() as
the MD hook for device-backed objects on i386 and amd64.)
- Correct an error in ia64's pmap_enter_quick() and adjust its interface
to match the other versions. Discussed with: marcel


117017 29-Jun-2003 grehan

Allow the interrupt controller to be probed - this picks up the
Heathrow PIC, while not affecting the OpenPIC.


116965 28-Jun-2003 grehan

A module to handle the interrupt controller on Heathrow/Paddington
MacIO chips, found on older Mac G3's.


116964 28-Jun-2003 grehan

A module for the Motorola MPC106 system controller aka 'Grackle'
found on older Mac G3's.


116958 28-Jun-2003 davidxu

Add a machine depended function thread_siginfo, SA signal code
will use the function to construct a siginfo structure and use
the result to export to userland.

Reviewed by: julian


116907 27-Jun-2003 scottl

Do the first and mostly mechanical step of adding mutex support to the
bus_dma async callback scheme. Note that sparc64 does not seem to do
async callbacks. Note that ia64 callbacks might not be MPSAFE at the
moment. Note that powerpc doesn't seem to do async callbacks due to
the implementation being incomplete.

Reviewed by: mostly silence on arch@


116804 25-Jun-2003 grehan

Remove unused bootpath[] variable. It conflicted with a declaration
in the sunlabel utility, causing build problems.


116355 14-Jun-2003 alc

Migrate the thread stack management functions from the machine-dependent
to the machine-independent parts of the VM. At the same time, this
introduces vm object locking for the non-i386 platforms.

Two details:

1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set
KSTACK_GUARD_PAGES to 0.

2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().


116328 14-Jun-2003 alc

Move the *_new_altkstack() and *_dispose_altkstack() functions out of the
various pmap implementations into the machine-independent vm. They were
all identical.


116188 11-Jun-2003 peter

GC unused cpu_wait() function


115999 08-Jun-2003 jmallett

Note that scbus is required for SCSI, not just "required" in general.

Submitted by: Edward Kaplan (tmbg37 on IRC)
Reviewed by: rwatson (in principle)


115858 04-Jun-2003 marcel

Change the second (and last) argument of cpu_set_upcall(). Previously
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simply bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.

Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.

This change officially marks the ability on ia64 for 1:1 threading.

Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64


115614 01-Jun-2003 phk

Remove #include <sys/disklabel.h>


115343 27-May-2003 scottl

Bring back bus_dmasync_op_t. It is now a typedef to an int, though the
BUS_DMASYNC_ definitions remain as before. The does not change the ABI,
and reverts the API to be a bit more compatible and flexible. This has
survived a full 'make universe'.

Approved by: re (bmah)


115164 19-May-2003 kan

sys/sys/limits.h:

- Fix visibilty test for LONG_BIT and WORD_BIT. `#if defined(__FOO_VISIBLE)'
is alays wrong because __FOO_VISIBLE is always defined (to 0 for
invisibility).

sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:

- Style fixes.

Submitted by: bde
Reviewed by: bsdmike
Approved by: re (scottl)


114983 13-May-2003 jhb

- Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
and thread_stopped() are now MP safe.

Reviewed by: arch@
Approved by: re (rwatson)


114727 05-May-2003 obrien

Things run thru the C preprocessor must use C-style comments.


114678 04-May-2003 kan

Style fixes.
Remove DBL_DIG, DBL_MIN, DBL_MAX and their FLT_ counterparts, they
were marked for deprecation ever since SUSv1 at least.
Only define ULLONG_MIN/MAX and LLONG_MAX if long long type is
supported.
Restore a lost comment in MI _limits.h file and remove it from
sys/limits.h where it does not belong.


114374 01-May-2003 peter

Back out last commits. The elf64/elf32 kernel name thing was more pain
than it was worth.


114373 01-May-2003 peter

Slight reorg and added AMD64 support. A couple of the MODINFOMD_* values
that were added to sparc64 and later powerpc, really should have been in
the MI area. But changing that now with insufficient preperation will
just cause too much pain.

Move MD_FETCH() to the MI sys/linker.h file to avoid another two copies
of it.


114342 30-Apr-2003 peter

Fix transcription error. Use == NULL, not != NULL. Fortunately this
was harmless.


114340 30-Apr-2003 peter

Look for an elf32 kernel (powerpc) and elf64 kernel (sparc64) as well
as a plain "elf kernel".


114305 30-Apr-2003 jhb

Range check the syscall number before looking it up in the syscallnames[]
array.

Submitted by: pho


114216 29-Apr-2003 kan

Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on: standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>


114029 25-Apr-2003 jhb

- Push down Giant into the sysarch() calls that still need Giant.
- Standardize on EINVAL rather than EOPNOTSUPP if the sysarch op value is
invalid.


113998 25-Apr-2003 deischen

Add an argument to get_mcontext() which specified whether the
syscall return values should be cleared. The system calls
getcontext() and swapcontext() want to return 0 on success
but these contexts can be switched to at a later time so
the return values need to be cleared in the saved register
sets. Other callers of get_mcontext() would normally want
the context without clearing the return values.

Remove the i386-specific context saving from the KSE code.
get_mcontext() is not i386-specific any more.

Fix a bad pointer in the alpha get_mcontext() code. The
context was being bcopy()'d from &td->tf_frame, but tf_frame
is itself a pointer, so the thread was being copied instead.
Spotted by jake.

Glanced at by: jake
Reviewed by: bde (months ago)


113941 23-Apr-2003 kan

Add a new sys/limits.h file which in turn depends on machine/_limits.h
to get actual constant values. This is in preparation for machine/limits.h
retirement.

Discussed on: standards@
Submitted by: Craig Rodrigues <rodrigc@attbi.com> (*)
Modified by: kan


113837 22-Apr-2003 simokawa

add scbus for FireWire.


113803 21-Apr-2003 simokawa

Add FireWire drivers to GENERIC.


113703 19-Apr-2003 grehan

Fix compile warning - proc should have been thread.


113650 18-Apr-2003 grehan

Remove reference to ata resource in print_child.


113649 18-Apr-2003 grehan

Remove sparse address hack.


113648 18-Apr-2003 grehan

Vastly simplify the macio ATA attachment, now that the register file
indirection is handled in the ATA common code.


113647 18-Apr-2003 grehan

Remove sparse addressing hack. The macio ATA driver no longer requires
this.


113646 18-Apr-2003 grehan

- Convert NetBSD-derived macros to inline functions for better
type-checking and future debug code.
- Remove sparse addressing hack, since the only consumer, the macio ATA
driver, doesn't require it anymore.


113350 10-Apr-2003 mux

I deserve a big pointy hat for having missed all those references
to bus_dmasync_op_t in my last commit.


113347 10-Apr-2003 mux

Change the operation parameter of bus_dmamap_sync() from an
enum to an int and redefine the BUS_DMASYNC_* constants as
flags. This allows us to specify several operations in one
call to bus_dmamap_sync() as in NetBSD.


113255 08-Apr-2003 des

Introduce an M_ASSERTPKTHDR() macro which performs the very common task
of asserting that an mbuf has a packet header. Use it instead of hand-
rolled versions wherever applicable.

Submitted by: Hiten Pandya <hiten@unixdaemons.com>


113090 04-Apr-2003 des

Define ovbcopy() as a macro which expands to the equivalent bcopy() call,
to take care of the KAME IPv6 code which needs ovbcopy() because NetBSD's
bcopy() doesn't handle overlap like ours.

Remove all implementations of ovbcopy().

Previously, bzero was a function pointer on i386, to save a jmp to
bzero_vector. Get rid of this microoptimization as it only confuses
things, adds machine-dependent code to an MD header, and doesn't really
save all that much.

This commit does not add my pagezero() / pagecopy() code.


113038 03-Apr-2003 obrien

Use __FBSDID rather than rcsid[].


112898 01-Apr-2003 jeff

- Define a new md function 'casuptr'. This atomically compares and sets
a pointer that is in user space. It will be used as the basic primitive
for a kernel supported user space lock implementation.
- Implement this function in x86's support.s
- Provide stubs that return -1 in all other architectures. Implementations
will follow along shortly.

Reviewed by: jake


112888 31-Mar-2003 jeff

- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with
a follow on commit to kern_sig.c
- signotify() now operates on a thread since unmasked pending signals are
stored in the thread.
- PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.


112883 31-Mar-2003 jeff

- Change trapsignal() to accept a thread and not a proc.
- Change all consumers to pass in a thread.

Right now this does not cause any functional changes but it will be important
later when signals can be delivered to specific threads.


112569 25-Mar-2003 jake

- Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses larger than virtual addresses, such as i386s
with PAE.
- Use this to represent physical addresses in the MI vm system and in the
i386 pmap code. This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by: DARPA, Network Associates Laboratories
Discussed with: re, phk (cdevsw change)


112498 22-Mar-2003 ru

Remove bitrot associated with `maxusers'.

Submitted by: bde


112436 20-Mar-2003 mux

Use atomic operations to increment and decrement the refcount
in busdma tags. There are currently no tags shared accross
different drivers so this isn't needed at the moment, but it
will be required when we'll have a proper newbus method to get
the parent busdma tag.


112429 20-Mar-2003 grehan

Enable the FPU on first use per-thread and save state across context
switches. Not as lazy as it could be. Changing FPU state with sigcontext
still TODO.

fpu.c - convert some asm to inline C, and macroize fpu loads/stores
swtch.S - call out to save/restore fpu routines
trap.c - always call enable_fpu, since this shouldn't be called once
the FPU has been enabled for a thread
genassym.c - define for pcb fpu flag


112428 20-Mar-2003 grehan

- Add PCI ID for Paddington i/o controller, used in old G3's
- Add ID for the Intrepid i/o controller, used in new 12"/17" PowerBooks
- put IDs in chronological order


112402 19-Mar-2003 grehan

Add machine check handler. While generally useful, it's required when
issuing PCI config cycles on MPC106-based PowerMacs, which cause machine
checks when accessing non-existent/empty slots.


112312 16-Mar-2003 jake

Made the prototypes for pmap_kenter and pmap_kremove MD. These functions
are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.


112196 13-Mar-2003 mux

Grab Giant around calls to contigmalloc() and contigfree() so
that drivers converted to be MP safe don't have to deal with it.


111883 04-Mar-2003 jhb

Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls
to WITNESS_WARN().


111814 03-Mar-2003 grehan

Simplify ofw_pci_fixup(). It doesn't need to be recursive, since the
bridge code already handles IRQ adjustment on the far side of a bridge.

Reviewed by: benno


111655 28-Feb-2003 benno

These files are no longer used. They have been replaced with similarly named
.S files.


111551 26-Feb-2003 grehan

Register typo and incorrect 32-bit constant load in previous commit.
Resulted in AST delivery not working.


111524 26-Feb-2003 mux

Correctly set BUS_SPACE_MAXSIZE in all the busdma backends.
It was bogusly set to 64 * 1024 or 128 * 1024 because it was
bogusly reused in the BUS_DMAMAP_NSEGS definition.


111462 25-Feb-2003 mux

Cleanup of the d_mmap_t interface.

- Get rid of the useless atop() / pmap_phys_address() detour. The
device mmap handlers must now give back the physical address
without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
int. Now we properly pass a vm_offset_t * and expect it to be
filled by the mmap handler when the mapping was successful. The
mmap handler must now return 0 when successful, any other value
is considered as an error. Previously, returning -1 was the only
way to fail. This change thus accidentally fixes some devices
which were bogusly returning errno constants which would have been
considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
no longer used.
- Convert all the d_mmap_t consumers to the new API.

I'm still not sure wheter we need a __FreeBSD_version bump for this,
since and we didn't guarantee API/ABI stability until 5.1-RELEASE.

Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386


111404 24-Feb-2003 grehan

Catch up with ATAng changes


111316 23-Feb-2003 grehan

Doh. Forgot to remove _KERNEL version.


111267 22-Feb-2003 grehan

Expose powerpc_mb() to user-space. Currently needed for atomic.h users,
this may go away in the future.


111156 20-Feb-2003 grehan

Adjust IRQ count for psim's OpenPIC model - it seems to be
off by 1.


111155 20-Feb-2003 grehan

Catch up to latest KSE changes


111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


111032 17-Feb-2003 julian

Move a bunch of flags from the KSE to the thread.
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.

Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@


111028 17-Feb-2003 jeff

- Split the struct kse into struct upcall and struct kse. struct kse will
soon be visible only to schedulers. This greatly simplifies much the
KSE code.

Submitted by: davidxu


111024 17-Feb-2003 jeff

- Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back into
the proc. These counters are only examined through calcru.

Submitted by: davidxu
Tested on: x86, alpha, UP/SMP


111002 16-Feb-2003 phk

Remove #include <sys/dkstat.h>


110832 13-Feb-2003 obrien

Fix whitespace problems with option lines.


110831 13-Feb-2003 obrien

Fix the style of the SCHED_4BSD commit.


110786 13-Feb-2003 grehan

Missed odd address test when transcribing the Alpha version.
This fixes the checksum problems seen with telnet.


110566 08-Feb-2003 mike

Implement fpclassify():
o Add a MD header private to libc called _fpmath.h; this header
contains bitfield layouts of MD floating-point types.
o Add a MI header private to libc called fpmath.h; this header
contains bitfield layouts of MI floating-point types.
o Add private libc variables to lib/libc/$arch/gen/infinity.c for
storing NaN values.
o Add __double_t and __float_t to <machine/_types.h>, and provide
double_t and float_t typedefs in <math.h>.
o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF,
HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to
<math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via
<machine/float.h>.
o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based
on the size of its argument. __fpclassifyl() is never called on
alpha because (sizeof(long double) == sizeof(double)), which is good
since __fpclassifyl() can't deal with such a small `long double'.

This was developed by David Schultz and myself with input from bde and
fenner.

PR: 23103
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
(significant portions)
Reviewed by: bde, fenner (earlier versions)


110441 06-Feb-2003 benno

Oops. Include opt_ddb.h.


110439 06-Feb-2003 benno

Add a driver that attaches to the gpio node of macio and allows you to enter
DDB when the interrupt button (aka the "programmer's switch") is pressed.

This isn't unfortunately an NMI, but it's a handy way to get into DDB
quickly if needed.


110437 06-Feb-2003 benno

Add a cast to silence a warning.


110436 06-Feb-2003 benno

If a device tries to allocate an interrupt that's not on it's resource list,
assume that the child knows what it's doing and add it to the resource list.


110389 05-Feb-2003 benno

GC an unused variable.


110388 05-Feb-2003 benno

Export the ns_per_tick variable through md_var.h rather than by declaring
it extern in cpu.c.


110387 05-Feb-2003 benno

- Use cpu_setup() instead of identifycpu().
- Remove identifycpu().


110386 05-Feb-2003 benno

Add cpu.c. This contains one exported function, cpu_setup(), which handles
setup of and printing information about cpus.

Obtained from: NetBSD (parts)


110385 05-Feb-2003 benno

- Update spr.h
- Add hid.h

Obtained from: NetBSD

NOTE: This undoes some changes I'd made to prefix the processor name defines
with PVR_. This was due to my original decision to use MPC750 as a cpu name.
With this changed, the PVR_ change is no longer required.


110384 05-Feb-2003 benno

Add an inline function wrapper for the mfpvf (Move From Processor Version
Register) instruction.


110383 05-Feb-2003 benno

Not all cpus are MPC750s. Replace the MPC750 cpu option with OEA. This
stands for Operating Environment Architecture and is the specification that
all of the MPC6xx, MPC7xx, MPC7xxx and IBM7xx CPUs adhere to.


110381 05-Feb-2003 benno

Replace the inline asm in delay() with a while loop. This may not be as
efficient but it appears to actually work. Some investigation may be
required.


110380 05-Feb-2003 benno

- Rename the "powerpc" timecounter to the "decrementer" timecounter.
- Initialise it earlier.


110335 04-Feb-2003 harti

Fix a problem in bus_dmamap_load_{mbuf,uio} when the first mbuf or the first
uio segment is empty. In this case no dma segment is create by
bus_dmamap_load_buffer, but the calling routine clears the first flag.
Under certain combinations of addresses of the first and second mbuf/uio
buffer this leads to corrupted DMA segment descriptors. This was already
fixed by tmm in sparc64/sparc64/iommu.c.

PR: kern/47733
Reviewed by: sam
Approved by: jake (mentor)


110296 03-Feb-2003 jake

Split statclock into statclock and profclock, and made the method for driving
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway. This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.

Reviewed by: jhb, tmm
Tested on: i386, sparc64


110223 02-Feb-2003 benno

Add device zs to GENERIC on powerpc.


110202 01-Feb-2003 joe

Put replace spaces with tabs in keeping with the rest of the file.


110190 01-Feb-2003 julian

Reversion of commit by Davidxu plus fixes since applied.

I'm not convinced there is anything major wrong with the patch but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.


110180 01-Feb-2003 benno

- Introduce a flags value into the interrupt handler structure.
- Copy the flags passed to inthand_add into the flags value.
- If the interrupt is INTR_FAST, re-enable the irq after running the handler.


110172 01-Feb-2003 grehan

- add pmap_pagedaemon_waken variable
- remove dead code and fix warnings in pmap_zero_page/zero_page_area
- implement
pmap_clear_reference
pmap_ts_referenced
pmap_page_exists_quick
pmap_remove_all
- align pmap_qenter/qremove closer with i386 code
- fix vm_page locking in pmap_new_thread (from benno)
- add new parameter to pmap_clear_bit to return original
pte value

Approved by: benno


110167 01-Feb-2003 benno

Make nirq mean 'number of irqs' and not 'last irq'.


110080 30-Jan-2003 benno

Rework of how memory resources are discovered and dealt with in macio.
- Store the OpenFirmware "reg" property in the macio ivars.
- Use a struct to define the structure of a "reg" property entry.
- Discover all memory ranges, not just the first.
- In ata_macio, manage our own range and hand out our own allocations using
bus_space_subregion.
- Fix bus_space_subregion to handle subregions of sparse maps.


109975 28-Jan-2003 benno

Put the right fix in. Instead of deleting the declaration of __FBSDID, we
undef it before this definition.


109935 27-Jan-2003 benno

Back the previous commit out. It didn't actually fix the problem I was
seeing and the memory barrier isn't needed with the bridges we're using.

Fix the function style however.


109920 27-Jan-2003 benno

Back out some changes that snuck in with the last commit.

Pointy hat to: benno


109919 27-Jan-2003 benno

Flesh out bus_dmamap_sync.


109918 27-Jan-2003 benno

Use td->td_sticks, not td->td_kse->ke_sticks.

Forgotten by: davidxu


109917 27-Jan-2003 benno

Remove a duplicate definition of the __FBSDID macro.


109877 26-Jan-2003 davidxu

Move UPCALL related data structure out of kse, introduce a new
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.

A thread owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and takes those contexts back
to userland. Any thread without upcall structure has to export their
contexts and exit at user boundary.

Any thread running in user mode owns an upcall structure, when it enters
kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in kernel, a new UPCALL thread is created and
the upcall structure is transfered to the new UPCALL thread. if the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in kernel, no UPCALL thread will be created.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit, when all upcalls in ksegrp are removed, the group is
atomatically shutdown. An upcall owner thread also exits when process is
in exiting state. when an owner thread exits, the upcall it owns is also
removed.

KSE is a pure scheduler entity. it represents a virtual cpu. when a thread
is running, it always has a KSE associated with it. scheduler is free to
assign a KSE to thread according thread priority, if thread priority is changed,
KSE can be moved from one thread to another.

When a ksegrp is created, there is always N KSEs created in the group. the
N is the number of physical cpu in the current system. This makes it is
possible that even an userland UTS is single CPU safe, threads in kernel still
can execute on different cpu in parallel. Userland calls kse_create to add more
upcall structures into ksegrp to increase concurrent in userland itself, kernel
is not restricted by number of upcalls userland provides.

The code hasn't been tested under SMP by author due to lack of hardware.

Reviewed by: julian


109865 26-Jan-2003 jeff

- Introduce the SCHED_ULE and SCHED_4BSD options for compile time selection
of the scheduler.
- Add SCHED_4BSD as the scheduler for all kernel config files in cvs.


109677 22-Jan-2003 grehan

Remove BAT invalidation. This is done later in the boot sequence,
so isn't required here, and seems to cause problems when booting
from disk.

Approved by: benno


109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


109605 21-Jan-2003 jake

Resolve relative relocations in klds before trying to parse the module's
metadata. This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.

Submitted by: Hartmut Brandt <brandt@fokus.gmd.de>
PR: 46732
Tested on: alpha (obrien), i386, sparc64


109602 20-Jan-2003 gallatin

include cdefs.h so as to unbreak the libc build


109483 18-Jan-2003 grehan

Removed unnecessary includes and brought up to date with ata
common code by adding lock functions.


109481 18-Jan-2003 grehan

Stub profile.h, required for userland builds.

Approved by: Benno


109480 18-Jan-2003 grehan

<machine/ieee.h>, taken from sparc64

Approved by: Benno


109479 18-Jan-2003 grehan

Fix bugs with operand ordering and unnecessary sync/eieio ops. Mostly
obtained from Alpha atomic.h

Approved by: Benno


109478 18-Jan-2003 grehan

Allow the MD frame definition to be seen in. Required for truss/ptrace.

Approved by: Benno


109477 18-Jan-2003 grehan

RAIDframe requires LONG_BIT

Approved by: Benno


109476 18-Jan-2003 grehan

Prepended underscores to macro local vars, avoiding gcc "declaration
shadows global" warning

Approved by: benno


109475 18-Jan-2003 grehan

Change definition of int64 to avoid gcc3.2.1 complaints. Taken from i386

Approved by: benno


109342 16-Jan-2003 dillon

Merge all the various copies of vm_fault_quick() into a single
portable copy.


109340 15-Jan-2003 dillon

Merge all the various copies of vmapbuf() and vunmapbuf() into a single
portable copy. Note that pmap_extract() must be used instead of
pmap_kextract().

This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).


109156 13-Jan-2003 benno

Correct an off-by-one error in the calculation of the number of interrupt
resources we're managing.


109008 09-Jan-2003 benno

Make ofw_pci_find_node() use the reg property instead of the
assigned-addresses property. This works a lot better.


109001 09-Jan-2003 benno

Add a pcib variant to allow us to fix up interrupt assignments.

We probably want to do something wrt bus enumeration as well at some point.


108994 09-Jan-2003 benno

Allocate interrupts from the resource list.


108991 09-Jan-2003 benno

- Remove the ignore list and replace it with a quirk list of sorts.
- Add a quirk type for devices whose interrupt properties are actually
attached to their children.
- Flag the "escc" (zs-alike serial controller) device as having this quirk.
- Rework the interrupt discovery code to deal with devices that have more than
one interrupt.


108981 09-Jan-2003 grehan

- remove unneeded includes
- fix big in use of rid for SYS_RES_IRQ
- catch up with ATA common code by adding lock function


108979 09-Jan-2003 grehan

Remove obsolete GEOM option, and bring diskless options up-to-date

Approved by: benno


108945 08-Jan-2003 grehan

Add page queues locking to vunmapbuf().

Obtained from: sparc64
Approved by: benno


108944 08-Jan-2003 grehan

Sync the i-cache after copying down the interrupt code

Approved by: benno


108943 08-Jan-2003 grehan

Be more conservative about re-enabling interrupts during trap processing
until atomic issues are fully sorted.

Approved by: benno


108942 08-Jan-2003 grehan

Fix incorrect error returns and sign-extension.

Approved by: benno


108941 08-Jan-2003 grehan

Fetch the initial time from the rtc OpenFirmware node. This is a short-term
measure until the rtc h/w driver is written, and it's a lot better
than having "jan 1 1970" on filesys times.

Approved by: benno


108940 08-Jan-2003 grehan

Remove obsolete NFS_ROOT conditional.

Approved by: benno


108939 08-Jan-2003 grehan

Implement bus_dmamap_load_mbuf/bus_dmamap_load_uio.
Tested load_mbuf with GEM ethernet driver.

Submitted by: Hiten Pandya <hiten@unixdaemons.com> (modified by grehan)


108938 08-Jan-2003 grehan

- define HAS_STREAM_METHODS correctly
- dmamap_load_mbuf/load_uio prototypes

Submitted partly by: Hiten Pandya <hiten@unixdaemons.com>


108533 01-Jan-2003 schweikh

Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.


108175 22-Dec-2002 tjr

MB_LEN_MAX is not MD, move it to the MI limits.h.


107719 10-Dec-2002 julian

Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage
during the context switch. Rearrange thread cleanups
to avoid problems with Giant. Clean threads when freed or
when recycled.

Approved by: re (jhb)


107180 22-Nov-2002 mux

Under certain circumstances, we were calling kmem_free() from
i386 cpu_thread_exit(). This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
helding it anymore in cpu_thread_exit(). We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().

Approved by: re@
Suggested by: jhb


106977 16-Nov-2002 deischen

Add getcontext, setcontext, and swapcontext as system calls.
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.

A few style nits and comments from bde are also included.

Tested on alpha by: gallatin


106838 13-Nov-2002 alc

Move pmap_collect() out of the machine-dependent code, rename it
to reflect its new location, and add page queue and flag locking.

Notes: (1) alpha, i386, and ia64 had identical implementations
of pmap_collect() in terms of machine-independent interfaces;
(2) sparc64 doesn't require it; (3) powerpc had it as a TODO.


106697 09-Nov-2002 des

Print real / avail memory in megabytes rather than kilobytes.


106605 07-Nov-2002 tmm

Move the definitions of the hw.physmem, hw.usermem and hw.availpages
sysctls to MI code; this reduces code duplication and makes all of them
available on sparc64, and the latter two on powerpc.
The semantics by the i386 and pc98 hw.availpages is slightly changed:
previously, holes between ranges of available pages would be included,
while they are excluded now. The new behaviour should be more correct
and brings i386 in line with the other architectures.

Move physmem to vm/vm_init.c, where this variable is used in MI code.


106503 06-Nov-2002 jmallett

Remove what was a temporary bogus assignment of bits of siginfo_t, as it does
not look like the prerequisites to fill it in properly will be in the tree
for the upcoming release, but it's mostly done, so there is no need for these
to stay around to remind us.


105950 25-Oct-2002 peter

Split 4.x and 5.x signal handling so that we can keep 4.x signal
handling clean and functional as 5.x evolves. This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.

Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43. Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.

Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago).
Approved by: re


105611 21-Oct-2002 grehan

Add the USER_SR segment register to pcb state. Initialize correctly,
and save/restore during a context switch.

The USER_SR could be overwritten when the current thread was switched
out with a faulting copyin/copyout.

Approved by: Benno


105469 19-Oct-2002 marcel

Add two hooks to signal module load and module unload to MD code.
The primary reason for this is to allow MD code to process machine
specific attributes, segments or sections in the ELF file and
update machine specific state accordingly. An immediate use of this
is in the ia64 port where unwind information is updated to allow
debugging and tracing in/across modules. Note that this commit
does not add the functionality to the ia64 port. See revision 1.9
of ia64/ia64/elf_machdep.c.

Validated on: alpha, i386, ia64


105463 19-Oct-2002 rwatson

Permits UFS ACLs to be used with the GENERIC kernel. Due to recent
ACL configuration changes, this shouldn't result in different code paths
for file systems not explicitly configured for ACLs by the system
administrator. For UFS1, administrators must still recompile their
kernel to add support for extended attributes; for UFS2, it's sufficient
to enable ACLs using tunefs or at mount-time (tunefs preferred for
reliability reasons). UFS2, for a variety of reasons, including
performance and reliability, is the preferred file system for use with
ACLs.

Approved by: re


105138 15-Oct-2002 peter

The a.out md_coredump stuff isn't referenced anywhere anymore, and
hasn't been filled in for ages.. Nuked.


105054 13-Oct-2002 mike

Remove the P1003_1B kernel option; it is no longer used.


105014 13-Oct-2002 mike

Add standards visibility conditionals. Change any uses of sigset_t to
struct __sigset to avoid depending on objects from <sys/signal.h>.


104584 06-Oct-2002 mike

Add conditionals to allow va_list to be defined in other headers.


104583 06-Oct-2002 mike

o Add conditionals to allow va_list to be defined in other headers.
o Standardize on _MACHINE_STDARG_H_ to allow multiple header includes.
o Restrict the definition of va_copy() to C99 environments.


104567 06-Oct-2002 grehan

Roll back to previous version, no need for NO_GEOM when GEOM is
standard.


104519 05-Oct-2002 phk

NB: This commit does *NOT* make GEOM the default in FreeBSD
NB: But it will enable it in all kernels not having options "NO_GEOM"

Put the GEOM related options into the intended order.

Add "options NO_GEOM" to all kernel configs apart from NOTES.

In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.

There are currently three known issues which may force people to
need the NO_GEOM option:

boot0cfg/fdisk:
Tries to update the MBR while it is being used to control
slices. GEOM does not allow this as a direct operation.

SCSI floppy drives:
Appearantly the scsi-da driver return "EBUSY" if no media
is inserted. This is wrong, it should return ENXIO.

PC98:
It is unclear if GEOM correctly recognizes all variants of
PC98 disklabels. (Help Wanted! I have neither docs nor HW)

These issues are all being worked.

Sponsored by: DARPA & NAI Labs.


104505 05-Oct-2002 mike

Fix namespace issues by using visibility conditionals from
<sys/cdefs.h>.


104498 05-Oct-2002 jmallett

Define _MACHINE.


104493 04-Oct-2002 mike

style(9) <machine/setjmp.h> headers so they look mostly the same.


104435 04-Oct-2002 grehan

Clean up ddb warnings/errors and enable in GENERIC

Approved by: benno
Motivated by: gallatin


104434 04-Oct-2002 grehan

- fix zero-sized stack alloc from previous commit. a default is now
selected ala sparc64
- KSEIII routines implemented (taken from i386/sparc64)

Approved by: Benno


104354 02-Oct-2002 scottl

Some kernel threads try to do significant work, and the default KSTACK_PAGES
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create. Passing the
value 0 prevents the alternate kstack from being created. Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.

Reviewed by: jake, peter, jhb


103850 23-Sep-2002 peter

PIC_GOTOFF is OBE.


103814 23-Sep-2002 mike

Be careful not to define GCC-specific optimizations in the non-GCC
case.


103771 22-Sep-2002 benno

It's Apple GMAC, not HMAC.

Approved by: jake (for sparc64)


103646 19-Sep-2002 jhb

Implement db_print_backtrace() if DDB is compiled into the kernel. This
MD function is just a wrapper around db_stack_trace_cmd() that prints out
a backtrace of curthread. Currently, this function is only implemented
on i386 and alpha (and the alpha version isn't quite tested yet, will do
that in a bit). Other changes:

- For i386, fix a bug in the raw frame address case. The eip we extract
from the passed in frame address does not match the frame we received.
Thus, instead of printing a bogus frame with the wrong eip, go ahead
and advance frame down to the same frame as the eip we are using.
- For alpha, attempt to add a way of doing a raw trace for alpha. Instead
of passing a frame address in 'addr', pass in a pointer to a structure
containing PC and KSP and use those to start the backtrace. The alpha
db_print_backtrace() uses asm to read in the current PC and KSP values
into such a request.

Tested on: i386
Requested by: many


103629 19-Sep-2002 grehan

Updated to somewhat match sparc64/conf/GENERIC

Approved by: benno


103620 19-Sep-2002 grehan

Support files and a h/w tree description for the PSIM ppc simulator

Approved by: benno


103619 19-Sep-2002 grehan

Driver for the macio south bridge, and ATA cell contained within.

Approved by: benno


103618 19-Sep-2002 grehan

softc and register defs for the UniNorth chip

Approved by: benno


103617 19-Sep-2002 grehan

- probe the UniNorth chip in addition to the PCI bridges
- enable GEM ethernet cell if present
- allow sparse address mapping for devices

Approved by: benno


103616 19-Sep-2002 grehan

Removed osigframe. No need for COMPAT_43 signal bin-compat in PPC.

Approved by: benno


103615 19-Sep-2002 grehan

psim device support

Approved by: benno


103614 19-Sep-2002 grehan

<machine/types.> -> <sys/types.h>

Approved by: benno


103613 19-Sep-2002 grehan

Fix clearing of recoverable exception MSR bit when disabling
interrupts

Approved by: benno


103612 19-Sep-2002 grehan

Additional machdep sysctl constants needed for userland utils

Approved by: benno


103611 19-Sep-2002 grehan

Added sparse address support, required by the macio ATA device

Approved by: benno


103610 19-Sep-2002 grehan

Fixed branch labels

Approved by: benno


103609 19-Sep-2002 grehan

- bring vm_mapbuf/unmapbuf in line with other archs
- update for recent KSE changes

Approved by: benno


103608 19-Sep-2002 grehan

- make sure recoverable interrupts are re-enabled in the trap handler
- turn on ast() loop to enable signal delivery

Approved by: benno


103607 19-Sep-2002 grehan

- worked around 32-bit big-endian syscall return value problem
- syscall register spills weren't copied in correctly
- removed VM_PROT_READ from the fault type on write protect faults

Approved by: benno


103606 19-Sep-2002 grehan

Add sync before isync for G4 cpus

Obtained from: NetBSD
Approved by: benno


103605 19-Sep-2002 grehan

- use symbol for user-context offset
- fix szsigcode size declaration

Approved by: benno


103604 19-Sep-2002 grehan

- use BAT registers to map device space and physical memory
- remove test in pmap_activate that prevented vmspace sharing (v/rfork)
- always sync icache in pmap_enter until problems are sorted
- fix incorrect use of regions in pmap_kenter
- bring in pmap_release from NetBSD
- fix overwrite of bootstrap flag in pmap_pvo_enter

Approved by: benno


103603 19-Sep-2002 grehan

- psim device support
- comment out re-enabling of interrupts until problems are sorted

Approved by: benno


103602 19-Sep-2002 grehan

Clear on-demand BAT entries to properly restore OpenFirmware's
address space

Approved by: benno


103601 19-Sep-2002 grehan

psim device support

Approved by: benno


103600 19-Sep-2002 grehan

- implemented sendsig/sigreturn
- sysctl for cacheline size, required by libc/rtld
- init'd more exception vectors
- fixed problem with register overwrite in exec_setregs
- removed redundant NetBSD code

Approved by: benno


103599 19-Sep-2002 grehan

- moved intrcnt/intrnames to locore.s to fix sysctl -a panic

Approved by: benno


103598 19-Sep-2002 grehan

- rationalised includes
- added sigframe offset

Approved by: benno


103597 19-Sep-2002 grehan

- removed unnecessary includes
- converted inline asm to C for int enable
- shifted clearing of 'cold' to end of routine

Approved by: benno


103526 18-Sep-2002 mike

Implement C99's va_copy() macro.


103436 17-Sep-2002 peter

Initiate deorbit burn for the i386-only a.out related support. Moves are
under way to move the remnants of the a.out toolchain to ports. As the
comment in src/Makefile said, this stuff is deprecated and one should not
expect this to remain beyond 4.0-REL. It has already lasted WAY beyond
that.

Notable exceptions:
gcc - I have not touched the a.out generation stuff there.
ldd/ldconfig - still have some code to interface with a.out rtld.
old as/ld/etc - I have not removed these yet, pending their move to ports.
some includes - necessary for ldd/ldconfig for now.

Tested on: i386 (extensively), alpha


103367 15-Sep-2002 julian

Allocate KSEs and KSEGRPs separatly and remove them from the proc structure.
next step is to allow > 1 to be allocated per process. This would give
multi-processor threads. (when the rest of the infrastructure is
in place)

While doing this I noticed libkvm and sys/kern/kern_proc.c:fill_kinfo_proc
are diverging more than they should.. corrective action needed soon.


103109 09-Sep-2002 kuriyama

Use "options " rather than "options<tab>".


103049 07-Sep-2002 peter

Zap the implementations of the i386-aout specific cpu_coredump function.
Most of the non-i386 platforms had rather broken implementations anyway.


102874 03-Sep-2002 mike

Now that _BSD_CLK_TCK_ and _BSD_CLOCKS_PER_SEC_ are the same on all
architectures, move the definition directly into <time.h> and finish
the removal of <machine/ansi.h>.


102808 01-Sep-2002 jake

Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections to
sysentvec. Initialized all fields of all sysentvecs, which will allow
them to be used instead of constants in more places. Provided stack
fixup routines for emulations that previously used the default.


102666 31-Aug-2002 peter

Take a shot at fixing up a whole stack of style and other embarresing
unforced errors that Bruce identified. I have not yet addressed all of
his concerns.


102600 30-Aug-2002 peter

Change hw.physmem and hw.usermem to unsigned long like they used to be
in the original hardwired sysctl implementation.

The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int. Use
'long' there.

Change Maxmem and physmem and related variables to 'long', mostly for
completeness. Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody. This
comes for free on 32 bit machines, so why not?


102561 29-Aug-2002 jake

Renamed poorly named setregs to exec_setregs. Moved its prototype to
imgact.h with the other exec support functions.


102429 26-Aug-2002 mike

Since arm and powerpc aren't far enough to set stathz, take a
preemptive strike and change _BSD_CLK_TCK_ and _BSD_CLOCKS_PER_SEC_
to 128.

Approved by: benno


102399 25-Aug-2002 alc

o Retire pmap_pageable(). It's an advisory routine that none
of our platforms implements.


102315 23-Aug-2002 mike

Move several MI types from <machine/_types.h> to <sys/_types.h>.
These types are unlikely to ever become very MD. They include:
clockid_t, ct_rune_t, fflags_t, intrmask_t, mbstate_t, off_t, pid_t,
rune_t, socklen_t, timer_t, wchar_t, and wint_t.

While moving them, make a few adjustments (submitted by bde):
o __ct_rune_t needs to be precisely `int', not necessarily __int32_t,
since the arg type of the ctype functions is int.
o __rune_t, __wchar_t and __wint_t inherit this via a typedef of
__ct_rune_t.
o Some minor wording changes in the comment blocks for ct_rune_t and
mbstate_t.

Submitted by: bde (partially)


102227 21-Aug-2002 mike

o Merge <machine/ansi.h> and <machine/types.h> into a new header
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif

Concept by: bde
Reviewed by: jake, obrien


101941 15-Aug-2002 rwatson

In order to better support flexible and extensible access control,
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:

- Change fo_read() and fo_write() to accept "active_cred" instead of
"cred", and change the semantics of consumers of fo_read() and
fo_write() to pass the active credential of the thread requesting
an operation rather than the cached file cred. The cached file
cred is still available in fo_read() and fo_write() consumers
via fp->f_cred. These changes largely in sys_generic.c.

For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:

- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
pipe_read/write() now authorize MAC using active_cred rather
than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
VOP_READ/WRITE() with fp->f_cred

Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred. Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not. If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.

Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.

These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.

Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


101483 07-Aug-2002 alc

o Introduce pmap_page_is_mapped(). Its purpose is to obsolete
the PG_MAPPED flag.


101346 05-Aug-2002 alc

o Don't set PG_MAPPED or PG_WRITEABLE when a page is mapped
using pmap_kenter() or pmap_qenter().
o Use VM_ALLOC_WIRED in pmap_new_thread().


101165 01-Aug-2002 blackend

Fix the link to the Handbook


100882 29-Jul-2002 mike

Create a new header <machine/_stdint.h> for storing MD parts of
<stdint.h>. Previously, parts were defined in <machine/ansi.h> and
<machine/limits.h>. This resulted in two problems:
(1) Defining macros in <machine/ansi.h> gets in the way of that
header only defining types.
(2) Defining C99 limits in <machine/limits.h> adds pollution to
<limits.h>.


100551 23-Jul-2002 peter

de-count pci


100476 22-Jul-2002 peter

No more NO_WERROR for the kernel. It's still possible though, but
seperate from NO_WERROR which is easily mixed up with in userland.


100464 21-Jul-2002 peter

Add explicit unit count on 'device pci' for ahc/ahd


100384 20-Jul-2002 peter

Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable
handler in the kernel at the same time. Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment. This is a big help for execing i386 binaries
on ia64. The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.

Flesh out the i386 emulation support for ia64. At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.

Obtained from: dfr (mostly, many tweaks from me).


100319 18-Jul-2002 benno

Remove the statically allocated array that holds OpenFirmware memory mappings
during pmap_bootstrap. Instead, temporarily help ourselves to some memory
from phys_avail since we won't need it post-boostrap.


100189 16-Jul-2002 jhb

Various comment and minor style fixes. No actual content changes.

Inspired by: bde


99900 13-Jul-2002 mini

Add additional cred_free_thread() calls that I had missed the first time.

Pointed out by: jhb


99887 12-Jul-2002 jhb

Set the thread state of the newly chosen to run thread to TDS_RUNNING in
choosethread() in MI C code instead of doing it in in assembly in all the
various cpu_switch() functions. This fixes problems on ia64 and sparc64.

Reviewed by: julian, peter, benno
Tested on: i386, alpha, sparc64


99733 10-Jul-2002 mike

Remove label_t and physadr, which seem to have never been used in
FreeBSD.

Submitted by: bde


99731 10-Jul-2002 benno

Add setjmp (needed for DDB).


99730 10-Jul-2002 benno

Add DDB support.


99729 10-Jul-2002 benno

- Make sure we don't trample our metadata pointer in our initial bootstrap.
- Load metadata parameters.


99728 10-Jul-2002 benno

Metadata definitions.


99725 10-Jul-2002 benno

Remove some diagnostic code that snuck in.


99724 10-Jul-2002 benno

Remove some dead code.


99723 10-Jul-2002 benno

Remove some unused includes.


99667 09-Jul-2002 benno

Bring this in line with what I'm using.


99666 09-Jul-2002 benno

Add an implementation for pmap_zero_page_area.


99665 09-Jul-2002 benno

Add the OF_getetheraddr function required by if_gem.


99664 09-Jul-2002 benno

Tidy up trap vector and external interrupt setup.


99663 09-Jul-2002 benno

Driver for the Apple UniNorth Host-PCI bridge.

This is in a PowerMac-specific subdirectory as it is hoped that we will support
more than just the PowerMac platform.


99661 09-Jul-2002 benno

OpenFirmware PCI support code.

This and the sparc64 equivalent should probably be merged at some point.


99659 09-Jul-2002 benno

Changes for KSE3.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99658 09-Jul-2002 benno

Add this file, which I forgot in a previous commit.

This relates to the trap/interrupt cleanup.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99657 09-Jul-2002 benno

1) Add busdma machdep code.
2) Add bus_pio.h and bus_memio.h (which do nothing).

Submitted by: Peter Grehan <peterg@ptree32.com.au> (1)


99654 09-Jul-2002 benno

Driver for OpenPIC compatible interrupt controllers.
It's fairly PowerMac specific at the moment, but that should be fixable.


99652 09-Jul-2002 benno

- Add the "compatible" property to the list that we keep in ivars.
- Add interrupt alloc/setup/teardown/dealloc support, via whichever PIC
OpenFirmware gives us.


99651 09-Jul-2002 benno

Add interrupt handling support code.

I've tried to make this fairly platform-independant as some PowerPC platforms
may not have openpic-style interrupt controllers. This may not have the best
performance but it works for now.


99594 08-Jul-2002 mike

Move __offsetof() macro from <machine/ansi.h> to <sys/cdefs.h>. It's
hardly MD, since all our platforms share the same macro. It's not
really compiler dependent either, but this helps in reducing
<machine/ansi.h> to only type definitions.


99571 08-Jul-2002 peter

Add a special page zero entry point intended to be called via the single
threaded VM pagezero kthread outside of Giant. For some platforms, this
is really easy since it can just use the direct mapped region. For others,
IPI sending is involved or there are other issues, so grab Giant when
needed.

We still have preemption issues to deal with, but Alan Cox has an
interesting suggestion on how to minimize the problem on x86.

Use Luigi's hack for preserving the (lack of) priority.

Turn the idle zeroing back on since it can now actually do something useful
outside of Giant in many cases.


99559 07-Jul-2002 peter

Collect all the (now equivalent) pmap_new_proc/pmap_dispose_proc/
pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code
and use a single equivalent MI version. There are other cleanups
needed still.

While here, use the UMA zone hooks to keep a cache of preinitialized
proc structures handy, just like the thread system does. This eliminates
one dependency on 'struct proc' being persistent even after being freed.
There are some comments about things that can be factored out into
ctor/dtor functions if it is worth it. For now they are mostly just
doing statistics to get a feel of how it is working.


99557 07-Jul-2002 peter

Update for post-kse3 pmap kthread allocation changes


99117 30-Jun-2002 mike

Since printf(3) now supports the `j' conversion specifier, use that
when printing intmax_t and uintmax_t.

Forgotten by: mike
Noticed by: bde


99043 29-Jun-2002 benno

Add an inline to call eieio.

("Enforce In-order Execution of I/O". I am not making this up.)


99042 29-Jun-2002 benno

We don't need to clear RI in the MSR when entering a critical section.


99040 29-Jun-2002 benno

in_cksum et al.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99039 29-Jun-2002 benno

Implement vtophys()


99038 29-Jun-2002 benno

Add pmap_mapdev and pmap_unmapdev.


99037 29-Jun-2002 benno

- Initialise battable to cover I/O spaces.
- Statically size the bpvo entries to avoid conflicts between bpvo allocation
and the vm allocator.
- Shift pmap_init2 code into pmap_init.
- Add UMA_ZONE_VM flag to uma_zcreate.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99036 29-Jun-2002 benno

To quote Peter:

The case in cpu_switch() where there isn't a higher priority thread
(choosethread() == curthread) uses r4 as the PCB context pointer. However, the
use of r4 after the label L2 is incorrect, since it was probably trashed by
the call to choosethread, and in any case was set up to curthread at the start
of the routine.

This condition will occur when an interrupt thread schedules a netisr, which
is a lower priority thread.

Another (probably unnecessary) difference is that I was paranoid about
register trashing, so I decided to save r2 and r13 as well.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99035 29-Jun-2002 benno

mempcy/bcopy handles overlapping copies so make ovbcopy call it.


99034 29-Jun-2002 benno

Add BOOTP_NFSROOT support code.


99033 29-Jun-2002 benno

- Use tmpstk exclusively in the init path.
- Remove redundant code.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99032 29-Jun-2002 benno

Many fixes to low-level trap and interrupt handling:

- Tidy up clock code. Don't repeatedly call hardclock().
- Remove intrnames, decrnest and intrcnt from locore.s
- Coalesce all trap handling into a single stub that then calls a dispatch
function.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


99030 29-Jun-2002 benno

Convert this from mostly inline assembler to mostly C.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


98765 24-Jun-2002 jake

Add an MD callout like cpu_exit, but which is called after sched_lock is
obtained, when all other scheduling activity is suspended. This is needed
on sparc64 to deactivate the vmspace of the exiting process on all cpus.
Otherwise if another unrelated process gets the exact same vmspace structure
allocated to it (same address), its address space will not be activated
properly. This seems to fix some spontaneous signal 11 problems with smp
on sparc64.


98727 24-Jun-2002 mini

Remove unused diagnostic function cread_free_thread().

Approved by: alfred


98710 23-Jun-2002 iedowse

Make vm_pindex_t 64-bit on all platforms. This is necessary to avoid
overflows with the large file sizes that UFS2 permits.

Reviewed by: dillon, alc, tegge


98480 20-Jun-2002 peter

Deorbit suibyte(). It was only used for split address space systems
for supporting UIO_USERISPACE (ie: it wasn't used).


98469 20-Jun-2002 peter

Move the "- 1" into the RQB_FFS(mask) macro itself so that
implementations can provide a base zero ffs function if they wish.
This changes
#define RQB_FFS(mask) (ffs64(mask))
foo = RQB_FFS(mask) - 1;
to
#define RQB_FFS(mask) (ffs64(mask) - 1)
foo = RQB_FFS(mask);
On some platforms we can get the "- 1" for free, eg: those that use the
C code for ffs64().

Reviewed by: jake (in principle)


98001 07-Jun-2002 jhb

- Fixup / remove obsolete comments.
- ktrace no longer requires Giant so do ktrace syscall events before and
after acquiring and releasing Giant, respectively.
- For i386, ia32 syscalls on ia64, powerpc, and sparc64, get rid of the
goto bad hack and instead use the model on ia64 and alpha were we
skip the actual syscall invocation if error != 0. This fixes a bug
where if we the copyin() of the arguments failed for a syscall that
was not marked MP safe, we would try to release Giant when we had
not acquired it.


97564 30-May-2002 dfr

Move the definition of ElfN_Hashelt to common headers. The only platform
which has a different definition for this is alpha.


97399 28-May-2002 benno

The stack is not at the top of the user struct.


97398 28-May-2002 benno

Remove an assertion as to whether the current thread already had the FPU or
not. It may be desirable to put something similar back, but it's getting in
the way in it's current form.


97397 28-May-2002 benno

- Move macros that represent where syscall args are kept in a trapframe from
trap.c to frame.h
- Use the macros in vm_machdep.c:cpu_fork() to set up the trap frame of the
new thread.


97395 28-May-2002 benno

Remove the old prototype for kcopy. It's in cpu.h now.


97385 28-May-2002 benno

Implement pmap_copy and pmap_copy_page.


97384 28-May-2002 benno

Move the kcopy() function from trap.c to machdep.c. Add a prototype.


97347 27-May-2002 benno

Print srr1 in printtrap()

Submitted by: Peter Grehan <peterg@ptree32.com.au>


97346 27-May-2002 benno

Get the correct memory regions from OpenFirmware. We were getting the
"available" ranges, not the "physical" ranges. Clean up some of the
bootstrap code in the process.

Submitted by: Peter Grehan <peterg@ptree32.com.au>


97342 27-May-2002 benno

Use correct types in [sf]uword32.


97307 26-May-2002 dfr

Add declarations of suword32 and suword64. Add implementations of one or
the other (or both) to all the platforms. Similar for fuword32 and
fuword64.


97261 25-May-2002 jake

Make the run queue parameters machine dependent. Optimize 64 bit
architectures by using a 64 bit word for the bit array which keeps
track of non-empty queues.

Reviewed by: peter


96938 19-May-2002 benno

Make this more FreeBSD-ish.

Requested by: jhb


96906 19-May-2002 benno

- Do a quick style pass.
- Correct the implementation of fix_unaligned to use a thread, not a proc.
- GC some #if 0'd stuff.


96905 19-May-2002 benno

Add the PSL_VEC flag for AltiVec (no, it's not here yet =))


96773 17-May-2002 benno

- Rename the _C_LABEL macro to CNAME.
- Rename the _ASM_LABEL macro to ASMNAME.
- Add the HIDENAME macro which is used in libc's syscall stuff.


96771 17-May-2002 benno

Fix commenting around NetBSD version string.


96692 15-May-2002 obrien

An exact copy of i386/include/float.h will work here.


96606 14-May-2002 phk

Move MI stuff out of MD param.h files.

It can all still be overridden in the MD files should need suddenly arise.


96604 14-May-2002 phk

Remove the unused definitions of ctod() and dotc().


96499 13-May-2002 benno

FPU support.

Obtained from: NetBSD (portions)


96452 12-May-2002 benno

More locking fixes.


96443 12-May-2002 benno

Do the correct locking on processes for DSI and ISI traps.

Copied from: sparc64


96353 10-May-2002 benno

Implement the following functions:
- pmap_addr_hint
- pmap_change_wiring
- pmap_extract
- pmap_is_modified


96352 10-May-2002 benno

Install the system call trap handler.


96334 10-May-2002 benno

Improve our detection of an attempted duplicate entry. We may be trying to
change the page protection bits.


96333 10-May-2002 benno

Remove a debugging printf that escaped.


96329 10-May-2002 benno

Increase the size of the kstack.


96318 10-May-2002 obrien

Gcc 3.1 varargs support.


96255 09-May-2002 benno

Update to newer trap code from NetBSD.

Obtained from: NetBSD


96253 09-May-2002 benno

Add an assertion that we have a current pmap set before we try and return.


96252 09-May-2002 benno

The per-cpu curpmap is now set by pmap_activate. We don't need to do it here
anymore.


96251 09-May-2002 benno

- Add a prototype for the setfault() function.
- Remove some stray printf()s.


96250 09-May-2002 benno

1. Better track the executable status of mappings.
2. Set a pcpu variable to the real address of the active pmap (used when
exiting from traps.

Obtained from: NetBSD (1)


96249 09-May-2002 benno

Rename the constants for the contents of the PVR register so as not to
conflict with cpu names used in config files..


95814 30-Apr-2002 phk

Don't export timecounter structures under debug. with sysctl, they
contain no truly interesting data anymore.


95719 29-Apr-2002 benno

Commit of stuff that's been sitting in my tree for a while.

Highlights include:
- New low-level trap code from NetBSD. The high level code still needs a lot
of work.
- Fixes for some pmap handling in thread switching.
- The kernel will now get to attempting to jump into init in user mode. There
are some pmap/trap issues which prevent it from actually getting there though.

Obtained from: NetBSD (parts)


95714 29-Apr-2002 benno

- Add back calls to setfault that were removed when these functions were moved.


95710 29-Apr-2002 peter

Tidy up some loose ends.
i386/ia64/alpha - catch up to sparc64/ppc:
- replace pmap_kernel() with refs to kernel_pmap
- change kernel_pmap pointer to (&kernel_pmap_store)
(this is a speedup since ld can set these at compile/link time)
all platforms (as suggested by jake):
- gc unused pmap_reference
- gc unused pmap_destroy
- gc unused struct pmap.pm_count
(we never used pm_count - we track address space sharing at the vmspace)


95564 27-Apr-2002 alc

MFi386 1.222: Remove vm_map_growstack() and acquisition and release of Giant
around vm_fault() in trap_pfault().


95410 25-Apr-2002 marcel

Don't use the symbol name to lookup the symbol value when we can use
the symbol index defined by the relocation. The elf_lookup() support
function is to be used by elf_reloc() when symbol lookups need to be
done. The elf_lookup() function operates on the symbol index and
will do a symbol name based lookup when such is required, otherwise
it uses the symbol index directly. This solves the problem seen on
ia64 where the symbol hash table does not contain local symbols and
a symbol name based lookup would fail for those symbols.

Don't pass the symbol name to elf_reloc(), as it isn't used any more.


95121 20-Apr-2002 benno

Replace inline asm with it's inline function wrapper.


94839 16-Apr-2002 benno

Correct a comment.


94838 16-Apr-2002 benno

Implement the following functions:
- pmap_kextract
- pmap_object_init_pt
- pmap_protect
- pmap_remove_pages

I'm pretty sure pmap_remove_pages is at least somewhat bogus.


94837 16-Apr-2002 benno

Remove some dead code.


94836 16-Apr-2002 benno

Use mtsrin() instead of inline asm.


94835 16-Apr-2002 benno

Change the value of PMAP_BOOTSTRAP so we don't stomp on the PTE index value.


94834 16-Apr-2002 benno

Add inlines for mtsrin and mfsrin.


94777 15-Apr-2002 peter

Pass vm_page_t instead of physical addresses to pmap_zero_page[_area]()
and pmap_copy_page(). This gets rid of a couple more physical addresses
in upper layers, with the eventual aim of supporting PAE and dealing with
the physical addressing mostly within pmap. (We will need either 64 bit
physical addresses or page indexes, possibly both depending on the
circumstances. Leaving this to pmap itself gives more flexibilitly.)

Reviewed by: jake
Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)


94756 15-Apr-2002 benno

Add ofwd to the GENERIC config for powerpc.


94755 15-Apr-2002 benno

Add a nexus device.

Copied from: sparc64


94753 15-Apr-2002 benno

Turn some CTR's into CTR0's.


94751 15-Apr-2002 benno

GC an extraneous prototype of delay().


94512 12-Apr-2002 mike

Include <sys/cdefs.h> for definition of __BSD_VISIBLE.

Pointy hat to: mike


94363 10-Apr-2002 mike

Remove the hack for segsz_t from <sys/types.h>; use the normal
_BSD_FOO_T_ method for defining segsz_t.


94362 10-Apr-2002 mike

Add manifest constants: _LITTLE_ENDIAN, _BIG_ENDIAN, _PDP_ENDIAN, and
_BYTE_ORDER. These are far more useful than their non-underscored
equivalents as these can be used in restricted namespace environments.
Mark the non-underscored variants as deprecated.


94275 09-Apr-2002 phk

GC various bits and pieces of USERCONFIG from all over the place.


94151 07-Apr-2002 phk

GC the "dumplo" variable, which is no longer used.

A lot of sys/*/*/machdep.c seems not to be.


93702 02-Apr-2002 jhb

- Move the MI mutexes sched_lock and Giant from being declared in the
various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization
needed for mutexes such as setting up thread0's contested lock list
and initializing MI mutexes. Change the various MD startup routines
to call this function instead of duplicating all the code themselves.

Tested on: alpha, i386


93607 01-Apr-2002 dillon

Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly comitted syntactical cleanups made to files
that were still under active development. Backout improperly comitted program
structure changes that moved localized declarations to the top of two
procedures. Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather then in three separate
sub-blocks to make the code more readable. Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks. They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by: jake


93467 31-Mar-2002 phk

Centralize the "bootdev" and "dumpdev" variables. They are still pretty
bogus all things considered, but at least now they don't camouflage as
being MD variables.


93452 30-Mar-2002 alc

Use the MI vm_map_growstack() instead of the MD grow_stack() in trap(). Remove
the MD grow_stack().


93273 27-Mar-2002 jeff

Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks
with this flag. Remove the dup_list and dup_ok code from subr_witness. Now
we just check for the flag instead of doing string compares.

Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface. The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.

Approved by: jhb


93264 27-Mar-2002 dillon

Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it
from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections. This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways. This should
be temporary.

Reviewed by: core
Approved by: core


93092 24-Mar-2002 obrien

Guard against redefining __gnuc_va_list.


92998 23-Mar-2002 obrien

ASM versions of __FBSDID.


92916 21-Mar-2002 benno

Collect all functions for copying to and from userspace into the one file.

This allows me to reimplement [sf]u{byte,word} as separate functions and not
as calls to copy{in,out}.


92880 21-Mar-2002 benno

- Make all inlines for manipulating supervisor-level registers accept/return
register_t values.
- Implement an inline for isync.


92875 21-Mar-2002 benno

GC some unused, bogus interrupt functions and replace them with proper
implementations of intr_disable and intr_restore.


92847 21-Mar-2002 jeff

Remove references to vm_zone.h and switch over to the new uma API.


92842 20-Mar-2002 alfred

Remove __P.

Reveiwed by: benno


92824 20-Mar-2002 jhb

Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined.
Instead of caching the ucred reference, just go ahead and eat the
decerement and increment of the refcount. Now that Giant is pushed down
into crfree(), we no longer have to get Giant in the common case. In the
case when we are actually free'ing the ucred, we would normally free it on
the next kernel entry, so the cost there is not new, just in a different
place. This also removse td_cache_ucred from struct thread. This is
still only done #ifdef DIAGNOSTIC.

Tested on: i386, alpha


92757 20-Mar-2002 benno

Increment pmap_pvo_count in the right place.


92654 19-Mar-2002 jeff

This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by: arch@


92521 17-Mar-2002 benno

Changes and fixes in preparation for UMA:

- Bootstrap pvo entries are now allocated by stealing pages.
- Just return if we're pmap_enter'ing a mapping that's already there. Don't
remove it and re-enter it.


92520 17-Mar-2002 benno

Lowercase all of the trap names.


92519 17-Mar-2002 benno

Clean up and fix up copyin and copyout.


92383 16-Mar-2002 des

Move the definition of PT_[GS]ET{,DB,FP}REGS from the MD ptrace.h to the
MI ptrace.h, since all platforms define them. Keep the MD ptrace.h around
for FIX_SSTEP (which is currently only needed on Alpha).


92067 11-Mar-2002 benno

Correct a typo. (* that should've been &)


91959 09-Mar-2002 mike

o Don't require long long support in bswap64() functions.
o In i386's <machine/endian.h>, macros have some advantages over
inlines, so change some inlines to macros.
o In i386's <machine/endian.h>, ungarbage collect word_swap_int()
(previously __uint16_swap_uint32), it has some uses on i386's with
PDP endianness.

Submitted by: bde

o Move a comment up in <machine/endian.h> that was accidentially moved
down a few revisions ago.
o Reenable userland's use of optimized inline-asm versions of
byteorder(3) functions.
o Fix ordering of prototypes vs. redefinition of byteorder(3)
functions, so that the non-GCC (libc asm) case has proper
prototypes.
o Add proper prototypes for byteorder(3) functions in <sys/param.h>.
o Prevent redundant duplicate prototypes by making use of the
_BYTEORDER_PROTOTYPED define.
o Move the bswap16(), bswap32(), bswap64() C functions into MD space
for platforms in which asm versions don't exist. This significantly
reduces the complexity of some things at the cost of duplicate code.

Reviewed by: bde


91803 07-Mar-2002 benno

Install the DSI and ISI trap handlers and their appropriate locations.


91802 07-Mar-2002 benno

Copy the "implementation" of pmap_prefault from sparc64.


91794 07-Mar-2002 benno

Move tunable initialisation so it can get access to physmem.


91793 07-Mar-2002 benno

Calculate physmem.


91504 28-Feb-2002 arr

- Move a comment from being on the same line as a #ifdef to the line
following it. This should have gone in the previous commit, but
misviewed Bruce's patch.

Requested by: bde


91486 28-Feb-2002 benno

cpu_switch now works, for kthreads at least.


91485 28-Feb-2002 benno

Various cleanups.


91484 28-Feb-2002 benno

- Prevent the decrementer interrupt handler from nesting.
- Catch some more cases of PSL_EE and PSL_RI getting out of sync.


91483 28-Feb-2002 benno

- Modify pmap_activate so it only marks the pmap as active.
- Add a pmap_deactivate function.


91477 28-Feb-2002 benno

GC an unused variable in cpu_fork().


91475 28-Feb-2002 arr

- Fix panic() message and a couple style nits that snuck in from the
recent diagnostics commit (rev. 1.84).


91467 28-Feb-2002 benno

Make fork work, at least for kthreads. Switching still has some issues.


91465 28-Feb-2002 benno

- Rearrange the sequence of events in powerpc_init() somewhat.
- Catch another instance of PSL_EE being cleared without PSL_RI.


91461 28-Feb-2002 benno

- When enabling/disabling interrupts, set/clear both PSL_EE and PSL_RI, not
just PSL_EE.
- Make cpu_critical_enter/exit independant of save_intr/restore_intr.


91459 28-Feb-2002 benno

Add a missing (.


91456 28-Feb-2002 benno

Implement the following functions:
- pmap_remove
- pmap_kremove
- pmap_qremove


91455 28-Feb-2002 benno

Remove most of the usage of critical_enter/exit.

I put these in to match the use of spl*() in the NetBSD code I was basing this
on, but it appears to cause problems.

I'm doing this in a separate commit so as to be able to refer back if locking
becomes an issue at a later stage.


91403 27-Feb-2002 silby

Fix a horribly suboptimal algorithm in the vm_daemon.

In order to determine what to page out, the vm_daemon checks
reference bits on all pages belonging to all processes. Unfortunately,
the algorithm used reacted badly with shared pages; each shared page
would be checked once per process sharing it; this caused an O(N^2)
growth of tlb invalidations. The algorithm has been changed so that
each page will be checked only 16 times.

Prior to this change, a fork/sleepbomb of 1300 processes could cause
the vm_daemon to take over 60 seconds to complete, effectively
freezing the system for that time period. With this change
in place, the vm_daemon completes in less than a second. Any system
with hundreds of processes sharing pages should benefit from this change.

Note that the vm_daemon is only run when the system is under extreme
memory pressure. It is likely that many people with loaded systems saw
no symptoms of this problem until they reached the point where swapping
began.

Special thanks go to dillon, peter, and Chuck Cranor, who helped me
get up to speed with vm internals.

PR: 33542, 20393
Reviewed by: dillon
MFC after: 1 week


91394 27-Feb-2002 tmm

Add the following functions/macros to support byte order conversions and
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):

- bwap16() and bswap32(). These have optimized implementations on some
architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
versa, using a naming scheme like le16toh(), htole16().
These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
conversion (while the normal access functions would if the bus endianess
differs from the CPU endianess).

htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.

Make use of the new functions in a few places where local implementations
of the same functionality existed.

Reviewed by: mike, bde
Tested on alpha by: mike


91293 26-Feb-2002 benno

Add makeoptions NO_WERROR=true so that we can build. =)


91207 24-Feb-2002 benno

Make atomic_cmpset_32 correctly return 0 on failure.


91131 23-Feb-2002 benno

Don't call critical_enter()/critical_exit() around calls to pmap_pvo_enter()
as it does it's own handling of critical sections.


91090 22-Feb-2002 julian

Add some DIAGNOSTIC code.
While in userland, keep the thread's ucred reference in a shadow
field so that the usual place to store it is NULL.
If DIAGNOSTIC is not set, the thread ucred is kept valid until the next
kernel entry, at which time it is checked against the process cred
and possibly corrected. Produces a BIG speedup in
kernels with INVARIANTS set. (A previous commit corrected it
for the non INVARIANTS case already)

Reviewed by: dillon@freebsd.org


90895 19-Feb-2002 julian

Add change to teh PPC to keep it in step with i386 and MI code

Pointy hat this direction please...


90868 18-Feb-2002 mike

o Move NTOHL() and associated macros into <sys/param.h>. These are
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.

Tested on: alpha, i386
Reviewed by: bde, jake, tmm


90833 18-Feb-2002 obrien

style(9)


90711 15-Feb-2002 wollman

Resurrect one of the easiest changes from my big include files roll-up
patch from a year ago: give file flags their own type. This does not
(yet) change the type used by system calls or library functions.
The underlying type was chosen to match what is returned by stat().


90643 14-Feb-2002 benno

Complete rework of the PowerPC pmap and a number of other bits in the early
boot sequence.

The new pmap.c is based on NetBSD's newer pmap.c (for the mpc6xx processors)
which is 70% faster than the older code that the original pmap.c was based
on. It has also been based on the framework established by jake's initial
sparc64 pmap.c.

There is no change to how far the kernel gets (it makes it to the mountroot
prompt in psim) but the new pmap code is a lot cleaner.

Obtained from: NetBSD (pmap code)


90361 07-Feb-2002 julian

Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,


90344 07-Feb-2002 phk

GC the PC_SWITCH* symbols which are not used in assembly anymore.


90065 01-Feb-2002 bde

Compile osigreturn() unconditionally since it will always be needed on
some arches and the syscall table is machine-independent. It was
(bogusly) conditional on COMPAT_43, so this usually makes no difference.

ia64: in addition:
- replace the bogus cloned comment before osigreturn() by a correct one.
osigreturn() is just a stub fo ia64's.
- fix the formatting of cloned comment before sigreturn().
- fix the return code. use nosys() instead of returning ENOSYS to get
the same semantics as if the syscall is not in the syscall table.
Generating SIGSYS is actually correct here.
- fix style bugs.

powerpc: copy the cleaned up ia64 stub. This mainly fixes a bogus comment.

sparc64: copy the cleaned up the ia64 stub, since there was no stub before.


89919 28-Jan-2002 gallatin

Simple fixes to get the powerpc kernel compiling again.

Reviewed by: mp


88088 18-Dec-2001 jhb

Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().

Tested on: i386, alpha


87702 11-Dec-2001 jhb

Overhaul the per-CPU support a bit:

- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.

Tested on: alpha, i386
Reviewed by: peter, jake


87599 10-Dec-2001 obrien

Update to C99, s/__FUNCTION__/__func__/,
also don't use ANSI string concatenation.


87572 09-Dec-2001 obrien

style(9)


87546 09-Dec-2001 dillon

Allow maxusers to be specified as 0 in the kernel config, which will
cause the system to auto-size to between 32 and 512 depending on the
amount of memory.

MFC after: 1 week


87454 06-Dec-2001 jhb

Add multiple inclusion protection.


87158 01-Dec-2001 mike

o Stop abusing MD headers with non-MD types.
o Hide nonstandard functions and types in <netinet/in.h> when
_POSIX_SOURCE is defined.
o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>.
o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new
__FBSDID() macro.
o Fix some miscellaneous issues in <arpa/inet.h>.
o Correct final argument for the inet_ntop() function (POSIX.1-200x).
o Get rid of the namespace pollution from <sys/types.h> in
<arpa/inet.h>.

Reviewed by: fenner
Partially submitted by: bde


86336 14-Nov-2001 jhb

The interrupt nesting level is per-thread not per-CPU on FreeBSD.


86311 13-Nov-2001 mp

Don't enable FP in the kernel. It is not needed when -msoft-float is used.

Reminded by: benno


86067 05-Nov-2001 mp

Clean up the trap handling code and make it consistent with the other platforms.

Submitted by: jhb


86066 05-Nov-2001 mp

Add enable_fpu/save_fpu for handling the floating point registers in the PCB.

Obtained from: NetBSD


85892 02-Nov-2001 mike

o Add new header <sys/stdint.h>.
o Make <stdint.h> a symbolic link to <sys/stdint.h>.
o Move most of <sys/inttypes.h> into <sys/stdint.h>, as per C99.
o Remove <sys/inttypes.h>.
o Adjust includes in sys/types.h and boot/efi/include/ia64/efibind.h
to reflect new location of integer types in <sys/stdint.h>.
o Remove previously symbolicly linked <inttypes.h>, instead create a
new file.
o Add MD headers <machine/_inttypes.h> from NetBSD.
o Include <sys/stdint.h> in <inttypes.h>, as required by C99; and
include <machine/_inttypes.h> in <inttypes.h>, to fill in the
remaining requirements for <inttypes.h>.
o Add additional integer types in <machine/ansi.h> and
<machine/limits.h> which are included via <sys/stdint.h>.

Partially obtain from: NetBSD
Tested on: alpha, i386
Discussed on: freebsd-standards@bostonradio.org
Reviewed by: bde, fenner, obrien, wollman


85335 23-Oct-2001 mike

Remove funky right justification.

Pointed out by: bde


85297 21-Oct-2001 des

Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to
proc_* in the process; procfs_machdep.c is no longer needed.

Run-tested on i386, build-tested on Alpha, untested on other platforms.


85294 21-Oct-2001 des

[partially forced commit due to pilot error in earlier commit attempt]

{set,fill}_{,fp,db}regs() fixup:

- Add dummy {set,fill}_dbregs() on architectures that don't have them.

- KSEfy the powerpc versions (struct proc -> struct thread).

- Some architectures had the prototypes in md_var.h, some in reg.h, and
some in both; for consistency, move them to reg.h on all platforms.

These functions aren't really MD (the implementation is MD, but the interface
is MI), so they should move to an MI header, but I haven't figured out which
one yet.

Run-tested on i386, build-tested on Alpha, untested on other platforms.


85201 19-Oct-2001 mp

Fix includes based on recent changes to lock.h, mutex.h and ktr.h.


85187 19-Oct-2001 obrien

Try two on the preprocessing logic.

Reviewed by: ru


85169 19-Oct-2001 mp

Cleanup of the stdarg code.

Submitted by: ru


85113 18-Oct-2001 mp

Add support for the gcc-2.95 stdarg implementation.


85108 18-Oct-2001 obrien

My attempts at minimizing the number of #def's got me in trouble.


85085 18-Oct-2001 obrien

Add support for "__gnuc_va_list". Some overly "smart" libraries assume
the existence of the __gnuc_va_list type[*] because our compiler is GCC.

[*] __gnuc_va_list is defined in the GCC ginclude/stdarg.h replacement
headerwhich we don't use.


84977 15-Oct-2001 benno

Flesh out cpu_fork() and cpu_set_fork_handler(). This is a work in progress.


84976 15-Oct-2001 benno

- Correct the type of the argument to delay() so as to not conflict with
sys/boot/common/bootstrap.h.
- Add a prototype for fork_trampoline().


84946 15-Oct-2001 mp

Fix typo.


84945 15-Oct-2001 mp

Save WIP. Partial rewrite of cpu_switch() and savectx(). This makes it closer
to working but still needs some work to properly switch the full context
(such as saving the fpu registers, switch stacks, etc.). Also, remove some
dead code that was mixed in.


84921 14-Oct-2001 benno

Implement pmap_mapdev.


84857 12-Oct-2001 mp

Add memory disk support to allow the boot process to proceed a bit further.


84856 12-Oct-2001 mp

Modify a virtual address check to allow use of the openfirmware callback
used by the PowerPC simulator (PSIM).


84855 12-Oct-2001 mp

Add standard calls to device_add_child() and root_bus_configure().


84783 10-Oct-2001 ps

Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by: peter
MFC after: 2 weeks


84643 08-Oct-2001 mp

Add a call to init_param() to initialize some necessary variables.


84637 07-Oct-2001 des

Dissociate ptrace from procfs.

Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested. Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.

This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().

It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".


84381 02-Oct-2001 mjacob

Fix problem where a user buffer outside of the area being tested
will be corrupted.

PR: 29194
Obtained from: Tor.Egge@fast.no
MFC after: 2 weeks


83870 24-Sep-2001 mp

Catch up to recent removal of curpcb from globals.h.


83730 20-Sep-2001 mp

Add missing include file.


83729 20-Sep-2001 mp

Don't include NFS headers.

Reported by: dfr


83691 20-Sep-2001 peter

Replicate a change from alpha/genassym.c to other arches. This should
fix nfs-related build breakage.


83683 20-Sep-2001 mp

Use BATL/BATU macros instead of hardcoded hex constants.


83682 20-Sep-2001 mp

Update PowerPC MD code to compile and do initial bootstrap based on
recent changes (KSE and VM requiring physmem to be setup).

Reviewed by: benno, jhb, julian


83651 18-Sep-2001 peter

Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.


83645 18-Sep-2001 jhb

GC obsolete cruft from this file.


83644 18-Sep-2001 jhb

Whitespace fixes.


83643 18-Sep-2001 jhb

- If we ever do the per-cpu KTR stuff, the index won't be volatile as it
will be private to each CPU.
- Re-style(9) the globaldata structures. There really needs to be a MI
struct pcpu that has a MD struct mdpcpu member at some point.


83642 18-Sep-2001 jhb

- Fix a missed idleproc -> idlethread conversion.
- Remove redundany fpucurproc (fpucurthread already existed)


83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


83276 10-Sep-2001 peter

Rip some well duplicated code out of cpu_wait() and cpu_exit() and move
it to the MI area. KSE touched cpu_wait() which had the same change
replicated five ways for each platform. Now it can just do it once.
The only MD parts seemed to be dealing with fpu state cleanup and things
like vm86 cleanup on x86. The rest was identical.

XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional
stub in place.

Reviewed by: jake, tmm, dillon


83223 08-Sep-2001 peter

Missing part of dillon's coredump commit. cpu_coredump() was still
passing IO_NODELOCKED to vn_rdwr(), this would cause operations on the
unlocked core vnode and softupdates nastiness if an a.out binary cored.


83047 05-Sep-2001 obrien

style(9) the structure definitions.


82939 04-Sep-2001 peter

Zap #if 0'ed map init code that got moved to the MI area.
Convert the powerpc tree to use the common code.


82938 04-Sep-2001 peter

Nuke #if 0'ed "setredzone()" stub. We never used it, and probably
never will. I've implemented an optional redzone as part of the KSE
upage breakup.


82708 01-Sep-2001 jhb

Axe stale mp_fixme().


82634 31-Aug-2001 peter

Similar to changes on i386/alpha/etc pmap.c; converge on a similar
look/feel on pmap_new_proc() with some cosmetic style changes.


82530 30-Aug-2001 mike

o Remove some GCCisms in src/powerpc/include/endian.h.
o Unify <machine/endian.h>'s across all architectures.
o Make bswapXX() functions use a different spelling of u_int16_t and
friends to reduce namespace pollution. The bswapXX() functions
don't actually exist, but we'll probably import these at some
point. Atleast one driver (if_de) depends on bswapXX() for big
endian cases.
o Deprecate byteorder(3) prototypes from <sys/types.h>, these are
now prototyped indirectly in <arpa/inet.h>.
o Deprecate in_addr_t and in_port_t typedefs in <sys/types.h>, these
are now typedef'd in <arpa/inet.h>.
o Change byteorder(3) prototypes to use standards compliant uint32_t
(spelled __uint32_t to reduce namespace pollution).
o Document new preferred headers and standards compliance.

Discussed with: bde
PR: 29946
Reviewed by: bmilekic


82313 25-Aug-2001 peter

vm_page_zero_idle() is no longer MD.


82025 21-Aug-2001 peter

Make COMPAT_43 optional again. XXX we need COMPAT_FBSD3 etc for this
stuff.


81766 16-Aug-2001 obrien

Minor style(9)'ing


81763 16-Aug-2001 obrien

style(9) and make consistent across platforms


81727 15-Aug-2001 ache

OFF_T -> OFF (more standard style)


81726 15-Aug-2001 jhb

The 'astpending' variable is already declared in trap.c (and unused in
FreeBSD besides).


81724 15-Aug-2001 jhb

FreeBSD doesn't use a want_resched variable. Instead, the PS_NEEDRESCHED
p_sflag is managed in a MI fashion.


81720 15-Aug-2001 ache

Add OFF_T_MAX/OFF_T_MIN


81675 15-Aug-2001 obrien

Style changes to commonize the various platforms.


81493 10-Aug-2001 jhb

- Close races with signals and other AST's being triggered while we are in
the process of exiting the kernel. The ast() function now loops as long
as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with
preemption disabled so that any further AST's that arrive via an
interrupt will be delayed until the low-level MD code returns to user
mode.
- Use u_int's to store the tick counts for profiling purposes so that we
do not need sched_lock just to read p_sticks. This also closes a
problem where the call to addupc_task() could screw up the arithmetic
due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
clear_resched(), and resched_wanted() in favor of direct bit operations
on p_sflag.
- Fix up locking with sched_lock some. In addupc_intr(), use sched_lock
to ensure pr_addr and pr_ticks are updated atomically with setting
PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing
PS_OWEUPC. We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.

Reviewed by: bde (mostly)


81265 08-Aug-2001 peter

Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.

gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get. It carries its own #defines so it doesn't break
compiles.


81139 04-Aug-2001 jhb

Axe unused and invalid astpending globaldata member.


81138 04-Aug-2001 jhb

Axe unused and invalid GD_ASTPENDING symbol.


80700 31-Jul-2001 jake

Use a machine dependent type, Elf_Hashelt, for the elements of the elf
dynamic symbol table buckets and chains. The sparc64 toolchain uses 32
bit .hash entries, unlike other 64 bits architectures (alpha), which use
64 bit entries.

Discussed with: dfr, jdp


80431 27-Jul-2001 peter

Make PMAP_SHPGPERPROC tunable. One shouldn't need to recompile a kernel
for this, since it is easy to run into with large systems with lots of
shared mmap space.

Obtained from: yahoo


80399 26-Jul-2001 bmilekic

- Do not handle the per-CPU containers in mbuf code as though the cpuids
were indices in a dense array. The cpuids are a sparse set and treat
them as such, setting up containers only for CPUs activated during
mb_init().

- Fix netstat(1) and systat(1) to treat the per-CPU stats area as a sparse
map, in accordance with the above.

This allows us to properly boot with certain CPUs disactivated. However, if
we later decide to re-activate said CPUs, we will barf until we decide to
implement CPU spinon/spinoff callback hooks to allow for said CPUs' per-CPU
containers to get configured on their activation.

Reported by: mjacob
Partially (sys/ diffs) Submitted by: mjacob


79265 05-Jul-2001 dillon

Move vm_page_zero_idle() from machine-dependant sections to a
machine-independant source file, vm/vm_zeroidle.c. It was exactly the
same for all platforms and updating them all was getting annoying.


79263 04-Jul-2001 dillon

Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc).
Also removed some spl's and added some VM mutexes, but they are not actually
used yet, so this commit does not really make any operational changes
to the system.

vm_page.c relates to vm_page_t manipulation, including high level deactivation,
activation, etc... vm_pageq.c relates to finding free pages and aquiring
exclusive access to a page queue (exclusivity part not yet implemented).
And the world still builds... :-)


79224 04-Jul-2001 dillon

With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage). Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.


79123 03-Jul-2001 jhb

Allow Giant to be recursed when a process terminates.


79053 01-Jul-2001 imp

Don't need the .keep_me files. Obrien and I committed past each other.

Add 0-9 to the list of possible kernel names at matsushita-san's
suggestion.

Submitted by: Makoto MATSUSHITA-san <matusita@jp.FreeBSD.org>


79038 01-Jul-2001 benno

Don't include machine/autoconf.h for now. It's not used and is breaking the
build.


79037 01-Jul-2001 benno

Register definitions for the OpenPIC used in various models of
iMac/PowerMac/iBook/PowerBook.

Obtained from: NetBSD


79036 01-Jul-2001 benno

Add TRAPF_* macros required by MI-ification of ast() and userret().

Submitted by: Mark Peek <mark@whistle.com>


79030 30-Jun-2001 obrien

Ensure sys/${MACHINE}/compile/FOO exists

Reviewed by: arch, imp, peter, and the USENIX terminal room secret kernel cabal


79026 30-Jun-2001 imp

Really do proper keepme files in the compile directories. Use
.cvsignore file for [A-Za-z]* to keep these directories around rather
than waste a file on .keepme. This should also make people's built
trees place nice with CVS.

Idea for .cvsignore: peter (although I suggested the regexp)
Pointed out by: Makoto MATSUSHITA-san <matusita@jp.FreeBSD.org>
Llama's costuming by: Fernamdo Llamas


79019 30-Jun-2001 obrien

Ensure sys/${MACHINE}/compile/FOO exists

Reviewed by: arch, imp, peter and
the USENIX terminal room secret kernel cabal


78983 29-Jun-2001 jhb

Move ast() and userret() to sys/kern/subr_trap.c now that they are MI.


78962 29-Jun-2001 jhb

Add a new MI pointer to the process' trapframe p_frame instead of using
various differently named pointers buried under p_md.

Reviewed by: jake (in principle)


78925 28-Jun-2001 benno

Put back the two semicolons I accidentally lost while reformatting this to
bring it closer to style(9).

Spotted by: Mark Peek <mark@whistle.com>


78883 27-Jun-2001 benno

Code for dealing with external interrupts.

Obtained from: NetBSD


78880 27-Jun-2001 benno

Fix comment breakage.


78878 27-Jun-2001 benno

Fix the atomic_*_32 operations. These were written before I had the ability
to test them properly and before I had a working knowledge of GCC asm
constraints.


78823 26-Jun-2001 benno

Don't initialise ret in atomic_cmpset_32.
Add more synchronisation.


78693 24-Jun-2001 benno

Fix asm constraints for atomic_cmpset_32. This fix may also be needed
elsewhere.


78467 19-Jun-2001 benno

More verbose version of identifycpu() which also contains many more CPU
versions/revisions.

Modified from the original patch to mark G3 and G4 processors as such.

Submitted by: Jeff Schottmiller <jeff@neoscale.com>


78388 17-Jun-2001 benno

The final commit for the first phase of PowerPC support.

This adds the config stuff needed to build kernels.

Reviewed by: obrien


78342 16-Jun-2001 benno

This commit (along with one pending in sys/dev/ofw and one in sys/conf) give
us our first minimal glimpse of PowerPC support.

With this code we can get to the "mountroot>" prompt on my Apple iMac. We
can't get any further due to lack of clock and interrupt handling, among other
things. This does however mean that pmap and VM are initialising.

We're fairly dependant on OpenFirmware at this point, but I hope to add
support for other classes of firmware at a later stage.

Reviewed by: obrien, dfr


78306 15-Jun-2001 obrien

Add CVS id.


77957 10-Jun-2001 benno

Bring in NetBSD code used in the PowerPC port.

Reviewed by: obrien, dfr
Obtained from: NetBSD


77944 09-Jun-2001 obrien

fix RCS ID style nit


77933 09-Jun-2001 obrien

ID style nit.


77930 09-Jun-2001 obrien

Style fix FreeBSD ID, and change continuation style slightly.


77626 02-Jun-2001 phk

Properly wrap mtx_intr_enable() macro in "do $bla while (0)"


77442 29-May-2001 jhb

GC #if 0'd calls to releasing and acquiring the old style giant kernel
lock.


77031 23-May-2001 ru

- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file
systems were repo-copied from sys/miscfs to sys/fs.

- Renamed the following file systems and their modules:
fdesc -> fdescfs, portal -> portalfs, union -> unionfs.

- Renamed corresponding kernel options:
FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.

- Install header files for the above file systems.

- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
Makefiles.


76932 21-May-2001 gallatin

catch these files up to their i386 neighbors to make alpha boot
prior to the vm_mtx


76785 18-May-2001 obrien

Make _BSD_TIME_T_ (time_t) an `int' rather than `long'. This will help
flag errors where programmers assume time_t is a long, which it is not on
64-bit platforms.

Submitted by: bde


76784 18-May-2001 obrien

Style changes -- revert ordering to mostly two revs ago.
Embellish some comments, fix tab'ing.

Requested by: bde


76701 16-May-2001 obrien

Consistently define the rune types.
Follow NetBSD's lead and add a _BSD_MBSTATE_T_ type.


76700 16-May-2001 obrien

Move the int typedefs to the top so they can be used in defining other types.
Ensure every platform has __offsetof.
Make multiple inclusion detection consistent with other
<platform>/include/*.h files.


76659 16-May-2001 jhb

Lock the procfs functions for doing a single step and reading/writing
registers better. Hold sched_lock not only for checking the flag but
also while performing the actual operation to ensure the process doesn't
get swapped out by another CPU while we the operation is being performed.


76442 10-May-2001 jhb

Trim lots of stuff that is now in MI code along with MD alpha code.


76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


76078 27-Apr-2001 jhb

Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delievered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwared_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by: jake, peter
Looked over by: eivind


76056 26-Apr-2001 jhb

Initialize p_md.md_kernnest to 1 for newly fork'd processes since they
start off in the kernel.


75925 24-Apr-2001 jhb

Add a new field 'md_kernnest' to the alpha machine dependent process
structure. This field keeps track of how many levels deep we are nested
into the kernel. The nesting level is bumped at the start of a trap,
interrupt, syscall, or exception and is decremented on return. This is
used to detect the case when the kernel is returning back to a kernel
context in exception_return(). If we are returning to the kernel we need
to update the globaldata pointer register saved in the stack frame in case
we have switched CPU's between taking the initial interrupt that saved the
frame and returning. If we don't do this fixup it is possible for a CPU to
use the wrong per-cpu data. On UP systems this is not a problem, so the
code is conditional on SMP.

A count was used instead of simply checking the process status register in
the frame during exception_return() since there are critical sections at
the very start and end of a trap, exception, or interrupt from userland in
which we could trash the t7 register being used in userland. The counter
is incremented after adn before these critical sections respectively so
that we will not overwrite the saved t7 register if we are interrupted
during one of these critical sections.


75873 23-Apr-2001 mjacob

Fix includes so it compiles again.


75681 18-Apr-2001 jhb

Convert the protection of hte i8254 from critical_enter/exit like it is
on the x86.


75570 17-Apr-2001 jhb

Blow away the panic mutex in favor of using a single atomic_cmpset() on a
panic_cpu shared variable. I used a simple atomic operation here instead
of a spin lock as it seemed to be excessive overhead. Also, this can avoid
recursive panics if, for example, witness is broken.


74912 28-Mar-2001 jhb

Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.


74900 28-Mar-2001 jhb

- Switch from using save/disable/restore_intr to using critical_enter/exit
and change the u_int mtx_saveintr member of struct mtx to a critical_t
mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules. Change all the
_mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
parameters and rename them to use a prefix of two underscores. Inside
of kern_mutex.c, generate wrapper functions for
_mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
that are called from modules. The macros mtx_{un,}lock_{spin,}_flags()
are mapped to the __mtx_* macros inside of the kernel to inline the
usual case of mutex operations and map to the internal _mtx_* functions
in the module case so that modules will use WITNESS and KTR logging if
the kernel is compiled with support for it.


74891 28-Mar-2001 jhb

- Include <machine/prom.h> to get the prototype for prom_halt().
- If there is no gdb device, just return without trying to return any
value since gdb_handle_exception() returns void.
- When calling prom_halt(), pass in a value telling it to actually halt
and not to randomly choose whether or not to halt or reboot depending on
whatever value happened to be in a0 when the call was made.


74746 24-Mar-2001 ume

Unbreak build on alpha.
- Move in_port_t to sys/types.h.
- Nuke in_addr_t from each endian.h.

Reported by: jhb


74384 17-Mar-2001 peter

Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash).
Make the name cache hash as well as the nfsnode hash use it.

As a special tweak, create an unsigned version of register_t. This allows
us to use a special tweak for the 64 bit versions that significantly
speeds up the i386 version (ie: int64 XOR int64 is slower than int64
XOR int32).

The code layout is a little strange for the string function, but I was
able to get between 5 to 10% improvement over the original version I
started with. The layout affects gcc code generation choices and this way
was fastest on x86 and alpha.

Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is
around 45% faster with -march=pentiumpro on a p6 cpu.


74271 15-Mar-2001 gallatin

remove bogus check -- for kernel threads we fork off of proc0, not curproc
This was causing panics when modules which create kthreads were loaded
after boot.

pointed out by: jake, jhb


74016 09-Mar-2001 jhb

Fix mtx_legal2block. The only time that it is bad to block on a mutex is
if we hold a spin mutex, since we can trivially get into deadlocks if we
start switching out of processes that hold spinlocks. Checking to see if
interrupts were disabled was a sort of cheap way of doing this since most
of the time interrupts were only disabled when holding a spin lock. At
least on the i386. To fix this properly, use a per-process counter
p_spinlocks that counts the number of spin locks currently held, and
instead of checking to see if interrupts are disabled in the witness code,
check to see if we hold any spin locks. Since child processes always
start up with the sched lock magically held in fork_exit(), we initialize
p_spinlocks to 1 for child processes. Note that proc0 doesn't go through
fork_exit(), so it starts with no spin locks held.

Consulting from: cp


73922 07-Mar-2001 jhb

Use the proc lock to protect p_pptr when waking up our parent in cpu_exit()
and remove the mpfixme() message that is now fixed.


72906 22-Feb-2001 jhb

Rename switch_trampoline() to fork_trampoline() on the alpha and ia64.

Suggested by: dfr


72897 22-Feb-2001 jhb

GC unused and now obsolete assertion macros.


72746 20-Feb-2001 jhb

- Don't call clear_resched() in userret(), instead, clear the resched flag
in mi_switch() just before calling cpu_switch() so that the first switch
after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of
cpu_switch(), specifically set the p_oncpu and p_lastcpu members of
proc in mi_switch(), and handle the sched_lock state change across a
context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we
have to setup an initial state for sched_lock in fork_exit() before we
release it.


72569 17-Feb-2001 ume

Correct disordering which is corresponding to bde's fix to
i386/include/ansi.h.


72510 15-Feb-2001 ume

Correct 2nd argument of getnameinfo(3) to socklen_t.

Reviewed by: itojun


72358 11-Feb-2001 markm

RIP <machine/lock.h>.

Some things needed bits of <i386/include/lock.h> - cy.c now has its
own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK()
has been moved to <i386/include/apic.h> (AKA <machine/apic.h>).
Reviewed by: jhb


72276 10-Feb-2001 jhb

- Make astpending and need_resched process attributes rather than CPU
attributes. This is needed for AST's to be properly posted in a preemptive
kernel. They are backed by two new flags in p_sflag: PS_ASTPENDING and
PS_NEEDRESCHED. They are still accesssed by their old macros:
aston(), astoff(), etc. For completeness, an astpending() macro has been
added to check for a pending AST, and clear_resched() has been added to
clear need_resched().
- Rename syscall2() on the x86 back to syscall() to be consistent with
other architectures.


72274 10-Feb-2001 jhb

Add a macro mtx_intr_enable() to alter a spin lock such that interrupts
will be enabled when it is released.


72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


71881 31-Jan-2001 dfr

* Move exception_return to exception.s which is a more logical home for it.
* Optimise the return path for syscalls so that they only restore a minimal
set of registers instead of performing a full exception_return.

A new flag in the trapframe indicates that the frame only holds partial
state. When it is necessary to perform a full state restore (e.g. after an
execve or signal), the flag is cleared to force a full restore.


71694 26-Jan-2001 jhb

Update some comments, s0 in the pcb of a child returning from fork1() is
now passed in as a0 to fork_exit() and and s2 is passed in as a1.


71604 24-Jan-2001 jhb

- Change fork_exit() to take a pointer to a trapframe as its 3rd argument
instead of a trapframe directly. (Requested by bde.)
- Convert the alpha switch_trampoline to call fork_exit() and use the MI
fork_return() instead of child_return().
- Axe child_return().


71576 24-Jan-2001 jasone

Convert all simplelocks to mutexes and remove the simplelock implementations.


71544 24-Jan-2001 jhb

- Rename the gd_cpuno member of struct globaldata to gd_cpuid.
- Add a globaldata_register() prototype in the SMP case.


71541 24-Jan-2001 jhb

- Proc locking.
- P_INMEM -> PS_INMEM.


71536 24-Jan-2001 jhb

cpuno -> cpuid.


71352 21-Jan-2001 jasone

Move most of sys/mutex.h into kern/kern_mutex.c, thereby making the mutex
inline functions non-inlined. Hide parts of the mutex implementation that
should not be exposed.

Make sure that WITNESS code is not executed during boot until the mutexes
are fully initialized by SI_SUB_MUTEX (the original motivation for this
commit).

Submitted by: peter


71337 21-Jan-2001 jake

Make intr_nesting_level per-process, rather than per-cpu. Setup
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context. This is also necessary
in order to context switch from sched_ithd() directly.

Reviewed By: peter


70952 12-Jan-2001 jake

Remove unused per-cpu variables inside_intr and ss_eflags.


70928 11-Jan-2001 jake

- Remove compatibility macros for accessing per-cpu variables.
__FreeBSD_version 500015 can be used to detect their disappearance.
- Move the symbols for SMP_prvspace and lapic from globals.s to
locore.s.
- Remove globals.s with extreme prejudice.


70786 08-Jan-2001 obrien

Remove seconds types we don't use that came in thru the NetBSD heiratage.


70741 07-Jan-2001 benno

PowerPC atomic operation functions.
Some of these are dependant on an inline function (powerpc_mb()) that is
yet to come.

Reviewed by: obrien


70740 07-Jan-2001 benno

PowerPC assembler #defines.

Reviewed by: obrien
Obtained from: NetBSD


70723 06-Jan-2001 jake

Implement accessors for per-cpu variables which don't depend on the
symbols in globals.s.

PCPU_GET(name) returns the value of the per-cpu variable
PCPU_PTR(name) returns a pointer to the per-cpu variable
PCPU_SET(name, val) sets the value of the per-cpu variable

In general these are not yet used, compatibility macros remain.

Unifdef SMP struct globaldata, this makes variables such as cpuid
available for UP as well.

Rebuilding modules is probably a good idea, but I believe old
modules will still work, as most of the old infrastructure
remains.


70587 02-Jan-2001 obrien

PowerPC platform-specific definitions (modeled on sys/i386/include/setjmp.h)


70586 02-Jan-2001 obrien

PowerPC platform-specific definitions (modeled on sys/i386/include/types.h)


70585 02-Jan-2001 obrien

Minor style tweaks.


70584 02-Jan-2001 obrien

PowerPC platform-specific definitions (modeled on sys/i386/include/param.h)


70583 01-Jan-2001 obrien

MP shells for the PowerPC platform.


70578 01-Jan-2001 obrien

PowerPC platform-specific page size setting.


70576 01-Jan-2001 obrien

PowerPC platform-specific definitions.

Obtained from: NetBSD (parts)


70572 01-Jan-2001 obrien

Shells for the atomic operations FreeBSD needs.
This is just waiting for a budding PowerPC ASM guy to fill in the blanks.


70571 01-Jan-2001 obrien

PowerPC platform-specific type definitions.


70566 01-Jan-2001 obrien

PowerPC specific ELF ABI definitions.


70317 23-Dec-2000 jake

Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by: jhb, -smp@


69805 09-Dec-2000 mjacob

Store in globaldata our CPU ID#. Provide a lock for panics - only one
CPU can panic at a time.
Obtained from:Andrew Gallatin <gallatin@cs.duke.edu>


69783 08-Dec-2000 msmith

Next phase in the PCI subsystem cleanup.

- Move PCI core code to dev/pci.
- Split bridge code out into separate modules.
- Remove the descriptive strings from the bridge drivers. If you
want to know what a device is, use pciconf. Add support for
broadly identifying devices based on class/subclass, and for
parsing a preloaded device identification database so that if
you want to waste the memory, you can identify *anything* we know
about.
- Remove machine-dependant code from the core PCI code. APIC interrupt
mapping is performed by shadowing the intline register in machine-
dependant code.
- Bring interrupt routing support to the Alpha
(although many platforms don't yet support routing or mapping
interrupts entirely correctly). This resulted in spamming
<sys/bus.h> into more places than it really should have gone.
- Put sys/dev on the kernel/modules include path. This avoids
having to change *all* the pci*.h includes.


69586 05-Dec-2000 jake

Remove the last of the MD netisr code. It is now all MI. Remove
spending, which was unused now that all software interrupts have
their own thread. Make the legacy schednetisr use an atomic op
for setting bits in the netisr mask.

Reviewed by: jhb


69488 01-Dec-2000 gallatin

acquire/release Giant in vm_page_zero_idle(), like on i386

Discused with: jhb


68900 19-Nov-2000 dfr

Convert various calls to splhigh() to disable_intr() since splhigh() is
now a no-op.


68899 19-Nov-2000 dfr

We don't need <stddef.h> for offsetof() any more.


68784 15-Nov-2000 jhb

Add the 'witness_spin_check' per-CPU variable.


68763 15-Nov-2000 jhb

Fix all the interrupt enabled/disabled assertions which were backwards.


68762 15-Nov-2000 jhb

Don't perform an mi_switch() when we release Giant during cpu_exit(). We
are about to call cpu_switch() anyways.

Found by: witness


68549 10-Nov-2000 benno

Beginnings of the powerpc machine dependant includes.

Reviewed by: obrien
Obtained from: NetBSD


68330 04-Nov-2000 obrien

Our SHRT_MIN definition was actually 4 bits too big.

Submitted by: Bradley T. Hughes <bhughes@trolltech.com>


67551 25-Oct-2000 jhb

- Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.

Submitted by: cp


67487 24-Oct-2000 obrien

* Update comments
* convert decimal constants to hex
Submitted by: bde

* Add ISO-C99 long long limits


67473 23-Oct-2000 mjacob

Move bogus proc reference stuff into <machine/globals.h>. There is no
more include file including <sys/proc.h>, but there still is this wonky
and (causes warnings on i386) reference in globals.h.

CURTHD is now defined in <machine/globals.h> as well. The correct thing
to do is provide a platform function for this.


67403 20-Oct-2000 jhb

Define the mtx_legal2block() macro used in the witness code that managed
to get lost during the MI mutex conversion.

Reported by: Steve Kargl <sgk@troutmask.apl.washington.edu>


67398 20-Oct-2000 jhb

Fix a braino in the ASS_SIEN() macro in the MUTEX_DEBUG case by using
mtx_saveintr instead of saveintr.


67393 20-Oct-2000 jhb

Catch up to some of the changes to _getlock_spin_block. Specifically,
use _obtain_lock() instead of a manual atomic_cmpset_ptr.


67365 20-Oct-2000 jhb

Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h


67358 20-Oct-2000 jhb

- machine/mutex.h -> sys/mutex.h
- Catch up to the MI mutex structure due to saveflags,saveipl,savepsr
becoming saveintr.


67352 20-Oct-2000 jhb

- Make the mutex code almost completely machine independent. This greatly
reducues the maintenance load for the mutex code. The only MD portions
of the mutex code are in machine/mutex.h now, which include the assembly
macros for handling mutexes as well as optionally overriding the mutex
micro-operations. For example, we use optimized micro-ops on the x86
platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option. In the new code,
mtx_assert() only depends on INVARIANTS, allowing other kernel developers
to have working mutex assertiions without having to include all of the
mutex debugging code. The SMP_DEBUG kernel option has been renamed to
MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack. Instead, we dynamically allocate
seperate mtx_debug structures on the fly in mtx_init, except for mutexes
that are initiated very early in the boot process. These mutexes
are declared using a special MUTEX_DECLARE() macro, and use a new
flag MTX_COLD when calling mtx_init. This is still somewhat hackish,
but it is less evil than the mtx_f filler struct, and the mtx struct is
now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
operations on the mutex mtx_lock field to make it easier for other archs
to override/optimize mutex ops if needed. These new tiny ops also clean
up the code in some places by replacing long atomic operation function
calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
to obtain a sleep mutex. Calling mi_switch() would bogusly release
Giant before switching to the next process. Instead, inline most of the
code from mi_switch() in the mtx_enter_hard() function. Note that when
we finally kill Giant we can back this out and go back to calling
mi_switch().


67158 15-Oct-2000 phk

Move DELAY() from <machine/clock.h> to <sys/systm.h>


66716 06-Oct-2000 jhb

- Change fast interrupts on x86 to push a full interrupt frame and to
return through doreti to handle ast's. This is necessary for the
clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
This is needed because both hardclock() and statclock() need to run in
the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
during the clock interrupt. Instead, use two p_flag's in the proc struct
to mark the current process as having a pending SIGVTALRM or a SIGPROF
and let them be delivered during ast() when hardclock() has finished
running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was
broken on the x86 if it was turned on since cpl is gone. It's only use
was to bogusly run softclock() directly during hardclock() rather than
scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
mutex. Since the spin mutex already handles disabling/restoring
interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
signals in hardclock() that are to be delivered in ast().

Submitted by: jakeb (making statclock safe in a fast interrupt)
Submitted by: cp (concept of delaying signals until ast())


66698 05-Oct-2000 jhb

- Heavyweight interrupt threads on the alpha for device I/O interrupts.
- Make softinterrupts (SWI's) almost completely MI, and divorce them
completely from the x86 hardware interrupt code.
- The ihandlers array is now gone. Instead, there is a MI shandlers array
that just contains SWI handlers.
- Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.

Submitted by: dfr


66614 04-Oct-2000 jasone

Reduce userland namespace polution.


66573 03-Oct-2000 dfr

Clear pcb_schednest in cpu_fork() for the child process. This is
is necessary since the child's call stack only includes one recursive
hold of sched_lock.


66280 23-Sep-2000 jasone

#include <sys/proc.h> in order to get curproc. This seems to be the lesser
of two evils; the greater evil is requiring sys/proc.h to be included
before including machine/mutex.h.


66277 22-Sep-2000 ps

Remove the NCPU, NAPIC, NBUS, NINTR config options. Make NAPIC,
NBUS, NINTR dynamic and set NCPU to a maximum of 16 under SMP.

Reviewed by: peter


65856 14-Sep-2000 jhb

Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just
use struct mtx, struct witness, and struct witness_blessed.

Requested by: bde


65821 13-Sep-2000 jhb

- Fix spinlock exiting to handle recursion properly and only enable
interrupts at the proper time.
- Remove an uneeded test and just always set the MTX_RECURSE bit when
recursing on a sleep lock.


65789 12-Sep-2000 dfr

Really disable interrupts for spin mutexes instead of just pretending.


65727 11-Sep-2000 jhb

The alpha doesn't have a eflags register, so don't refer to it here.


65651 09-Sep-2000 jasone

Style cleanups. No functional changes.


65650 09-Sep-2000 jasone

Add file and line arguments to WITNESS_ENTER() and WITNESS_EXIT, since
__FILE__ and __LINE__ don't get expanded usefully in inline functions.

Add const to all witness*() arguments that are filenames.


65628 09-Sep-2000 jhb

Add missing \'s to multline macros used for assertions.


65623 08-Sep-2000 jasone

Use inline functions instead of macros for mtx_enter(), mtx_try_enter(),
and mtx_exit(). This change tracks the i386 version.

Rename mtx_enter(), mtx_try_enter(), and mtx_exit() and wrap them with cpp
macros that expand to pass filename and line number information. This is
necessary since we're using inline functions instead of macros now.

Add const to the filename pointers passed througout the mtx and witness
code.


65557 07-Sep-2000 jasone

Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).

Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh


61825 19-Jun-2000 gallatin

Support bounce buffers for ISA DMA on the alpha. This is required for the
irongate chipset (used in the UP1000) which does not support scatter/gather
DMA. We'll still use scatter gather if the core logic chipset supports it.

Reviewed by: dfr


61540 11-Jun-2000 alc

cpu_fork(): Check "flags" before dereferencing "p2". Otherwise,
the call "vm_fork(p1, 0, flags);" early in fork1 can cause a kernel
panic.


60334 10-May-2000 jhb

Handle PCI devices that actually use an ISA IRQ for the cia and tsunami
chipsets. An example of this is the USB controller on these chipsets.
With this, I can now use USB devices on the test Alpha I am borrowing at
the moment.

Reviewed by: dfr, obrien


60041 05-May-2000 phk

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


58345 20-Mar-2000 phk

Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd. The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue. It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users: Greg has not had time to test this yet, be careful.


57325 18-Feb-2000 sos

Update the ata driver to take more advantage of newbus, this
was needed to make attach/detach of devices work, which is
needed for the PCCARD support.
(PCCARD support is still not working though, more to come on that)

Support the CMD646 chip which is used on many alphas, sadly only
in WDMA2 mode, as the silicon is broken beyond belief for UDMA modes.

Lots of cosmetic fixes here and there.

Sorry for the size of this megapatchfromhell but it was not
possible otherwise...

newbus patches based on work from: dfr (Doug Rabson)


56096 16-Jan-2000 gallatin

The kernel side of per-process unaligned access control (printing, fixing &
delivering SIGBUS). This will allow a non-superuser to control unaligned
access behaviour on a per-process basis once a userland control program
(uac) is written.

Reviewed by: obrien
Tested by: obrien


55612 08-Jan-2000 marcel

Sync with i386

\begin{quote}
Compile genassym.c with ordinary ${CFLAGS}. The (small) needs for
${GEN_CFLAGS} and -U_KERNEL became negative when all all the
genassym.c's were converted to be cross-built.

Makefile.*:
- Cleanups associated with the old genassym.
- Fixed deprecated spelling of ${.IMPSRC} as "$<".
\end{quote}

Submitted by: bde


55571 07-Jan-2000 marcel

Use genassym(1).


55534 07-Jan-2000 peter

Revert back all the way to 1.11 - the problem was that Makefile.alpha was
out of sync.


55526 07-Jan-2000 msmith

Don't include <sys/systm.h>. It doesn't do anything, and with recent
changes it breaks building genassym.


55329 03-Jan-2000 mjacob

untangle some includes and clean up for compilation cleanliness.


55205 29-Dec-1999 peter

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


54207 06-Dec-1999 peter

Make this compile again. (missing #include for RFPROC)


54188 06-Dec-1999 luoqi

User ldt sharing.


53593 22-Nov-1999 peter

Use %ll instead of %q as gcc moans bitterly about it.


53086 10-Nov-1999 dfr

Re-organise the code which manages the owner of the FP state (fpcurproc).
The old code was spread out through the machdep code and was sloppy about
enabling and disabling the FEN bit (which controls access to the FP
register set). This caused a DIAGNOSTIC warning "DANGER WILL ROBINSON:
FEN SET IN cpu_fork!" sometimes when operating under high loads and could
conceivably lead to processes getting incorrect FP results.

The new code is much more strict about the FEN bit and makes sure that
*only* fpcurproc ever has it enabled. This also allows us to remove a
section of code from the exception_return path which might improve
performance marginally.

Reviewed by: gallatin


52647 30-Oct-1999 alc

The core of this patch is to vm/vm_page.h. The effects are two-fold: (1) to
eliminate an extra (useless) level of indirection in half of the page
queue accesses and (2) to use a single name for each queue throughout,
instead of, e.g., "vm_page_queue_active" in some places and
"vm_page_queues[PQ_ACTIVE]" in others.

Reviewed by: dillon


52635 29-Oct-1999 phk

useracc() the prequel:

Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.


52243 14-Oct-1999 dfr

* Implement bus_set/get/delete_resource for pci.
* Change the hack used on the alpha for mapping devices into DENSE or
BWX memory spaces to a simpler one. Its still a hack and should be
a seperate api to explicitly map the resource.
* Add $FreeBSD$ as necessary.


51792 29-Sep-1999 marcel

sigset_t change (part 3 of 5)
-----------------------------

By introducing a new sigframe so that the signal handler operates
on the new siginfo_t and on ucontext_t instead of sigcontext, we
now need two version of sendsig and sigreturn.

A flag in struct proc determines whether the process expects an
old sigframe or a new sigframe. The signal trampoline handles
which sigreturn to call. It does this by testing for a magic
cookie in the frame.

The alpha uses osigreturn to implement longjmp. This means that
osigreturn is not only used for compatibility with existing
binaries. To handle the new sigset_t, setjmp saves it in
sc_reserved (see NOTE).

the struct sigframe has been moved from frame.h to sigframe.h
to handle the complex header dependencies that was caused by
the new sigframe.

NOTE: For the i386, the size of jmp_buf has been increased to hold
the new sigset_t. On the alpha this has been prevented by
using sc_reserved in sigcontext.


51474 20-Sep-1999 dillon

Fix bug in pipe code relating to writes of mmap'd but illegal address
spaces which cross a segment boundry in the page table. pmap_kextract()
is not designed for access to the user space portion of the page
table and cannot handle the null-page-directory-entry case.

The fix is to have vm_fault_quick() return a success or failure which
is then used to avoid calling pmap_kextract().


50477 28-Aug-1999 peter

$Id$ -> $FreeBSD$


50458 27-Aug-1999 gallatin

Fix the child's return path from fork so that fork will return 0
in the child. This corrects a problem where linux/alpha binaries see
the child's return value of fork as the parent's pid. This happens because
linux/alpha binaries apparently check the return value directly, rather
than looking for a non-zero value in a4, as *BSD & OSF/1 do.

Reviewed by:dfr@nlsystems.com


50085 20-Aug-1999 gallatin

Fix a nasty kld bug where modules with objects of type GLOB_DAT which had
non-zero addends were being loaded incorrectly


50037 19-Aug-1999 peter

Update for MI switch code, and trim a heap of unused (I believe) entries.


49444 05-Aug-1999 jdp

Sync with alc's revision 1.125 of i386/i386/vm_machdep.c. This
fixes the kernel build breakage.


49157 28-Jul-1999 dfr

Add support for SYS_RES_DENSE and SYS_RES_BWX resource types. These are
equivalent to SYS_RES_MEMORY for x86 but for alpha, the rman_get_virtual()
address of the resource is initialised to point into either dense-mapped
or bwx-mapped space respectively, allowing direct memory pointers to be
used to device memory.

Reviewed by: Andrew Gallatin <gallatin@cs.duke.edu>


48974 22-Jul-1999 alc

Reduce the number of "magic constants" used for page coloring
by one: PQ_PRIME2 and PQ_PRIME3 are used to accomplish the same
thing at different places in the kernel. Drop PQ_PRIME3.


48840 16-Jul-1999 dfr

Handle R_ALPHA_NONE relocations in KLD.


48713 09-Jul-1999 mjacob

Add in dbregs stubs that a committer for changes on the i386 ought to have done.
PR: 12579


48691 09-Jul-1999 jlemon

Implement support for hardware debug registers on the i386.

Submitted by: Brian Dean <brdean@unx.sas.com>


48427 01-Jul-1999 peter

Add alpha_platform_setup_ide_intr() and alpha_platform_assign_pciintr()
prototypes.


48423 01-Jul-1999 peter

Declare busdma_swi() like on i386 version.


48391 01-Jul-1999 peter

Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface. kproc_start is still available
as a SYSINIT() hook. This allowed simplification of chunks of the
sysinit code in the process. This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process. It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit. This means that nfsiod
doesn't need to be in /sbin and is always "available". This is a fair bit
easier to do outside of the SYSINIT_KT() framework.


48305 28-Jun-1999 peter

Revert back to not using -DKERNEL


48244 26-Jun-1999 peter

Make genassym compile - the recent buf locking changes meant that more
things from #ifdef KERNEL were needed.


47869 10-Jun-1999 dt

Replace my previous fix of saving the FP state with a much simpler one: when
we swap out fpcurproc, save its FP state.

Suggested by: bde


47840 08-Jun-1999 dt

Keep fpcurproc locked in memory, so that we always can save the FP state
correctly.

This should fix the "pmap_changebit didn't" panic that some people see.

Reviewed by: dfr


47389 22-May-1999 bde

Fixed style bugs in previous commit.


47347 20-May-1999 ache

Set CHAR_{MIN,MAX} according to -funsigned-char flag given or not

PR: 11627
Submitted by: Petr Lampa <lampa@fee.vutbr.cz>


45958 23-Apr-1999 dt

Fixed several (not all) warnings.


45889 20-Apr-1999 dt

Added consts to cpu_set_fork_handler prototype. (Follow i386 version.)


45821 19-Apr-1999 peter

unifdef -DVM_STACK - it's been on for a while for x86 and was checked
and appeared to be working for the Alpha some time ago.


44327 28-Feb-1999 bde

Removed all traces of `p_switchtime'. The relevant timestamp is per-cpu,
not per-process. Keep it in `switchtime' consistently.

It is now clear that the timestamp is always valid in fork_trampoline()
except when the child is running on a previously idle cpu, which
can only happen if there are multiple cpus, so don't check or set
the timestamp in fork_trampoline except in the (i386) SMP case.
Just remove the alpha code for setting it unconditionally, since
there is no SMP case for alpha and the code had rotted.

Parts reviewed by: dfr, phk


43758 08-Feb-1999 dillon

Adjust idle zero-page fill hysteresis based on tests. Use 2/3 and 4/5
zero-fill levels.

Adjust comment for ozfod in vmmeter.h - this counter represents
non-optimal ( on the fly ) zero fills, not prefills.


43753 08-Feb-1999 dillon

Add hysteresis to alpha version of vm_page_zero_idle().


43752 08-Feb-1999 dillon

Rip out PQ_ZERO queue. PQ_ZERO functionality is now combined in with
PQ_FREE. There is little operational difference other then the kernel
being a few kilobytes smaller and the code being more readable.

* vm_page_select_free() has been *greatly* simplified.
* The PQ_ZERO page queue and supporting structures have been removed
* vm_page_zero_idle() revamped (see below)

PG_ZERO setting and clearing has been migrated from vm_page_alloc()
to vm_page_free[_zero]() and will eventually be guarenteed to remain
tracked throughout a page's life ( if it isn't already ).

When a page is freed, PG_ZERO pages are appended to the appropriate
tailq in the PQ_FREE queue while non-PG_ZERO pages are prepended.
When locating a new free page, PG_ZERO selection operates from within
vm_page_list_find() ( get page from end of queue instead of beginning
of queue ) and then only occurs in the nominal critical path case. If
the nominal case misses, both normal and zero-page allocation devolves
into the same _vm_page_list_find() select code without any specific
zero-page optimizations.

Additionally, vm_page_zero_idle() has been revamped. Hysteresis has been
added and zero-page tracking adjusted to conform with the other changes.
Currently hysteresis is set at 1/3 (lo) and 1/2 (hi) the number of free
pages. We may wish to increase both parameters as time permits. The
hysteresis is designed to avoid silly zeroing in borderline allocation/free
situations.


43209 26-Jan-1999 julian

Mostly remove the VM_STACK OPTION.
This changes the definitions of a few items so that structures are the
same whether or not the option itself is enabled. This allows
people to enable and disable the option without recompilng the world.

As the author says:

|I ran into a problem pulling out the VM_STACK option. I was aware of this
|when I first did the work, but then forgot about it. The VM_STACK stuff
|has some code changes in the i386 branch. There need to be corresponding
|changes in the alpha branch before it can come out completely.

what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in. This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches. The vm_map_entry will then always have one
|extra element (avail_ssize). It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c. This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code the the alpha branch, still
|protected behind the VM_STACK switch. I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|

Submitted by: "Richard Seaman, Jr." <dick@tar.com>


42175 30-Dec-1998 dfr

Various changes to support OSF1 emulation:

* Move the user stack from VM_MAXUSER_ADDRESS to a place below the 32bit
boundary (needed to support 32bit OSF programs). This should also save
one pagetable per process.
* Add cvtqlsv to the set of instructions handled by the floating point
software completion code.
* Disable all floating point exceptions by default.
* A minor change to execve to allow the OSF1 image activator to support
dynamic loading.


41868 16-Dec-1998 bde

Removed bogus casts of USRSTACK and/or the other operand in binary
expressions involving USRSTACK.


41499 04-Dec-1998 dfr

Implement 'software completion' for floating point arithmetic. On the
alpha, operations involving non-finite numbers or denormalised numbers
or operations which should generate such numbers will cause an arithmetic
exception. For programs which follow some strict code generation rules,
the kernel trap handler can then 'complete' the operation by emulating
the faulting instruction.

To use software completion, a program must be compiled with the arguments
'-mtrap-precision=i' and '-mfp-trap-mode=su' or '-mfp-trap-mode=sui'.
Programs compiled in this way can use non-finite and denormalised numbers
at the expense of slightly less efficient code generation of floating
point instructions. Programs not compiled with these options will receive
a SIGFPE signal when non-finite or denormalised numbers are used or
generated.

Reviewed by: John Polstra <jdp@polstra.com>


41181 15-Nov-1998 dfr

* Add hooks to allow the X server to access I/O ports and memory.
* Update drivers to the latest version of the bus interface.

The ISA drivers' use of the new resource api is minimal. Garrett has
some much cleaner drivers which should be more easily shared between
i386 and alpha. This has only been tested on cia based machines. It
should work on lca and apecs but I might have broken something.


40751 30-Oct-1998 msmith

Add the ability to specify where on the at_shutdown queue a handler is
installed.

Remove cpu_power_down, and replace it with an entry at the end of the
SHUTDOWN_FINAL queue in the only place it's used (APM).

Submitted by: Some ideas from Bruce Walter <walter@fortean.com>


40713 29-Oct-1998 wollman

A small fragment of new ISA framework: manifest constants for the resources
implemented by the i386 root nexus.


40517 18-Oct-1998 dfr

R_ALPHA_RELATIVE relocations need to add the value to the existing memory
contents.


40435 16-Oct-1998 peter

*gulp*. Jordan specifically OK'ed this..

This is the bulk of the support for doing kld modules. Two linker_sets
were replaced by SYSINIT()'s. VFS's and exec handlers are self registered.
kld is now a superset of lkm. I have converted most of them, they will
follow as a seperate commit as samples.
This all still works as a static a.out kernel using LKM's.


40377 15-Oct-1998 dfr

Change a bogus cast to the correct one.


40342 14-Oct-1998 peter

Typo fix.


40341 14-Oct-1998 peter

Initial attempt to update the Alpha loader and kernel to use the machine
independent elf loader and have access to kld modules. Jordan and I were
not sure how to create boot floppies, and the things we tried just made
SRM laugh in our faces - but it was upset at boot1 which was not touched
by these changes. Essentially this has been untested. :-(

What this does is to steal the last three slots from the nine spare longs
in the bootinfo_v1 struct to pass the module base pointer through.

The startup code now to set up and fills in the module and environment
structures, hopefully close enough to the i386 layout to be able to use
the same kernel code. We now pass though the updated end of the kernel
space used, rather than _end. (like the i386).

If this does not work, it needs to be beaten into shape pronto. Otherwise
it should be backed out before 3.0.

Pre-approved in principle by: dfr


39995 06-Oct-1998 dfr

Add support for adjkerntz (largely untested).


39676 26-Sep-1998 dfr

Automatically detect which disk was booted and change the root to that disk.


39197 14-Sep-1998 jdp

Add new functions fill_fpregs() and set_fpregs(), like fill_regs()
and set_regs() but for the floating point register state. The code
is stolen from procfs_machdep.c, and moved out of there into
machdep.c.

These functions are needed for generating ELF core dumps.


39072 11-Sep-1998 dfr

Machine dependant relocations for KLD.


37832 22-Jul-1998 dfr

Add declaration of {aquire,release}_timer2().


37598 12-Jul-1998 dfr

Overhaul the spl system so that it actually works properly.


37597 12-Jul-1998 dfr

Don't bother calling pmap_emulate_reference() from cpu_fork(). It isn't
needed and it panics a DIAGNOSTIC kernel.


37585 12-Jul-1998 dfr

Add definition of p_switchtime.


37400 05-Jul-1998 dfr

Add declaration of the NetBSD/alpha bootinfo.


36972 14-Jun-1998 dfr

Major changes to the generic device framework for FreeBSD/alpha:

* Eliminate bus_t and make it possible for all devices to have
attached children.

* Support dynamically extendable interfaces for drivers to replace
both the function pointers in driver_t and bus_ops_t (which has been
removed entirely. Two system defined interfaces have been defined,
'device' which is mandatory for all devices and 'bus' which is
recommended for all devices which support attached children.

* In addition, the alpha port defines two simple interfaces 'clock'
for attaching various real time clocks to the system and 'mcclock'
for the many different variations of mc146818 clocks which can be
attached to different alpha platforms. This eliminates two more
function pointer tables in favour of the generic method dispatch
system provided by the device framework.

Future device interfaces may include:

* cdev and bdev interfaces for devfs to use in replacement for specfs
and the fixed interfaces bdevsw and cdevsw.

* scsi interface to replace struct scsi_adapter (not sure how this
works in CAM but I imagine there is something similar there).

* various tailored interfaces for different bus types such as pci,
isa, pccard etc.


36865 10-Jun-1998 dfr

Add missing copyrights. Thanks to Jason Thorpe for politely noting the
mistake...


36849 10-Jun-1998 dfr

Add initial support for the FreeBSD/alpha kernel. This is very much a
work in progress and has never booted a real machine. Initial
development and testing was done using SimOS (see
http://simos.stanford.edu for details). On the SimOS simulator, this
port successfully reaches single-user mode and has been tested with
loads as high as one copy of /bin/ls :-).

Obtained from: partly from NetBSD/alpha


36168 19-May-1998 tegge

Disallow reading the current kernel stack. Only the user structure and
the current registers should be accessible.
Reviewed by: David Greenman <dg@root.com>


22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


13611 24-Jan-1996 peter

Add commands for ptrace get/set registers.. (Same numbers as NetBSD)


6165 03-Feb-1995 bde

Don't define CLK_TCK here.

Uniformize idempotency ifdef.


1817 02-Aug-1994 dg

Added $Id$


1549 25-May-1994 rgrimes

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


1543 25-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources


1334 04-Apr-1994 wollman

First pass at adding locale support. This code only deals with the LC_CTYPE
class of locale data, but could be extended to handle other locale
classes, as well as message catalogues and other non-locale i18n
support.

I have left the old _ctype_ array in place, and moved the ctype.h
header to octype.h, so that existing shared binaries will still be
able to find and use it as they require.

See /usr/src/share/locale for information on how to create new locale
data files (eventually this procedure will be improved). I'd like to
have a family of locale files for various countries, languages, and
character sets, so please contribute some.

This code was originally written by Paul Borman and contributed to
4.4; I did the integration, and have somewhat tested it. crt0.c
probably ought to call setlocale() if it doesn't already, but I'd like
for people to create some locale files and try things manually first
before I make every program do this.


1216 26-Feb-1994 ache

Bump CLK_TCK to more precise value (128)
If you want more precise, use directly getrusage(),
because clock() emulated via it.


880 19-Dec-1993 alm

adding libc/quad:
added _QUAD_HIGH/LOW
added (U_)QUAD_MAX/MIN
(from NetBSD)


719 07-Nov-1993 wollman

Made all header files idempotent and moved incorrect common data from
headers into a related source file. Added cons.h as first step towards
moving i386/i386/cons.h to machine/cons.h where it belongs.


621 16-Oct-1993 rgrimes

Removed all patch kit headers, sccsid and rcsid strings, put $Id$ in, some
minor cleanup. Added $Id$ to files that did not have any version info, etc


5 12-Jun-1993 rgrimes

This commit was generated by cvs2svn to compensate for changes in r4,
which included commits to RCS files with non-trunk default branches.