History log of /freebsd-current/sys/dev/mpr/mpr.c
Revision Date Author Comments
# 264610a8 14-Dec-2023 Kenneth D. Merry <ken@FreeBSD.org>

mpr, mps: Establish busdma boundaries for memory pools

Most all of the memory used by the cards in the mpr(4) and mps(4)
drivers is required, according to the specs and Broadcom developers,
to be within a 4GB segment of memory.

This includes:

System Request Message Frames pool
Reply Free Queues pool
ReplyDescriptorPost Queues pool
Chain Segments pool
Sense Buffers pool
SystemReply message pool

We got a bug report from Dwight Engen, who ran into data corruption
in the BAE port of FreeBSD:

> We have a port of the FreeBSD mpr driver to our kernel and recently
> I found an issue under heavy load where a DMA may go to the wrong
> address. The test system is a Supermicro X10SRH-CLN4F with the
> onboard SAS3008 controller setup with 2 enterprise Micron SSDs in
> RAID 0 (striped). I have debugged the issue and narrowed down that
> the errant DMA is one that has a segment that crosses a 4GB
> physical boundary. There are more details I can provide if you'd
> like, but with the attached patch in place I can no longer
> re-create the issue.

> I'm not sure if this is a known limit of the card (have not found a
> datasheet/programming docs for the chip) or our system is just
> doing something a bit different. Any helpful info or insight would
> be welcome.

> Anyway, just thought this might be helpful info if you want to
> apply a similar fix to FreeBSD. You can ignore/discard the commit
> message as it is my internal commit (blkio is our own tool we use
> to write/read every block of a device with CRC verification which
> is how I found the problem).

The commit message was:

> [PATCH 8/9] mpr: fix memory corrupting DMA when sg segment crosses
> 4GB boundary

> Test case was two SSD's in RAID 0 (stripe). The logical disk was
> then partitioned into two partitions. One partition had lots of
> filesystem I/O and the other was initially filled using blkio with
> CRCable data and then read back with blkio CRC verify in a loop.
> Eventually blkio would report a bad CRC block because the physical
> page being read-ahead into didn't contain the right data. If the
> physical address in the arq/segs was for example 0x500003000 the
> data would actually be DMAed to 0x400003000.

The original patch was against mpr(4) before busdma templates were
introduced, and only affected the buffer pool (sc->buffer_dmat) in
the mpr(4) driver. After some discussion with Dwight and the
LSI/Broadcom developers and looking through the driver, it looks
like most of the queues in the driver are ok, because they limit
the memory used to memory below 4GB. The buffer queue and the chain
frames seem to be the exceptions.

This is pretty much the same between the mpr(4) and mps(4) drivers.

So, apply a 4GB boundary limitation for the buffer and chain frame pools
in the mpr(4) and mps(4) drivers.

Reported by: Dwight Engen <dwight.engen@gmail.com>
Reviewed by: imp
Obtained from: Dwight Engen <dwight.engen@gmail.com>
Differential Revision: <https://reviews.freebsd.org/D43008>


# 7d154c4d 22-Feb-2023 Alan Somers <asomers@FreeBSD.org>

mprutil: "fix user reply buffer (64)..." warnings

Depending on the card's firmware version, it may return different length
responses for MPI2_FUNCTION_IOC_FACTS. But the first part of the
response contains the length of the rest, so query it first to get the
length and then use that to size the buffer for the full response.

Also, correctly zero-initialize MPI2_IOC_FACTS_REQUEST. It only worked
by luck before.

PR: 264848
Reported by: Julien Cigar <julien@perdition.city>
MFC after: 1 week
Sponsored by: Axcient
Reviewed by: scottl, imp
Differential Revision: https://reviews.freebsd.org/D38739


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 444c6615 21-Apr-2023 Mariusz Zaborski <oshogbo@FreeBSD.org>

mpr: don't use hardcoded value in debug branch

Pointed out by: imp
Sponsored by: Klara Inc.


# ea6597c3 21-Apr-2023 Mariusz Zaborski <oshogbo@FreeBSD.org>

mpr: fix copying of event_mask

Before the commit 6cc44223cb6717795afdac4348bbe7e2a968a07d the
field event_mask was fully copied to the EventMasks field.
After this commit the event_mask (uint8_t) is 4 times casted to
EventMask (uint32_t). Because of that 24 bits of each event_mask array
is lost.

This commits brings back simple copying of field, and after words
converting 32 bits field to the requested endian.

I don't think we need more sophisticated method,
as the array is of size 4 (for 32 bits version).

Reviewed by: imp
MFC after: 1 week
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D39562


# 11778fca 16-Oct-2022 Kenneth D. Merry <ken@FreeBSD.org>

Fix mpr(4) panic during a firmware update.

Issue Description:
The RequestCredits field of IOCFacts got changed between the Phase23
firmware to Phase24 firmware. So as part of firmware update operation,
driver has to free the resources & pools which are created with the Phase23
Firmware's IOCFacts data (i.e. during driver load time) and has to
reallocate the resources and pools using Phase24's IOCFacts data. Here
driver has freed the interrupts but missed to reallocate the interrupts and
hence config page read operation is getting timed out and controller is
going for recursive reinit (controller reset) operations and leading to
kernel panic.

Fix:
Reallocate the interrupts if the interrupts are disabled as part of
firmware update/downgrade operation.

Submitted by: Sreekanth Ready <sreekanth.reddy@broadcom.com>
Tested by: ken
MFC after: 3 days


# c5041b4e 28-Apr-2022 Warner Losh <imp@FreeBSD.org>

mpr/mps: Add comment explaining state transition

When we can't load a request due to a shortage of chains, we complete
the command's cm. However, to avoid an assert in mp?_complete_command,
we transition its state to INQUEUE. This transition is legitimate
because this is the only error path that terminates a cm before it's
enqueued and the only other alternative would be an additional transient
state that would add complexity w/o adding value. Add a comment
explainging all this because otherwise the transition can look a bit
weird.

Sponsored by: Netflix


# 1849bc5f 07-Jan-2022 Alexander Motin <mav@FreeBSD.org>

mps/mpr: Relax doorbell polling precision.

It does not matter how often do we check firmware for crashes.

MFC after: 2 weeks


# ddeb702f 30-Nov-2021 Gordon Bergling <gbe@FreeBSD.org>

mpr(4): Fix a typo in a source code comment

- s/segement/segment/

MFC after: 3 days


# b776de67 10-Aug-2021 Alexander Motin <mav@FreeBSD.org>

Mark some sysctls as CTLFLAG_MPSAFE.

MFC after: 2 weeks


# 33755dbb 03-Jun-2021 Warner Losh <imp@FreeBSD.org>

mpr/mps: Minor state machine fix

When a DMA chain can't be loaded, set the state to STATE_INQUEUE so that
the mp[rs]_complete_command can properly fail the command.

Sponsored by: Netflix


# 175ad3d0 03-Jun-2021 Kenneth D. Merry <ken@FreeBSD.org>

Fix mpr(4) and mps(4) state transitions and a use-after-free panic.

When the mpr(4) and mps(4) drivers probe a SATA device, they issue an
ATA Identify command (via mp{s,r}sas_get_sata_identify()) before the
target is fully setup in the driver. The drivers wait for completion of
the identify command, and have a 5 second timeout. If the timeout
fires, the command is marked with the SATA_ID_TIMEOUT flag so it can be
freed later.

That is where the use-after-free problem comes in. Once the ATA
Identify times out, the driver sends a target reset, and then frees any
identify commands that have timed out. But, once the target reset
completes, commands that were queued to the drive are returned to the
driver by the controller.

At that point, the driver (in mp{s,r}_intr_locked()) looks up the
command descriptor for that particular SMID, marks it CM_STATE_BUSY and
sends it on for completion handling.

The problem at this stage is that the command has already been freed,
and put on the free queue, so its state is CM_STATE_FREE. If INVARIANTS
are turned on, we get a panic as soon as this command is allocated,
because its state is no longer CM_STATE_FREE, but rather CM_STATE_BUSY.

So, the solution is to not free ATA Identify commands that get stuck
until they actually return from the controller. Hopefully this works
correctly on older firmware versions. If not, it could result in
commands hanging around indefinitely. But, the alternative is a
use-after-free panic or assertion (in the INVARIANTS case).

This also tightens up the state transitions between CM_STATE_FREE,
CM_STATE_BUSY and CM_STATE_INQUEUE, so that the state transitions happen
once, and we have assertions to make sure that commands are in the
correct state before transitioning to the next state. Also, for each
state assertion, we print out the current state of the command if it is
incorrect.

mp{s,r}.c: Add a new sysctl variable, dump_reqs_alltypes,
that controls the behavior of the dump_reqs sysctl.
If dump_reqs_alltypes is non-zero, it will dump
all commands, not just the commands that are in the
CM_STATE_INQUEUE state. (You can see the commands
that are in the queue by using mp{s,r}util debug
dumpreqs.)

Make sure that the INQUEUE -> BUSY state transition
happens in one place, the mp{s,r}_complete_command
routine.

mp{s,r}_sas.c: Make sure we print the current command type in
command state assertions.

mp{s,r}_sas_lsi.c:
Add a new completion handler,
mp{s,r}sas_ata_id_complete. This completion
handler will free data allocated for an ATA
Identify command and free the command structure.

In mp{s,r}_ata_id_timeout, do not set the command
state to CM_STATE_BUSY. The command is still in
queue in the controller. Since we were blocking
waiting for this command to complete, there was
no completion handler previously. Set the
completion handler, so that whenever the command
does come back, it will get freed properly.

Do not free ATA Identify commands that have timed
out in mp{s,r}sas_add_device(). Wait for them
to actually come back from the controller.

mp{s,r}var.h: Add a dump_reqs_alltypes variable for the new
dump_reqs_alltypes sysctl.

Make sure we print the current state for state
transition asserts.

This was tested in the Spectra Logic test bed (as described in the
review), as well Netflix's Open Connect fleet (where panics dropped from
a dozen or two a month to zero).

Reviewed by: imp@ (who is handling the commit with ken's OK)
Sponsored by: Spectra Logic
Differential Revision: https://reviews.freebsd.org/D25476


# 71900a79 02-Mar-2021 Alfredo Dal'Ava Junior <alfredo@FreeBSD.org>

mpr: big-endian support

This fixes mpr driver on big-endian devices.
Tested on powerpc64 and powerpc64le targets using a SAS9300-8i card
(LSISAS3008 pci vendor=0x1000 device=0x0097)

Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br>
Reviewed by: luporl, alfredo, Sreekanth Reddy <sreekanth.reddy@broadcom.com> (by email)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25785


# cd853791 27-Nov-2020 Konstantin Belousov <kib@FreeBSD.org>

Make MAXPHYS tunable. Bump MAXPHYS to 1M.

Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225


# 4bc604dc 13-Oct-2020 Scott Long <scottl@FreeBSD.org>

Bring the request_descriptor union into harmony internally. No
functional change.


# 74c781ed 13-Sep-2020 Scott Long <scottl@FreeBSD.org>

Refine the busdma template interface. Provide tools for filling in fields
that can be extended, but also ensure compile-time type checking. Refactor
common code out of arch-specific implementations. Move the mpr and mps
drivers to this new API. The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.


# 577858c8 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

mpr: clean up empty lines in .c and .h files


# d2a5f081 27-Jul-2020 Mark Johnston <markj@FreeBSD.org>

mpr(4), mps(4): Stop checking for failures from malloc(M_WAITOK).

PR: 240545
Submitted by: Andrew Reiter <arr@watson.org>
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25766


# 0d87f3c7 26-Feb-2020 Warner Losh <imp@FreeBSD.org>

Remove support for all pre FreeBSD 11.0 versions from mpr and mps.

Remove a number of workarounds for older versions of FreeBSD. FreeBSD stable/10
was branched over 6 years ago. All of these changes date from about that time or
earlier. These workarounds are extensive and get in the way of understanding
the current flow in the driver.


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 69e85eb8 06-Feb-2020 Scott Long <scottl@FreeBSD.org>

Advertise the MPI Message Version that's contained in the IOCFacts message
in the sysctl block for the driver. mpsutil/mprutil needs this so it can
know how big of a buffer to allocate when requesting the IOCFacts from the
controller. This eliminates the kernel console messages about wrong
allocation sizes.

Reported by: imp


# f5ead205 24-Dec-2019 Scott Long <scottl@FreeBSD.org>

Convert the mpr driver to use busdma templates.


# 4b1ac5c2 11-Jul-2019 Warner Losh <imp@FreeBSD.org>

More fully implement the state machine.

When a command is finished running, we must transition it from INQUEUE
to busy state. We were failing to do that, so we hit a panic when the
commands were freed. This only affects mpr, mps already did simmilar
things. Now both the polling and interrupt paths properly set BUSY as
appropriate.


# 8fe7bf06 08-Jul-2019 Warner Losh <imp@FreeBSD.org>

Fix bugs in recovery path and improve cm tracking

Eliminate the TIMEDOUT state. This state really conveyed two different
concepts: I timed out during recovery (and my command got put on the
recovery queue), and I timed out diring discovery (which doesn't).
Separate those two concepts into two flags. Use the TIMEDOUT flag to
fail requests as timed out. Use the on queue flag to remove them from
the queue.

In mps_intr_locked for MPI2_RPY_DESCRIPT_FLAGS_ADDRESS_REPLY message
type, when completing commands, ignore the ones that are not in state
INQUEUE. They were already completed as part of the recovery
process. When we complete them twice, we wind up with entries on the
free queue that are marked as busy, trigging asserts.

Reviewed by: scottl (earlier version, just for mpr)
Differential Revision: https://reviews.freebsd.org/D20785


# 37b807ff 24-Mar-2019 Scott Long <scottl@FreeBSD.org>

r329522 created problemss with commands that enter the TIMEDOUT state but
are successfully returned by the card (usually due to an abort being issued
as part of timeout recovery). Remove what amounts to an insufficient
KASSERT, and don't overwrite the state value. State should probably be
re-designed, and that will be done with a future commit.

Reported by: phk, bei.io
Reviewed by: imp, mav
Differential Revision: D19677


# 46b23587 26-Dec-2018 Kashyap D Desai <kadesai@FreeBSD.org>

Update copyright information

Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc


# 34c5490d 26-Dec-2018 Kashyap D Desai <kadesai@FreeBSD.org>

Enable atomic type descriptor support only for Sea & Aero cards

Enable atomic type descriptor support only for Sea & Aero cards,
due to HW errata this atomic descriptor support has to be disabled
on Ventura cards.

Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc


# 86312e46 21-Dec-2018 Conrad Meyer <cem@FreeBSD.org>

mps(4), mpr(4): remove SATA ID command cancellation hack

Add a generic mechanism to override mp?_wait_command's timeout behavior,
which continues to invoke reinit by default. Invokers who set
cm_timeout_handler may avoid automatic reinit and do their own handling.

Adapt mp?sas_get_sata_identify to this mechanism and remove its callout
hack.

Reviewed by: scottl
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18614


# 617e85f3 08-Dec-2018 Scott Long <scottl@FreeBSD.org>

Copy and clear the reply descriptor atomically. This prevents concurrency
in the interrupt handlers (usually due to timeout/error recovery) from
seeing and processing the same descriptor twice.


# cf6ea6f2 11-Mar-2018 Scott Long <scottl@FreeBSD.org>

Implement a sysctl to dump in-flight I/O state for debugging. The tool to
parse it will be committed in a separate action.

Sponsored by: Netflix


# 731308d0 26-Feb-2018 Alexander Motin <mav@FreeBSD.org>

Allow physically non-contiguous chain frames allocation in mps(4)/mpr(4).

Chain frames required to satisfy all 2K of declared I/Os of 128KB each take
more then a megabyte of a physical memory, all of which existing code tries
allocate as physically contiguous. This patch removes that physical
contiguousness requirement, leaving only virtual contiguousness. I was
thinking about other ways of allocation, but the less granular allocation
becomes, the bigger is the overhead and/or complexity, reaching about 100%
overhead if allocate each frame separately.

The patch also bumps the chain frames hard limit from 2K to 16K. It is more
than enough for the case of default REQ_FRAMES and MAXPHYS (the drivers will
allocate less than that automatically), while in case of increased MAXPHYS
it will control maximal memory usage.

Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14420


# f0779b04 18-Feb-2018 Scott Long <scottl@FreeBSD.org>

Improve command lifecycle debugging and detection of problems.

Sponsored by: Netflix


# 92ddc7b8 13-Feb-2018 Li-Wen Hsu <lwhsu@FreeBSD.org>

Fix non-64-bit platform build by printing bus_addr_t values using %#jx

Reviewed by: slm
Differential Revision: https://reviews.freebsd.org/D14344


# 63b1a335 11-Feb-2018 Scott Long <scottl@FreeBSD.org>

Print out the shared memory queues during initialization

Sponsored by: Netflix


# 4f5d6573 09-Feb-2018 Alexander Motin <mav@FreeBSD.org>

Teach mps(4) and mpr(4) drivers to autotune chain frames.

This is a first part of the change. It makes the drivers to calculate
the required number of chain frames to satisfy worst case scenarios, but
it does not change existing overly strict limits on them. The next step
will be to rewrite the allocator to not require megabytes of physically
contiguous address space, that may be problematic if done after boot,
after doing which the limits can be removed. Until that this code can
just correct user set limits, if they are set too high.

Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14261


# 96410703 06-Feb-2018 Scott Long <scottl@FreeBSD.org>

Cache the value of the request and reply frame size since it's used quite
a bit in the normal operation of the driver. Covert it to represent bytes
instead of 32bit words. Fix what I believe to be is a bug in this respect
with the Tri-mode cards.

Sponsored by: Netflix


# 62a09ee9 06-Feb-2018 Alexander Motin <mav@FreeBSD.org>

Fix queue length reporting in mps(4) and mpr(4).

Both drivers were found to report CAM bigger queue depth then they really
can handle. It made them later under high load with many disks return
some of submitted requests back with CAM_REQUEUE_REQ status for later
resubmission.

Reviewed by: scottl
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14215


# e2997a03 06-Feb-2018 Kenneth D. Merry <ken@FreeBSD.org>

Diagnostic buffer fixes for the mps(4) and mpr(4) drivers.

In mp{r,s}_diag_register(), which is used to register diagnostic
buffers with the mp{r,s}(4) firmware, we allocate DMAable memory.

There were several issues here:
o No checking of the bus_dmamap_load() return value. If the load
failed or got deferred, mp{r,s}_diag_register() continued on as if
nothing had happened. We now check the return value and bail
out if it fails.

o No waiting for a deferred load callback. bus_dmamap_load()
calls a supplied callback when the mapping is done. This is
generally done immediately, but it can be deferred.
mp{r,s}_diag_register() did not check to see whether the callback
was already done before proceeding on. We now sleep until the
callback is done if it is deferred.

o No call to bus_dmamap_sync(... BUS_DMASYNC_PREREAD) after the
memory is allocated and loaded. This is necessary on some
platforms to synchronize host memory that is going to be updated
by a device.

Both drivers would also panic if the firmware was reinitialized while
a diagnostic buffer operation was in progress. This fixes that problem
as well. (The driver will reinitialize the firmware in various
circumstances, but the problem I ran into was that the firmware would
generate an IOC Fault due to a PCIe error.)

mp{r,s}var.h:
Add a new structure, struct mpr_busdma_context, that is
used for deferred busdma load callbacks.

Add a prototype for mp{r,s}_memaddr_wait_cb().
mp{r,s}.c:
Add a new busdma callback function, mp{r,s}_memaddr_wait_cb().
This provides synchronization for callers that want to
wait on a deferred bus_dmamap_load() callback.

mp{r,s}_user.c:
In bus_dmamap_register(), add a call to bus_dmamap_sync()
with the BUS_DMASYNC_PREREAD flag set after an allocation
is loaded.

Also, check the return value of bus_dmamap_load(). If it
fails, bail out. If it is EINPROGRESS, wait for the
callback to happen. We use an interruptible sleep (msleep
with PCATCH) and let the callback clean things up if we get
interrupted.

In mpr_diag_read_buffer() and mps_diag_read_buffer(), call
bus_dmamap_sync(..., BUS_DMASYNC_POSTREAD) before copying
the data out to make sure the data is in stable storage.

In mp{r,s}_post_fw_diag_buffer() and
mp{r,s}_release_fw_diag_buffer(), check the reply to see
whether it is NULL. It can be NULL (and the command non-NULL)
if the controller gets reinitialized while we're waiting for
the command to complete but the driver structures aren't
reallocated. The driver structures generally won't be
reallocated unless there is a firmware upgrade that changes
one of the IOCFacts.

When freeing diagnostic buffers in mp{r,s}_diag_register()
and mp{r,s}_diag_unregister(), zero/NULL out the buffer after
freeing it. This will prevent a duplicate free in some
situations.

Sponsored by: Spectra Logic
Reviewed by: mav, scottl
MFC after: 1 week
Differential Revision: D13453


# 77baa225 04-Feb-2018 Justin Hibbits <jhibbits@FreeBSD.org>

Minimal changes for MPR to build on architectures with physical addresses larger than virtual

Summary:
Some architectures use large (36-bit) physical addresses, with smaller
virtual addresses. Casting between vm_paddr_t (or bus_addr_t) and void * is
considered illegal, so cast through uintptr_t. No functional change on existing
platforms.

Reviewed By: scottl
Differential Revision: https://reviews.freebsd.org/D14042


# ac2fffa4 21-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by: wosch
PR: 225197


# 26c1d774 13-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

dev: make some use of mallocarray(9).

Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.


# 10695417 10-Nov-2017 Scott Long <scottl@FreeBSD.org>

Refactoring the interrupt setup code introduced a bug where the drivers
would attempt to re-allocate interrupts during a chip reset without
first de-allocating them. Doing that right is going to be tricky, so
just band-aid it for now so that a re-init doesn't guarantee a failure
due to resource re-use.

Reported by: gallatin
Sponsored by: Netflix


# cfd6fd5a 01-Oct-2017 Scott Long <scottl@FreeBSD.org>

Improve the debug parsing to allow flags to be added and subtracted
from the existing set.

Submitted by: rea@freebsd.org


# cb242d7c 28-Sep-2017 Scott Long <scottl@FreeBSD.org>

Convert sysctl sbuf usage to use a fully dynaic sbuf. This is strictly
needed, but it silences an erroneous Coverity warning and makes the code a
little more logically consistent. Also mark the sysctl as MPSAFE.

Sponsored by: Netflix


# 867aa8cd 24-Sep-2017 Scott Long <scottl@FreeBSD.org>

Add the ability to report and set debug flags as text strings instead of
just integer flags. Report both for convenience.

Submitted by: Eygene Ryabinkin (manpage)
Sponsored by: Netflix


# 3c5ac992 10-Sep-2017 Scott Long <scottl@FreeBSD.org>

Add infrastructure for allocating multiple MSI-X interrupts. Also
add more fine-tuned controls for allocating requests and replies.

Sponsored by: Netflix


# a4bb51a4 10-Sep-2017 Scott Long <scottl@FreeBSD.org>

Fix intrhook release in MPR and MPS for EARLY_AP_STARTUP.

Reported by: Limelight
Sponsored by: Netflix


# 1415db6c 09-Sep-2017 Scott Long <scottl@FreeBSD.org>

More code refactoring in preparation for enabling multiqueue.

Sponsored by: Netflix


# bec09074 09-Sep-2017 Scott Long <scottl@FreeBSD.org>

Start separating the LSI drivers into per-queue structures. No
functional change.

Sponsored by: Netflix


# 757ff642 27-Aug-2017 Scott Long <scottl@FreeBSD.org>

Start overhauling debug printing in the MPS and MPR drivers. The focus of this
commit it to make initiazation less chatty in the normal case, and more useful
and informative when real debugging is turned on.

Reviewed by: ken (earlier version)
Sponsored by: Netflix


# 6d4ffcb4 10-Aug-2017 Kenneth D. Merry <ken@FreeBSD.org>

Changes to make mps(4) and mpr(4) handle reinit with reallocation.

When the mps(4) and mpr(4) drivers need to reinitialize the
firmware, they sometimes need to reallocate all of the memory
allocated by the driver. The reallocation happens whenever the IOC
Facts change. That should only happen after a firmware upgrade.

If the reinitialization happens as a result of a timed out command
sent to the card, the command that timed out and triggered the
reinit may have been freed if iocfacts_allocate() reallocated all
memory. If the caller attempts to access the command after that,
the kernel will panic because the caller will be dereferencing
freed memory.

The solution is to set a flag in the softc when we reallocate,
and avoid dereferencing the command strucure if we've reallocated.

The changes are largely the same in both drivers, since mpr(4) is a
derivative of mps(4).

o In iocfacts_allocate(), if the IOC Facts have changed and we
need to reallocate, set the REALLOCATED flag in the softc.

o Change wait_command() to take a struct mps_command ** instead of
a struct mps_command *. This allows us to NULL out the caller's
command pointer if we have to reinit the controller and the data
structures get reallocated. (The REALLOCATED flag will be set
in the softc if that has happened.)

o In every place that calls wait_command(), make sure we handle
the case where the command is NULL after the call.

o The mpr(4) driver has mpr_request_polled() which can also
reinitialize the card. Also check for reallocation there.

Reviewed by: scottl, slm
MFC after: 1 week
Sponsored by: Spectra Logic


# 055e2653 30-Jul-2017 Scott Long <scottl@FreeBSD.org>

Change from using underbar function names to normal function names for
the informational print functions. Collapse the debug API a bit to be
more generic and not require as much code duplication. While here, fix
a bug in MPS that was already fixed in MPR.


# 252b2b4f 30-Jul-2017 Scott Long <scottl@FreeBSD.org>

Split the interrupt setup code into two parts: allocation and configuration.
Do the allocation before requesting the IOCFacts message. This triggers
the LSI firmware to recognize the multiqueue should be enabled if available.
Multiqueue isn't used by the driver yet, but this also fixes a problem with
the cached IOCFacts not matching latter checks, leading to potential problems
with error recovery.

As a side-effect, fetch the driver tunables as early as possible.

Reviewed by: slm
Obtained from: Netflix
Differential Revision: D9243


# 417aa6b8 19-Jul-2017 Kenneth D. Merry <ken@FreeBSD.org>

Fix spurious timeouts on commands sent to mps(4) and mpr(4) controllers.

mps_wait_command() and mpr_wait_command() were using getmicrotime() to
determine elapsed time when checking for a timeout in polled mode.
getmicrotime() isn't guaranteed to monotonically increase, and that
caused spurious timeouts occasionally.

Switch to using getmicrouptime(), which does increase monotonically.
This fixes the spurious timeouts in my test case.

Reviewed by: slm, scottl
MFC after: 3 days
Sponsored by: Spectra Logic


# 327f2e6c 25-May-2017 Stephen McConnell <slm@FreeBSD.org>

Fix several problems with mapping code.

Reviewed by: ken, scottl, asomers, ambrisko, mav
Approved by: ken, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10861


# 67feec50 17-May-2017 Stephen McConnell <slm@FreeBSD.org>

Add tri-mode support (SAS/SATA/PCIe).

This includes NVMe device support and adds support for the following adapters:
SAS 3408
SAS 3416
SAS 3508
SAS 3516
SAS 3616
SAS 3708
SAS 3716

Reviewed by: ken, scottl, asomers, mav
Approved by: ken, scottl, mav
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D10095


# 4ab1cdc5 02-Nov-2016 Scott Long <scottl@FreeBSD.org>

Add a fallback to the device mapper logic. We've seen systems in the field
that are apparently misconfigured by the manufacturer and cause the mapping
logic to fail. The fallback allows drive numbers to be assigned based on the
PHY number that they're attached to. Add sysctls and tunables to overrid
this new behavior, but they should be considered only necessary for debugging.

Reviewed by: imp, smh
Obtained from: Netflix
MFC after: 3 days
Sponsored by: D8403


# 32b0a21e 12-Jul-2016 Stephen McConnell <slm@FreeBSD.org>

Use real values to calculate Max I/O size instead of guessing.

Reviewed by: ken, scottl
Approved by: ken, scottl, ambrisko (mentors)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7043


# b41c6ff9 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Change logging level for a debug string to use MPR_LOG instead of MPR_INFO.

Approved by: ken, scottl, ambrisko
MFC after: 1 week


# d3f6eabf 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

No log bit in IOCStatus and endian-safe changes.

Use MPI2_IOCSTATUS_MASK when checking IOCStatus to mask off the log bit, and
make a few more things endian-safe.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6097


# 2bbc5fcb 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Add support for the Broadcom (Avago/LSI) 9305 16 and 24 port HBA's.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D6098


# 7a2a6a1a 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Several style changes and add copyrights for 2016.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6103


# 453130d9 02-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: minor spelling fixes.

Most affect comments, very few have user-visible effects.


# d9c9c81c 21-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: use our roundup2/rounddown2() macros when param.h is available.

rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.


# d0be3479 16-Oct-2015 Scott Long <scottl@FreeBSD.org>

Remove _FreeBSD_version check for something that was only an issue with
9-CURRENT.

Obtained from: Netlfix, Inc
MFC after: 3 days


# a2c14879 28-May-2015 Stephen McConnell <slm@FreeBSD.org>

The wrong commit message was given with r283632. This is the correct message.

- Updated all files with 2015 Avago copyright, and updated LSI's copyright
dates.

- Changed all of the PCI device strings from LSI to Avago Technologies (LSI).

- Added a sysctl variable to control how StartStopUnit behavior works. User can
select to spin down disks based on if disk is SSD or HDD.

- Inquiry data is required to tell if a disk will support SSU at shutdown or
not. Due to the addition of mpssas_async, which gets Advanced Info but not
Inquiry data, the setting of supports_SSU was moved to the
mpssas_scsiio_complete function, which snoops for any Inquiry commands. And,
since disks are shutdown as a target and not a LUN, this process was
simplified by basing it on targets and not LUNs.

- Added a sysctl variable that sets the amount of time to retry after sending a
failed SATA ID command. This helps with some bad disks and large disks that
require a lot of time to spin up. Part of this change was to add a callout to
handle timeouts with the SATA ID command. The callout function is called
mpssas_ata_id_timeout(). (Fixes PR 191348)

- Changed the way resets work by allowing I/O to continue to devices that are
not currently under a reset condition. This uses devq's instead of simq's and
makes use of the MPSSAS_TARGET_INRESET flag. This change also adds a function
called mpssas_prepare_tm().

- Some changes were made to reduce code duplication when getting a SAS address
for a SATA disk.

- Fixed some formatting and whitespace.

- Bump version of mps driver to 9.255.01.00-fbsd

PR: 191348
Reviewed by: ken, scottl
Approved by: ken, scottl
MFC after: 1 week


# c91d307f 28-May-2015 Stephen McConnell <slm@FreeBSD.org>

The wrong commit message was given with r283632. To get the correct commit
message synced to the changes in r283632, those changes are now backed out.
Another commit will be done that is exactly the same as r283632 except it will
have to correct commit message.

Approved by: ken, scottl, asomers, gibbs


# 2ca937d2 27-May-2015 Stephen McConnell <slm@FreeBSD.org>

This setting of stop_at_shutdown should have been removed with r279253

Approved by: ken
MFC after: 1 week


# f0188618 21-Oct-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix multiple incorrect SYSCTL arguments in the kernel:

- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after: 3 days
Sponsored by: Mellanox Technologies


# 09e25c0e 06-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Remove some debugging code.

Submitted by: Steve McConnell <stephen.mcconnell@avagotech.com>
MFC after: 3 days


# 991554f2 02-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers.

This is derived from the mps(4) driver, but it supports only the 12Gb
IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.

Some notes about this driver:
o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
this driver.

o WarpDrive functionality has been removed, since it isn't supported in
the 12Gb driver interface.

o The Scatter/Gather list handling code is significantly different between
the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather
lists.

Thanks to LSI for developing and testing this driver for FreeBSD.

share/man/man4/mpr.4:
mpr(4) man page.

sys/dev/mpr/*:
mpr(4) driver files.

sys/modules/Makefile,
sys/modules/mpr/Makefile:
Add a module Makefile for the mpr(4) driver.

sys/conf/files:
Add the mpr(4) driver.

sys/amd64/conf/GENERIC,
sys/i386/conf/GENERIC,
sys/mips/conf/OCTEON1,
sys/sparc64/conf/GENERIC:
Add the mpr(4) driver to all config files that currently
have the mps(4) driver.

sys/ia64/conf/GENERIC:
Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
config file.

sys/i386/conf/XEN:
Exclude the mpr module from building here.

Submitted by: Steve McConnell <Stephen.McConnell@lsi.com>
MFC after: 3 days
Tested by: Chris Reeves <chrisr@spectralogic.com>
Sponsored by: LSI, Spectra Logic
Relnotes: LSI 12Gb SAS driver mpr(4) added