History log of /freebsd-current/sys/dev/mpr/mpr_sas.c
Revision Date Author Comments
# 8c4ee0b2 23-Nov-2023 Alexander Motin <mav@FreeBSD.org>

Use xpt_path_sbuf() in few drivers

xpt_path_string() is now a wrapper around xpt_path_sbuf(). Using it
to than concatenate result to another sbuf makes no sense. Just call
xpt_path_sbuf() directly.

MFC after: 1 month


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 59dc489a 20-Jul-2023 Warner Losh <imp@FreeBSD.org>

mpr: Fix minor 'typos' comment

moving -> removing (we're removing the device)
CAM_REQ_CMO_ERROR -> CAM_REQ_ERR (the former isn't a thing)

Sponsored by: Netflix


# ca420b4e 28-Apr-2022 Warner Losh <imp@FreeBSD.org>

mpr/mps: when sending reset on removal, include target in message

It's possible for muliple drives to be departing at the same time (if
the common power rail the share goes dark, for example). To understand
what's going on better, include target and handle in the messages
announcing the reset to allow matching with other corresponding events.

MFC After: 3 days
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D35092


# e35816c1 25-Jan-2022 Warner Losh <imp@FreeBSD.org>

mpr/mps: Fix a race in diagnostic reset

There's a small race in freezing the simq when performing a diagnostic
reset. During this time, a transaction can slip through and encounter
the target id of 0. If we're still in diagnostic reset when we detect
this, return a CAM_DEVICE_NOT_THERE status. Instead, freeze the queue
and return a requeue status, similar to what we do when we're resetting
a target and a transaction get here. The race is unavoidable due to
separate locks for queue and SIM, but easy enough to detect and make
harmless.

Sponsored by: Netflix
Reviewed by: scottl, mav
Differential Revision: https://reviews.freebsd.org/D34017


# 802f8d4a 24-Jan-2022 Warner Losh <imp@FreeBSD.org>

mpr/mps: Remove write-only flag and callout

The discovery callout is initialized and cancelled only, making it
write-only. Remove a state flag associated with it being pending as well
as two defines that aren't used that are associated with it. Remove
MP?SAS_SHUTDOWN flag, which is unused.

Sponsored by: Netflix
Reviewed by: ken, scottl, mav
Differential Revision: https://reviews.freebsd.org/D33925


# 61f17c5f 24-Nov-2021 Scott Long <scottl@FreeBSD.org>

Fix "set but not used" warnings in the mpr driver. This fixes a minor
bug in error handling.


# a8837c77 21-Nov-2021 Warner Losh <imp@FreeBSD.org>

mpr: fix freeze / release mismatch in timeout code

So, if we're processing a timeout, and we've sent an ABORT to the
firmware for that timeout, but not yet received the response from the
firmware, AND we get another timeout, we queue the timeout and freeze
the queue. However, when we've finally processed them all, we only
release the queue once. This causes all I/O to halt as the devq remains
frozen forever.

Instead, only freeze the queue when we start the process (eg set INRESET
on the target). This will allow the release when all the timed out I/Os
have finished ABORTing.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D33054


# 2bbaed4d 15-Nov-2021 Warner Losh <imp@FreeBSD.org>

mpr: Minor formatting changes to match mps.

Minor reformatting nits to make mprsas_scsiio_timeout match
mpssas_scsiio_timeout more closely. The differences aren't necessary and
are distracting when comparing the routines. No functional changes.

Sponsored by: Netflix


# 02d81940 14-Sep-2021 Alexander Motin <mav@FreeBSD.org>

mps/mpr(4): Move xpt_register_async() out of lock.

It fixes lock ordere reversal between SIM and device locks. Also
remove registration for AC_FOUND_DEVICE, unused for a while now.

MFC after: 1 month


# 9781c28c 20-Aug-2021 Alexander Motin <mav@FreeBSD.org>

mpr(4): Fix unmatched devq release.

Before this change devq was frozen only if some command was sent to
the target after reset started, but release was called always. This
change freezes the devq immediately, leaving mprsas_action_scsiio()
check only to cover race condition due to different lock devq use.

This should also avoid unnecessary requeue of the commands, creating
additional log noise and confusing some broken apps.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# e3c5965c 20-Aug-2021 Alexander Motin <mav@FreeBSD.org>

mpr(4): Handle mprsas_alloc_tm() errors on device removal.

SAS9305-16e with firmware 16.00.01.00 report HighPriorityCredit of
only 8, while for comparison some other combinations I have report
100 or even 128. In case of large JBOD detach requirement to send
target reset command to each target same time overflows the limit,
and without adequate handling makes devices stuck in half-detached
state, preventing later re-attach.

To handle that in case of allocation error mark the target with new
MPRSAS_TARGET_TOREMOVE flag, and retry the removal attempt next time
something else free high priority command. With this patch I can
successfully detach/attach 102 disk JBOD from/to the SAS9305-16e.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 175ad3d0 03-Jun-2021 Kenneth D. Merry <ken@FreeBSD.org>

Fix mpr(4) and mps(4) state transitions and a use-after-free panic.

When the mpr(4) and mps(4) drivers probe a SATA device, they issue an
ATA Identify command (via mp{s,r}sas_get_sata_identify()) before the
target is fully setup in the driver. The drivers wait for completion of
the identify command, and have a 5 second timeout. If the timeout
fires, the command is marked with the SATA_ID_TIMEOUT flag so it can be
freed later.

That is where the use-after-free problem comes in. Once the ATA
Identify times out, the driver sends a target reset, and then frees any
identify commands that have timed out. But, once the target reset
completes, commands that were queued to the drive are returned to the
driver by the controller.

At that point, the driver (in mp{s,r}_intr_locked()) looks up the
command descriptor for that particular SMID, marks it CM_STATE_BUSY and
sends it on for completion handling.

The problem at this stage is that the command has already been freed,
and put on the free queue, so its state is CM_STATE_FREE. If INVARIANTS
are turned on, we get a panic as soon as this command is allocated,
because its state is no longer CM_STATE_FREE, but rather CM_STATE_BUSY.

So, the solution is to not free ATA Identify commands that get stuck
until they actually return from the controller. Hopefully this works
correctly on older firmware versions. If not, it could result in
commands hanging around indefinitely. But, the alternative is a
use-after-free panic or assertion (in the INVARIANTS case).

This also tightens up the state transitions between CM_STATE_FREE,
CM_STATE_BUSY and CM_STATE_INQUEUE, so that the state transitions happen
once, and we have assertions to make sure that commands are in the
correct state before transitioning to the next state. Also, for each
state assertion, we print out the current state of the command if it is
incorrect.

mp{s,r}.c: Add a new sysctl variable, dump_reqs_alltypes,
that controls the behavior of the dump_reqs sysctl.
If dump_reqs_alltypes is non-zero, it will dump
all commands, not just the commands that are in the
CM_STATE_INQUEUE state. (You can see the commands
that are in the queue by using mp{s,r}util debug
dumpreqs.)

Make sure that the INQUEUE -> BUSY state transition
happens in one place, the mp{s,r}_complete_command
routine.

mp{s,r}_sas.c: Make sure we print the current command type in
command state assertions.

mp{s,r}_sas_lsi.c:
Add a new completion handler,
mp{s,r}sas_ata_id_complete. This completion
handler will free data allocated for an ATA
Identify command and free the command structure.

In mp{s,r}_ata_id_timeout, do not set the command
state to CM_STATE_BUSY. The command is still in
queue in the controller. Since we were blocking
waiting for this command to complete, there was
no completion handler previously. Set the
completion handler, so that whenever the command
does come back, it will get freed properly.

Do not free ATA Identify commands that have timed
out in mp{s,r}sas_add_device(). Wait for them
to actually come back from the controller.

mp{s,r}var.h: Add a dump_reqs_alltypes variable for the new
dump_reqs_alltypes sysctl.

Make sure we print the current state for state
transition asserts.

This was tested in the Spectra Logic test bed (as described in the
review), as well Netflix's Open Connect fleet (where panics dropped from
a dozen or two a month to zero).

Reviewed by: imp@ (who is handling the commit with ken's OK)
Sponsored by: Spectra Logic
Differential Revision: https://reviews.freebsd.org/D25476


# 7608b98c 21-May-2021 Edward Tomasz Napierala <trasz@FreeBSD.org>

mpr, mps: clear CCBs allocated on the stack

Reviewed By: imp
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D30301


# 71900a79 02-Mar-2021 Alfredo Dal'Ava Junior <alfredo@FreeBSD.org>

mpr: big-endian support

This fixes mpr driver on big-endian devices.
Tested on powerpc64 and powerpc64le targets using a SAS9300-8i card
(LSISAS3008 pci vendor=0x1000 device=0x0097)

Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br>
Reviewed by: luporl, alfredo, Sreekanth Reddy <sreekanth.reddy@broadcom.com> (by email)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25785


# 88364968 25-Oct-2020 Alexander Motin <mav@FreeBSD.org>

Introduce support of SCSI Command Priority.

SAM-3 specification introduced concept of Task Priority, that was renamed
to Command Priority in SAM-4, and supported by all modern SCSI transports.
It provides 15 levels of relative priorities: 1 - highest, 15 - lowest and
0 - default. SAT specification for SATA devices translates priorities 1-3
into NCQ high priority.

This change adds new "priority" field into empty spots of struct ccb_scsiio
and struct ccb_accept_tio of CAM and struct ctl_scsiio of CTL. Respective
support is added into iscsi(4), isp(4), mpr(4), mps(4) and ocs_fc(4) drivers
for both initiator and where applicable target roles. Minimal support was
added to CTL to receive the priority value from different frontends, pass it
between HA controllers and report in few places.

This patch does not add consumers of this functionality, so nothing should
really change yet, since the field is still set to 0 (default) on initiator
and not actively used on target. Those are to be implemented separately.

I've confirmed priority working on WD Red SATA disks connected via mpr(4)
and properly transferred to CTL target via iscsi(4), isp(4) and ocs_fc(4).

While there, added missing tag_action support to ocs_fc(4) initiator role.

MFC after: 1 month
Relnotes: yes
Sponsored by: iXsystems, Inc.


# 577858c8 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

mpr: clean up empty lines in .c and .h files


# f0f20143 04-Aug-2020 Alexander Motin <mav@FreeBSD.org>

Remove extra memset() left after r342388.

This memset() wiped MPI2_FUNCTION_SCSI_TASK_MGMT set by mprsas_alloc_tm(),
that broke target reset on device removal, making later re-insertion into
the same slot impossible, since firmware was still waiting for the driver
to finish with the removed device.

MFC after: 1 week
Sponsored by: iXsystems, Inc.


# d2a5f081 27-Jul-2020 Mark Johnston <markj@FreeBSD.org>

mpr(4), mps(4): Stop checking for failures from malloc(M_WAITOK).

PR: 240545
Submitted by: Andrew Reiter <arr@watson.org>
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25766


# a2386b6f 13-Mar-2020 Alexander Motin <mav@FreeBSD.org>

Increase buffer in mprsas_log_command() from 192 to 224 bytes.

192 bytes are not enough to print long commands, such as ATA COMMAND PASS
THROUGH(16), that makes debug output difficult to read.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 0d87f3c7 26-Feb-2020 Warner Losh <imp@FreeBSD.org>

Remove support for all pre FreeBSD 11.0 versions from mpr and mps.

Remove a number of workarounds for older versions of FreeBSD. FreeBSD stable/10
was branched over 6 years ago. All of these changes date from about that time or
earlier. These workarounds are extensive and get in the way of understanding
the current flow in the driver.


# 4c1cdd4a 24-Feb-2020 Warner Losh <imp@FreeBSD.org>

Before issing the REMOVE_DEVICE command to the firmware, make sure that all
commands have completed.

It's not OK to force complete any pending commands before we send the
REMOVE_DEVICE. Instead, make sure that all pending commands are complete before
sending that. By trying to second guess the firmware here, we run the risk of
completing commands twice, which leads to corruption.

This removes the forced completion of commands introduced in r218811. So it's a
partial backout of that commit, but replaces it with a more rebust
mechanism. Either these commands will complete due to the TARGET RESET, or they
will timeout and be aborted, but they will all complete.

Add assert that all commands are complete to REMOVE_DEVICE completion
routine. We attempt to assure this programatically, so we shouldn't have any
commands in the queue because we've waited for them all. Any commands that make
it into our action routine after we mark the target in removal will complete
immediately with an error.

When we're removing a target that's not a volume, advertise up the stack that
it's actually gone, as opposed to having a transient selection error we should
retry. Do this both in the action routine, and when we get a notification of an
aborted command. We don't do this for volumes because the driver tries hard not
to advertise to the OS a volume has disappeared.

Apply these changes to both mpr and mps since they are based on quite similar
designs.

Discussed with: scottl@
Differential Revision: https://reviews.freebsd.org/D23768


# 57aa9163 24-Nov-2019 Warner Losh <imp@FreeBSD.org>

Fix leak in state machine for commands.

When we get a device departed message from the firmware, we send a TARGET_REST
to the device to let the firmware know we're done and as part of the recovery
process. This will abort all the commands. While the documentation says the IOC
is responsible for writing the completion message for all the commands pending
with an aborted status, we sometimes have queued commands for the target that
haven't been completed so are in the INQUEUE state. So, when we later complete
the pending CCB as aborted, these commands are freed and we hit the "state not
busy" panic.

Elsewhere where we dequeue commands, we move the state to BUSY from INQUEUE. Do
that here as well. In talking to Ken, Scott and Justin, they recommended a
series of tests to see if this is 100% safe. Those tests are ongoing, but
preliminary tests suggest this is safe as we see no duplicate completions when
we hit this case at work. We have a machine that has a dodgy powersupply which
usually doesn't apply power to a few drives, but sometimes does when the machine
is under heavy load so we get a rash of the connect / disconnect messages over
half an hour. Without this change, we'd see state not busy panic. With this
change, the drives just annoyingly come and go without affecting the rest of the
machine, but without a complete error injection test suite, it's hard to know if
all edge cases are now covered or not.

Discussed with: scottl, ken, gibbs


# 8fe7bf06 08-Jul-2019 Warner Losh <imp@FreeBSD.org>

Fix bugs in recovery path and improve cm tracking

Eliminate the TIMEDOUT state. This state really conveyed two different
concepts: I timed out during recovery (and my command got put on the
recovery queue), and I timed out diring discovery (which doesn't).
Separate those two concepts into two flags. Use the TIMEDOUT flag to
fail requests as timed out. Use the on queue flag to remove them from
the queue.

In mps_intr_locked for MPI2_RPY_DESCRIPT_FLAGS_ADDRESS_REPLY message
type, when completing commands, ignore the ones that are not in state
INQUEUE. They were already completed as part of the recovery
process. When we complete them twice, we wind up with entries on the
free queue that are marked as busy, trigging asserts.

Reviewed by: scottl (earlier version, just for mpr)
Differential Revision: https://reviews.freebsd.org/D20785


# 46b23587 26-Dec-2018 Kashyap D Desai <kadesai@FreeBSD.org>

Update copyright information

Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc


# 89d1c21f 26-Dec-2018 Kashyap D Desai <kadesai@FreeBSD.org>

Added support for NVMe Task Management

Following list of changes done in the driver as a part of TM handling on the NVMe drives.
Below changes are only applicable on NVMe drives and only when custom NVMe TM handling bit is set to zero by IOC.

1. Issue LUN reset & Target reset TMs with Target reset method field set to Protocol Level reset (0x3),
2. For LUN & target reset TMs use the timeout value as ControllerResetTO value provided by firmware using PCie Device Page 0,
3. If LUN reset fails to terminates the IO then directly escalate to host reset instead of going for target reset TM,
4. For Abort TM use the timeout value as NVMeAbortTO value given by the IOC using Manufacturing Page 11,
5. Log message "PCie Host Reset failed" message up on receiving P

Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc


# 46b9415f 23-Dec-2018 Scott Long <scottl@FreeBSD.org>

Further refactoring for task management commands. Also fix a related
typo from the previous commit.


# 3921a9f7 23-Dec-2018 Scott Long <scottl@FreeBSD.org>

Commands for user-initated device resets should come from the high-priority
allocator. Prior to this change, they would leak from the normal allocator.


# b7f1ee79 23-Dec-2018 Scott Long <scottl@FreeBSD.org>

First step in refactoring and fixing the error recovery and task management
code in the mpr and mps drivers. Eliminate duplicated code and fix some
comments.


# 8277ce2b 21-Dec-2018 Conrad Meyer <cem@FreeBSD.org>

mps(4), mpr(4): Fix lifetime of command buffer for mp?sas_get_sata_identify

In the event that the ID command timed out, mps(4)/mpr(4) did not free the
command until it could be cancelled. However, it freed the associated
buffer (cm_data). Fix the lifetime issue by freeing the associated buffer
only after Abort Task or controller reset.

Reviewed by: scottl
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18612


# 9544e6dc 21-Aug-2018 Chuck Tuffli <chuck@FreeBSD.org>

Make NVMe compatible with the original API

The original NVMe API used bit-fields to represent fields in data
structures defined by the specification (e.g. the op-code in the command
data structure). The implementation targeted x86_64 processors and
defined the bit fields for little endian dwords (i.e. 32 bits).

This approach does not work as-is for big endian architectures and was
changed to use a combination of bit shifts and masks to support PowerPC.
Unfortunately, this changed the NVMe API and forces #ifdef's based on
the OS revision level in user space code.

This change reverts to something that looks like the original API, but
it uses bytes instead of bit-fields inside the packed command structure.
As a bonus, this works as-is for both big and little endian CPU
architectures.

Bump __FreeBSD_version to 1200081 due to API change

Reviewed by: imp, kbowling, smh, mav
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D16404


# 8881681b 23-Mar-2018 Kenneth D. Merry <ken@FreeBSD.org>

Disable T10 Protection Information / EEDP handling for type 2 protection.

The mps(4) and mpr(4) drivers and hardware handle T10 Protection
Information, which is a system of checksums and guard blocks to protect
data while it is being transferred and while it is on disk. It is also
known as T10 DIF. For more details, see section 4.22 of the SBC-4 spec.

Supporting Type 2 protection requires using 32 byte CDBs, and filling in
the fields in those CDBs. We don't yet support that in the da(4) driver.

Type 1 and Type 3 protection don't require that, and can be handled by
the mps(4)/mpr(4) driver's code and firmware without any additional
input from the da(4) driver.

If a drive has Type 2 protection enabled (you frequently see this with
SAS drives shipped from Dell), don't set the various EEDP fields in the
mps(4)/mpr(4) driver command fields. Otherwise, you wind up with errors
like this that would otherwise make no sense:

(da9:mpr0:0:18:0): READ(10). CDB: 28 00 00 00 00 00 00 02 00 00
(da9:mpr0:0:18:0): CAM status: SCSI Status Error
(da9:mpr0:0:18:0): SCSI status: Check Condition
(da9:mpr0:0:18:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
(da9:mpr0:0:18:0):
(da9:mpr0:0:18:0): Field Replaceable Unit: 0
(da9:mpr0:0:18:0): Command Specific Info: 0
(da9:mpr0:0:18:0):
(da9:mpr0:0:18:0): Descriptor 0x80: f8 21
(da9:mpr0:0:18:0): Descriptor 0x81: 00 00 00 00 00 00
(da9:mpr0:0:18:0): Error 22, Unretryable error

In other words, what kind of strange SAS hard drive doesn't support a
standard 10 byte SCSI READ command? In this case, one that has Type 2
protection enabled.

We can revisit this when we put Type 2 protection support in the da(4)
driver, but for now this will help people who put Type 2 formatted drives
in a system and wonder what in the world is going on.

MFC after: 3 days
Sponsored by: Spectra Logic


# 5f5baf0e 19-Mar-2018 Alexander Motin <mav@FreeBSD.org>

Update mpr(4) driver from v15 to v18 from Broadcom site.

Version 16 is just a number bump, since we already had those changes.

Version 17 introduces new AdapterType value, that allows new user-space
tools from Broadcom to differentiate adapter generations 3 and 3.5.

Version 18 updates headers and adds SAS_DEVICE_DISCOVERY_ERROR reporting.

MFC after: 2 weeks


# 0d787e9b 22-Feb-2018 Wojciech Macek <wma@FreeBSD.org>

NVMe: Add big-endian support

Remove bitfields from defined structures as they are not portable.
Instead use shift and mask macros in the driver and nvmecontrol application.

NVMe is now working on powerpc64 host.

Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Reviewed by: imp, wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D13916


# f0779b04 18-Feb-2018 Scott Long <scottl@FreeBSD.org>

Improve command lifecycle debugging and detection of problems.

Sponsored by: Netflix


# 4f5d6573 09-Feb-2018 Alexander Motin <mav@FreeBSD.org>

Teach mps(4) and mpr(4) drivers to autotune chain frames.

This is a first part of the change. It makes the drivers to calculate
the required number of chain frames to satisfy worst case scenarios, but
it does not change existing overly strict limits on them. The next step
will be to rewrite the allocator to not require megabytes of physically
contiguous address space, that may be problematic if done after boot,
after doing which the limits can be removed. Until that this code can
just correct user set limits, if they are set too high.

Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14261


# 62a09ee9 06-Feb-2018 Alexander Motin <mav@FreeBSD.org>

Fix queue length reporting in mps(4) and mpr(4).

Both drivers were found to report CAM bigger queue depth then they really
can handle. It made them later under high load with many disks return
some of submitted requests back with CAM_REQUEUE_REQ status for later
resubmission.

Reviewed by: scottl
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14215


# 3c5ac992 10-Sep-2017 Scott Long <scottl@FreeBSD.org>

Add infrastructure for allocating multiple MSI-X interrupts. Also
add more fine-tuned controls for allocating requests and replies.

Sponsored by: Netflix


# a4bb51a4 10-Sep-2017 Scott Long <scottl@FreeBSD.org>

Fix intrhook release in MPR and MPS for EARLY_AP_STARTUP.

Reported by: Limelight
Sponsored by: Netflix


# 2bf620cb 09-Sep-2017 Scott Long <scottl@FreeBSD.org>

Convert some in-line printing of diagnostic into tables.

Sponsored by: Netflix


# a7d065b3 09-Sep-2017 Scott Long <scottl@FreeBSD.org>

Remove the unnecessary use of a temporary string buffer.

Sponsored by: Netflix


# 6eea4f46 06-Sep-2017 Scott Long <scottl@FreeBSD.org>

Checkpoint the next phase in debug message cleanup, this time focusing on
error recovery messages.

Sponsored by: Netflix


# 757ff642 27-Aug-2017 Scott Long <scottl@FreeBSD.org>

Start overhauling debug printing in the MPS and MPR drivers. The focus of this
commit it to make initiazation less chatty in the normal case, and more useful
and informative when real debugging is turned on.

Reviewed by: ken (earlier version)
Sponsored by: Netflix


# 6d4ffcb4 10-Aug-2017 Kenneth D. Merry <ken@FreeBSD.org>

Changes to make mps(4) and mpr(4) handle reinit with reallocation.

When the mps(4) and mpr(4) drivers need to reinitialize the
firmware, they sometimes need to reallocate all of the memory
allocated by the driver. The reallocation happens whenever the IOC
Facts change. That should only happen after a firmware upgrade.

If the reinitialization happens as a result of a timed out command
sent to the card, the command that timed out and triggered the
reinit may have been freed if iocfacts_allocate() reallocated all
memory. If the caller attempts to access the command after that,
the kernel will panic because the caller will be dereferencing
freed memory.

The solution is to set a flag in the softc when we reallocate,
and avoid dereferencing the command strucure if we've reallocated.

The changes are largely the same in both drivers, since mpr(4) is a
derivative of mps(4).

o In iocfacts_allocate(), if the IOC Facts have changed and we
need to reallocate, set the REALLOCATED flag in the softc.

o Change wait_command() to take a struct mps_command ** instead of
a struct mps_command *. This allows us to NULL out the caller's
command pointer if we have to reinit the controller and the data
structures get reallocated. (The REALLOCATED flag will be set
in the softc if that has happened.)

o In every place that calls wait_command(), make sure we handle
the case where the command is NULL after the call.

o The mpr(4) driver has mpr_request_polled() which can also
reinitialize the card. Also check for reallocation there.

Reviewed by: scottl, slm
MFC after: 1 week
Sponsored by: Spectra Logic


# 6c85e33e 25-Jul-2017 Scott Long <scottl@FreeBSD.org>

Quiet a message that sounds far more dire than it really is.


# 327f2e6c 25-May-2017 Stephen McConnell <slm@FreeBSD.org>

Fix several problems with mapping code.

Reviewed by: ken, scottl, asomers, ambrisko, mav
Approved by: ken, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10861


# 18982e8f 22-May-2017 Stephen McConnell <slm@FreeBSD.org>

Fix powerpc compiler error.

Approved by: ken


# 67feec50 17-May-2017 Stephen McConnell <slm@FreeBSD.org>

Add tri-mode support (SAS/SATA/PCIe).

This includes NVMe device support and adds support for the following adapters:
SAS 3408
SAS 3416
SAS 3508
SAS 3516
SAS 3616
SAS 3708
SAS 3716

Reviewed by: ken, scottl, asomers, mav
Approved by: ken, scottl, mav
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D10095


# 855fe445 11-May-2017 Scott Long <scottl@FreeBSD.org>

Improve error messages during command timeout for the mpr and mps
drivers.

Sponsored by: Netflix


# c11c484f 19-Jan-2017 Scott Long <scottl@FreeBSD.org>

Rework the debug print API. Event printing no longer gets special handling.
All of the printing from the tables file now has wrappers so that the
handling is cleaner and it's possible to print something out (say, during
development) without having to fight the global debug flags. This re-org
will also make it easier to have the tables be compiled out at build time
if desired.

Other than fixing some minor bugs, there are no user-visible changes from
this change

Sponsored by: Netflix, Inc.
Differential Revision: D9238


# 4195c7de 04-Jan-2017 Alan Somers <asomers@FreeBSD.org>

Always null-terminate ccb_pathinq.(sim_vid|hba_vid|dev_name)

The sim_vid, hba_vid, and dev_name fields of struct ccb_pathinq are
fixed-length strings. AFAICT the only place they're read is in
sbin/camcontrol/camcontrol.c, which assumes they'll be null-terminated.
However, the kernel doesn't null-terminate them. A bunch of copy-pasted code
uses strncpy to write them, and doesn't guarantee null-termination. For at
least 4 drivers (mpr, mps, ciss, and hyperv), the hba_vid field actually
overflows. You can see the result by doing "camcontrol negotiate da0 -v".

This change null-terminates those fields everywhere they're set in the
kernel. It also shortens a few strings to ensure they'll fit within the
16-character field.

PR: 215474
Reported by: Coverity
CID: 1009997 1010000 1010001 1010002 1010003 1010004 1010005
CID: 1331519 1010006 1215097 1010007 1288967 1010008 1306000
CID: 1211924 1010009 1010010 1010011 1010012 1010013 1010014
CID: 1147190 1010017 1010016 1010018 1216435 1010020 1010021
CID: 1010022 1009666 1018185 1010023 1010025 1010026 1010027
CID: 1010028 1010029 1010030 1010031 1010033 1018186 1018187
CID: 1010035 1010036 1010042 1010041 1010040 1010039
Reviewed by: imp, sephe, slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D9037
Differential Revision: https://reviews.freebsd.org/D9038


# fa699bb2 03-Jan-2017 Alan Somers <asomers@FreeBSD.org>

misc minor fixes in mpr(4)

sys/dev/mpr/mpr_sas.c
* Fix a potential null pointer dereference (CID 1305731)
* Check for overrun of the ccb_scsiio.cdb_io.cdb_bytes buffer (CID
1211934)

sys/dev/mpr/mpr_sas_lsi.c
* Nullify a dangling pointer in mprsas_get_sata_identify
* Fix a memory leak in mprsas_SSU_to_SATA_devices (CID 1211935)

Reported by: Coverity (partially)
CID: 1305731 1211934 1211935
Reviewed by: slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D8880


# 694cb8b8 04-Nov-2016 Scott Long <scottl@FreeBSD.org>

Record the LogInfo field when reporting the IOCStatus. Helps in
debugging errors.

Submitted by: slm
Obtained from: Netflix
MFC after: 3 days


# 32b0a21e 12-Jul-2016 Stephen McConnell <slm@FreeBSD.org>

Use real values to calculate Max I/O size instead of guessing.

Reviewed by: ken, scottl
Approved by: ken, scottl, ambrisko (mentors)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7043


# 58581c13 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Disks can go missing until a reboot is done in some cases.

This is due to the DevHandle not being released, which causes the Firmware to
not allow that disk to be re-added.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6102


# 407073a0 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Use callout_reset_sbt() instead of callout_reset() if FreeBSD ver is >= 1000029

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6101


# dd9f4a95 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

No need to set the MPRSAS_SHUTDOWN flag because it's never used.

Approved by: ken, scottl, ambrisko
MFC after: 1 week


# 88619392 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Fix possible use of invalid pointer.

It was possible to use an invalid pointer to get the target ID value. To fix
this, initialize a local Target ID variable to an invalid value and change that
variable to a valid value only if the pointer to the Target ID is not NULL.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6100


# d3f6eabf 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

No log bit in IOCStatus and endian-safe changes.

Use MPI2_IOCSTATUS_MASK when checking IOCStatus to mask off the log bit, and
make a few more things endian-safe.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6097


# 2bbc5fcb 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Add support for the Broadcom (Avago/LSI) 9305 16 and 24 port HBA's.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D6098


# 7a2a6a1a 09-May-2016 Stephen McConnell <slm@FreeBSD.org>

Several style changes and add copyrights for 2016.

Reviewed by: ken, scottl, ambrisko, asomers
Approved by: ken, scottl, ambrisko
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6103


# 6adfa7ed 05-May-2016 Alan Somers <asomers@FreeBSD.org>

mpr(4) and mps(4) shouldn't indefinitely retry for "terminated ioc" errors

Submitted by: ken
Reviewed by: slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D6210


# a2c14879 28-May-2015 Stephen McConnell <slm@FreeBSD.org>

The wrong commit message was given with r283632. This is the correct message.

- Updated all files with 2015 Avago copyright, and updated LSI's copyright
dates.

- Changed all of the PCI device strings from LSI to Avago Technologies (LSI).

- Added a sysctl variable to control how StartStopUnit behavior works. User can
select to spin down disks based on if disk is SSD or HDD.

- Inquiry data is required to tell if a disk will support SSU at shutdown or
not. Due to the addition of mpssas_async, which gets Advanced Info but not
Inquiry data, the setting of supports_SSU was moved to the
mpssas_scsiio_complete function, which snoops for any Inquiry commands. And,
since disks are shutdown as a target and not a LUN, this process was
simplified by basing it on targets and not LUNs.

- Added a sysctl variable that sets the amount of time to retry after sending a
failed SATA ID command. This helps with some bad disks and large disks that
require a lot of time to spin up. Part of this change was to add a callout to
handle timeouts with the SATA ID command. The callout function is called
mpssas_ata_id_timeout(). (Fixes PR 191348)

- Changed the way resets work by allowing I/O to continue to devices that are
not currently under a reset condition. This uses devq's instead of simq's and
makes use of the MPSSAS_TARGET_INRESET flag. This change also adds a function
called mpssas_prepare_tm().

- Some changes were made to reduce code duplication when getting a SAS address
for a SATA disk.

- Fixed some formatting and whitespace.

- Bump version of mps driver to 9.255.01.00-fbsd

PR: 191348
Reviewed by: ken, scottl
Approved by: ken, scottl
MFC after: 1 week


# c91d307f 28-May-2015 Stephen McConnell <slm@FreeBSD.org>

The wrong commit message was given with r283632. To get the correct commit
message synced to the changes in r283632, those changes are now backed out.
Another commit will be done that is exactly the same as r283632 except it will
have to correct commit message.

Approved by: ken, scottl, asomers, gibbs


# 2ca937d2 27-May-2015 Stephen McConnell <slm@FreeBSD.org>

This setting of stop_at_shutdown should have been removed with r279253

Approved by: ken
MFC after: 1 week


# 6c0541fa 26-Feb-2015 Kenneth D. Merry <ken@FreeBSD.org>

Add FreeBSD stable/10 version checks for the availability of the
CDAI_FLAG_NONE advanced information CCB flag.

Support for the flag was merged to stable/10 in r279329, and the
__FreeBSD_version in stable/10 was bumped to 1001510.

Check for that version in the mps(4) and mpr(4) drivers when determining
whether to use the flag.

Sponsored by: Spectra Logic
MFC after: 3 days


# e8577fb4 18-Feb-2015 Kenneth D. Merry <ken@FreeBSD.org>

Make sure that the flags for the XPT_DEV_ADVINFO CCB are initialized
properly.

If there is garbage in the flags field, it can sometimes include a
set CDAI_FLAG_STORE flag, which may cause either an error or
perhaps result in overwriting the field that was intended to be
read.

sys/cam/cam_ccb.h:
Add a new flag to the XPT_DEV_ADVINFO CCB, CDAI_FLAG_NONE,
that callers can use to set the flags field when no store
is desired.

sys/cam/scsi/scsi_enc_ses.c:
In ses_setphyspath_callback(), explicitly set the
XPT_DEV_ADVINFO flags to CDAI_FLAG_NONE when fetching the
physical path information. Instead of ORing in the
CDAI_FLAG_STORE flag when storing the physical path, set
the flags field to CDAI_FLAG_STORE.

sys/cam/scsi/scsi_sa.c:
Set the XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE when
fetching extended inquiry information.

sys/cam/scsi/scsi_da.c:
When storing extended READ CAPACITY information, set the
XPT_DEV_ADVINFO flags field to CDAI_FLAG_STORE instead of
ORing it into a field that isn't initialized.

sys/dev/mpr/mpr_sas.c,
sys/dev/mps/mps_sas.c:
When fetching extended READ CAPACITY information, set the
XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE instead of
setting it to 0.

sbin/camcontrol/camcontrol.c:
When fetching a device ID, set the XPT_DEV_ADVINFO flags
field to CDAI_FLAG_NONE instead of 0.

sys/sys/param.h:
Bump __FreeBSD_version to 1100061 for the new XPT_DEV_ADVINFO
CCB flag, CDAI_FLAG_NONE.

Sponsored by: Spectra Logic
MFC after: 1 week


# 85c9dd9d 21-Nov-2014 Steven Hartland <smh@FreeBSD.org>

Prevent overflow issues in timeout processing

Previously, any timeout value for which (timeout * hz) will overflow the
signed integer, will give weird results, since callout(9) routines will
convert negative values of ticks to '1'. For unsigned integer overflow we
will get sufficiently smaller timeout values than expected.

Switch from callout_reset, which requires conversion to int based ticks
to callout_reset_sbt to avoid this.

Also correct isci to correctly resolve ccb timeout.

This was based on the original work done by Eygene Ryabinkin
<rea@freebsd.org> back in 5 Aug 2011 which used a macro to help avoid
the overlow.

Differential Revision: https://reviews.freebsd.org/D1157
Reviewed by: mav, davide
MFC after: 1 month
Sponsored by: Multiplay


# 82315915 08-Oct-2014 Alexander Motin <mav@FreeBSD.org>

Properly report 12Gbps connection rate.

Reviewed by: kadesai, slm
MFC after: 1 week


# c2a0f07a 24-May-2014 Alexander Motin <mav@FreeBSD.org>

Increase taskqueue thread priority from idle to PRIBIO.

Idle priority is not even time-share, so if system is busy in any way,
those events may never be executed. Since in some cases system waits
for events processed by that thread, that may cause deadlocks.


# a371d6f9 08-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Add #ifdefs in the mpr(4) driver so that versions of stable/9 that
have implemented the PIM_NOSCAN rescan functionality will have it
enabled.

This is a no-op for head.

Reviewed by: slm
Sponsored by: Spectra Logic Corporation
MFC after: 3 days


# d2b4e18b 08-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Fix TLR (Transport Layer Retry) support in the mps(4) and mpr(4) drivers.

TLR is necessary for reliable communication with SAS tape drives.

This was broken by change 246713 in the mps(4) driver. It changed the
cm_data field for SCSI I/O requests to point to the CCB instead of the data
buffer. So, instead, look at the CCB's data pointer to determine whether
or not we're talking to a tape drive.

Also, take the residual into account to make sure that we don't go off the
end of the request.

MFC after: 3 days
Sponsored by: Spectra Logic Corporation


# 07aa4de1 06-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Fix a problem with async notifications in the mpr(4) driver.

This problem only occurs on versions of FreeBSD prior to the recent CAM
locking changes. (i.e. stable/9 and older versions of stable/10) This
change should be a no-op for head and stable/10.

If a path isn't specified, xpt_register_async() will create a fully
wildcarded path and acquire a lock (the XPT lock in older versions,
and via xpt_path_lock() in newer versions) to call xpt_action() for the
XPT_SASYNC_CB CCB. It will then drop the lock and if the requested event
includes AC_FOUND_DEVICE or AC_PATH_REGISTERED, it will get the caller up
to date with any device arrivals or path registrations.

The issue is that before the locking changes, each SIM lock would get
acquired in turn during the EDT tree traversal process. If a path is
specified for xpt_register_async(), it won't acquire and drop its own lock,
but instead expects the caller to hold its own SIM lock. That works for
the first part of xpt_register_async(), but causes a recursive lock
acquisition once the EDT traversal happens and it comes to the SIM in
question. And it isn't possible to call xpt_action() without holding a SIM
lock.

The locking changes fix this by using the XPT topology lock for EDT
traversal, so it is no longer an issue to hold the SIM lock while calling
xpt_register_async().

The solution for FreeBSD versions before the locking changes is to request
notification of all device arrivals (so we pass a NULL path into
xpt_register_async()) and then filter out the arrivals that are not ours.

MFC After: 3 days
Sponsored by: Spectra Logic Corporation


# c503306d 05-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Adjust #if statements inside mprsas_send_smpcmd() to more accurately
reflect when unmapped I/O support was added.

For FreeBSD 10, it arrived just prior to __FreeBSD_version 1000028.
For FreeBSD 9, it arrived just prior to __FreeBSD_version 902001.

Also, fix compiler warnings in mprsas_send_smpcmd() that happen in the
i386 PAE build for non-unmapped I/O builds. These were fixed in mps(4)
in revision 241145, but didn't make it into the mpr(4) driver. This
change should only affect FreeBSD versions outside the above revisions,
and thus doesn't affect head.

MFC after: 3 days
Sponsored by: Spectra Logic Corporation


# 991554f2 02-May-2014 Kenneth D. Merry <ken@FreeBSD.org>

Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers.

This is derived from the mps(4) driver, but it supports only the 12Gb
IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.

Some notes about this driver:
o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
this driver.

o WarpDrive functionality has been removed, since it isn't supported in
the 12Gb driver interface.

o The Scatter/Gather list handling code is significantly different between
the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather
lists.

Thanks to LSI for developing and testing this driver for FreeBSD.

share/man/man4/mpr.4:
mpr(4) man page.

sys/dev/mpr/*:
mpr(4) driver files.

sys/modules/Makefile,
sys/modules/mpr/Makefile:
Add a module Makefile for the mpr(4) driver.

sys/conf/files:
Add the mpr(4) driver.

sys/amd64/conf/GENERIC,
sys/i386/conf/GENERIC,
sys/mips/conf/OCTEON1,
sys/sparc64/conf/GENERIC:
Add the mpr(4) driver to all config files that currently
have the mps(4) driver.

sys/ia64/conf/GENERIC:
Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
config file.

sys/i386/conf/XEN:
Exclude the mpr module from building here.

Submitted by: Steve McConnell <Stephen.McConnell@lsi.com>
MFC after: 3 days
Tested by: Chris Reeves <chrisr@spectralogic.com>
Sponsored by: LSI, Spectra Logic
Relnotes: LSI 12Gb SAS driver mpr(4) added