History log of /freebsd-current/sys/dev/mps/mps_sas_lsi.c
Revision Date Author Comments
# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# bcce9c5b 24-Nov-2021 Scott Long <scottl@FreeBSD.org>

Fix "set but not used" warnings in the mps driver.


# 175ad3d0 03-Jun-2021 Kenneth D. Merry <ken@FreeBSD.org>

Fix mpr(4) and mps(4) state transitions and a use-after-free panic.

When the mpr(4) and mps(4) drivers probe a SATA device, they issue an
ATA Identify command (via mp{s,r}sas_get_sata_identify()) before the
target is fully set up in the driver. The drivers wait for completion of
the identify command, and have a 5 second timeout. If the timeout
fires, the command is marked with the SATA_ID_TIMEOUT flag so it can be
freed later.

That is where the use-after-free problem comes in. Once the ATA
Identify times out, the driver sends a target reset, and then frees any
identify commands that have timed out. But, once the target reset
completes, commands that were queued to the drive are returned to the
driver by the controller.

At that point, the driver (in mp{s,r}_intr_locked()) looks up the
command descriptor for that particular SMID, marks it CM_STATE_BUSY and
sends it on for completion handling.

The problem at this stage is that the command has already been freed,
and put on the free queue, so its state is CM_STATE_FREE. If INVARIANTS
are turned on, we get a panic as soon as this command is allocated,
because its state is no longer CM_STATE_FREE, but rather CM_STATE_BUSY.

So, the solution is to not free ATA Identify commands that get stuck
until they actually return from the controller. Hopefully this works
correctly on older firmware versions. If not, it could result in
commands hanging around indefinitely. But, the alternative is a
use-after-free panic or assertion (in the INVARIANTS case).

This also tightens up the state transitions between CM_STATE_FREE,
CM_STATE_BUSY and CM_STATE_INQUEUE, so that the state transitions happen
once, and we have assertions to make sure that commands are in the
correct state before transitioning to the next state. Also, for each
state assertion, we print out the current state of the command if it is
incorrect.
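
The state checks described above can be illustrated with a minimal user-space
sketch. It is not the driver's code: the struct, the helper names
(cm_transition, cm_alloc, cm_enqueue, cm_complete, cm_free, cm_timeout) and the
error-return behaviour are assumptions made for illustration. The real driver
asserts instead of returning an error. The point shown is that a timed-out
command stays in the INQUEUE state until the controller returns it, so it
cannot be freed early.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's command states. */
enum cm_state { CM_STATE_FREE, CM_STATE_BUSY, CM_STATE_INQUEUE };

struct command {
	enum cm_state state;
	int timed_out;		/* models the SATA_ID_TIMEOUT flag */
};

/* Move a command to a new state only if it is in the expected state. */
static int
cm_transition(struct command *cm, enum cm_state from, enum cm_state to)
{
	if (cm->state != from) {
		/* Report the current state, as the commit describes. */
		fprintf(stderr, "command in state %d, expected %d\n",
		    (int)cm->state, (int)from);
		return (-1);
	}
	cm->state = to;
	return (0);
}

/* Allocation: FREE -> BUSY. */
static int
cm_alloc(struct command *cm)
{
	return (cm_transition(cm, CM_STATE_FREE, CM_STATE_BUSY));
}

/* Hand the command to the controller: BUSY -> INQUEUE. */
static int
cm_enqueue(struct command *cm)
{
	return (cm_transition(cm, CM_STATE_BUSY, CM_STATE_INQUEUE));
}

/* Controller returned the command: INQUEUE -> BUSY, in one place only. */
static int
cm_complete(struct command *cm)
{
	return (cm_transition(cm, CM_STATE_INQUEUE, CM_STATE_BUSY));
}

/* Release: BUSY -> FREE. Fails while the controller still owns it. */
static int
cm_free(struct command *cm)
{
	return (cm_transition(cm, CM_STATE_BUSY, CM_STATE_FREE));
}

/* Timeout: only flag the command; its state is left untouched. */
static void
cm_timeout(struct command *cm)
{
	cm->timed_out = 1;
}
```

In this sketch, calling cm_free() on a timed-out command that has not yet come
back fails, which mirrors the rule that ATA Identify commands are released only
from their completion handler.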

mp{s,r}.c: Add a new sysctl variable, dump_reqs_alltypes,
that controls the behavior of the dump_reqs sysctl.
If dump_reqs_alltypes is non-zero, it will dump
all commands, not just the commands that are in the
CM_STATE_INQUEUE state. (You can see the commands
that are in the queue by using mp{s,r}util debug
dumpreqs.)

Make sure that the INQUEUE -> BUSY state transition
happens in one place, the mp{s,r}_complete_command
routine.

mp{s,r}_sas.c: Make sure we print the current command type in
command state assertions.

mp{s,r}_sas_lsi.c:
Add a new completion handler,
mp{s,r}sas_ata_id_complete. This completion
handler will free data allocated for an ATA
Identify command and free the command structure.

In mp{s,r}_ata_id_timeout, do not set the command
state to CM_STATE_BUSY. The command is still in
queue in the controller. Since we were blocking
waiting for this command to complete, there was
no completion handler previously. Set the
completion handler, so that whenever the command
does come back, it will get freed properly.

Do not free ATA Identify commands that have timed
out in mp{s,r}sas_add_device(). Wait for them
to actually come back from the controller.

mp{s,r}var.h: Add a dump_reqs_alltypes variable for the new
dump_reqs_alltypes sysctl.

Make sure we print the current state for state
transition asserts.

This was tested in the Spectra Logic test bed (as described in the
review), as well as Netflix's Open Connect fleet (where panics dropped from
a dozen or two a month to zero).

Reviewed by: imp@ (who is handling the commit with ken's OK)
Sponsored by: Spectra Logic
Differential Revision: https://reviews.freebsd.org/D25476


# 742c5f20 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

mps: clean up empty lines in .c and .h files


# 0d87f3c7 26-Feb-2020 Warner Losh <imp@FreeBSD.org>

Remove support for all pre FreeBSD 11.0 versions from mpr and mps.

Remove a number of workarounds for older versions of FreeBSD. FreeBSD stable/10
was branched over 6 years ago. All of these changes date from about that time or
earlier. These workarounds are extensive and get in the way of understanding
the current flow in the driver.


# 8fe7bf06 08-Jul-2019 Warner Losh <imp@FreeBSD.org>

Fix bugs in recovery path and improve cm tracking

Eliminate the TIMEDOUT state. This state really conveyed two different
concepts: I timed out during recovery (and my command got put on the
recovery queue), and I timed out during discovery (which doesn't).
Separate those two concepts into two flags. Use the TIMEDOUT flag to
fail requests as timed out. Use the on queue flag to remove them from
the queue.

In mps_intr_locked for MPI2_RPY_DESCRIPT_FLAGS_ADDRESS_REPLY message
type, when completing commands, ignore the ones that are not in state
INQUEUE. They were already completed as part of the recovery
process. When we complete them twice, we wind up with entries on the
free queue that are marked as busy, triggering asserts.

Reviewed by: scottl (earlier version, just for mpr)
Differential Revision: https://reviews.freebsd.org/D20785


# e7ef108c 07-May-2019 Warner Losh <imp@FreeBSD.org>

Add missing newline to debug printf.


# 86312e46 21-Dec-2018 Conrad Meyer <cem@FreeBSD.org>

mps(4), mpr(4): remove SATA ID command cancellation hack

Add a generic mechanism to override mp?_wait_command's timeout behavior,
which continues to invoke reinit by default. Invokers who set
cm_timeout_handler may avoid automatic reinit and do their own handling.

Adapt mp?sas_get_sata_identify to this mechanism and remove its callout
hack.

Reviewed by: scottl
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18614


# 8277ce2b 21-Dec-2018 Conrad Meyer <cem@FreeBSD.org>

mps(4), mpr(4): Fix lifetime of command buffer for mp?sas_get_sata_identify

In the event that the ID command timed out, mps(4)/mpr(4) did not free the
command until it could be cancelled. However, it freed the associated
buffer (cm_data). Fix the lifetime issue by freeing the associated buffer
only after Abort Task or controller reset.

Reviewed by: scottl
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18612


# acc173a6 13-Aug-2018 Warner Losh <imp@FreeBSD.org>

Port the mps panic-safe shutdown_final handling to mpr

r330951 by smh fixed the mps driver to avoid deadlocks when panicking.
The same code is needed for mpr, so port it here, along with the fix
which allows the CCBs scheduled to complete avoiding at least a scary
message and likely other unintended consequences.

Sponsored by: Netflix
Differential Review: https://reviews.freebsd.org/D16663


# d4b95382 13-Aug-2018 Warner Losh <imp@FreeBSD.org>

Call xpt_sim_poll in shutdown_final handler.

When we're shutting down, we send a number of start/stop commands to
the known targets. We have to wait for them to complete. During a
panic, the interrupts are off, and using pause to wait for them to
fire and complete won't work: we have to poll after pause returns so
the completion routines of the CCBs run so we decrement work
outstanding counts.

Sponsored by: Netflix
Differential Review: https://reviews.freebsd.org/D16663


# 7d147b81 14-Mar-2018 Steven Hartland <smh@FreeBSD.org>

Fix mps deadlock when handling panic

During shutdown mps waits for its SSU requests to complete however when
performing a reboot after handling a panic the scheduler is stopped so
getmicrotime which is used can be non-functional.

Switch to using the same method as shutdown_panic to ensure we actually
complete.

In addition reduce the timeout when RB_NOSYNC is set in howto as we expect
this to fail.

Reviewed by: slm
MFC after: 1 week
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D12776


# 718cf2cc 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.


# aeb9ac0d 21-Sep-2017 Scott Long <scottl@FreeBSD.org>

Clean up error messages related to device discovery

Sponsored by: Netflix


# 757ff642 27-Aug-2017 Scott Long <scottl@FreeBSD.org>

Start overhauling debug printing in the MPS and MPR drivers. The focus of this
commit is to make initialization less chatty in the normal case, and more useful
and informative when real debugging is turned on.

Reviewed by: ken (earlier version)
Sponsored by: Netflix


# 6d4ffcb4 10-Aug-2017 Kenneth D. Merry <ken@FreeBSD.org>

Changes to make mps(4) and mpr(4) handle reinit with reallocation.

When the mps(4) and mpr(4) drivers need to reinitialize the
firmware, they sometimes need to reallocate all of the memory
allocated by the driver. The reallocation happens whenever the IOC
Facts change. That should only happen after a firmware upgrade.

If the reinitialization happens as a result of a timed out command
sent to the card, the command that timed out and triggered the
reinit may have been freed if iocfacts_allocate() reallocated all
memory. If the caller attempts to access the command after that,
the kernel will panic because the caller will be dereferencing
freed memory.

The solution is to set a flag in the softc when we reallocate,
and avoid dereferencing the command structure if we've reallocated.

The changes are largely the same in both drivers, since mpr(4) is a
derivative of mps(4).

o In iocfacts_allocate(), if the IOC Facts have changed and we
need to reallocate, set the REALLOCATED flag in the softc.

o Change wait_command() to take a struct mps_command ** instead of
a struct mps_command *. This allows us to NULL out the caller's
command pointer if we have to reinit the controller and the data
structures get reallocated. (The REALLOCATED flag will be set
in the softc if that has happened.)

o In every place that calls wait_command(), make sure we handle
the case where the command is NULL after the call.

o The mpr(4) driver has mpr_request_polled() which can also
reinitialize the card. Also check for reallocation there.
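
The pointer-to-pointer convention in the list above can be sketched as
follows. This is a simplified user-space model, not the driver's
implementation: the struct layouts, the ex_ function names and the
simulate_timeout parameter are hypothetical. It shows why the callee needs a
struct command ** in order to clear the caller's pointer after a
reallocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the softc and command structures. */
struct ex_softc {
	int reallocated;	/* models the REALLOCATED flag */
};

struct ex_command {
	int status;
};

/*
 * Wait for a command. On the simulated timeout path the controller is
 * "reinitialized", all command memory is considered gone, and the
 * caller's pointer is set to NULL so it cannot be dereferenced.
 */
static int
ex_wait_command(struct ex_softc *sc, struct ex_command **cmp,
    int simulate_timeout)
{
	if (!simulate_timeout) {
		(*cmp)->status = 0;
		return (0);
	}
	sc->reallocated = 1;
	*cmp = NULL;
	return (ETIMEDOUT);
}

/* Caller side: check for NULL before touching the command again. */
static int
ex_issue(struct ex_softc *sc, struct ex_command *cm, int simulate_timeout)
{
	int error;

	error = ex_wait_command(sc, &cm, simulate_timeout);
	if (cm == NULL)
		return (error);	/* command memory was reallocated */
	return (cm->status);
}
```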

Reviewed by: scottl, slm
MFC after: 1 week
Sponsored by: Spectra Logic


# 055e2653 30-Jul-2017 Scott Long <scottl@FreeBSD.org>

Change from using underbar function names to normal function names for
the informational print functions. Collapse the debug API a bit to be
more generic and not require as much code duplication. While here, fix
a bug in MPS that was already fixed in MPR.


# 635e58c7 25-May-2017 Stephen McConnell <slm@FreeBSD.org>

Fix several problems with mapping code.

Reviewed by: ken, scottl, asomers, ambrisko, mav
Approved by: ken, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10878


# 4ab1cdc5 02-Nov-2016 Scott Long <scottl@FreeBSD.org>

Add a fallback to the device mapper logic. We've seen systems in the field
that are apparently misconfigured by the manufacturer and cause the mapping
logic to fail. The fallback allows drive numbers to be assigned based on the
PHY number that they're attached to. Add sysctls and tunables to override
this new behavior, but they should be considered only necessary for debugging.

Reviewed by: imp, smh
Obtained from: Netflix
MFC after: 3 days
Sponsored by: D8403


# f4e69c98 20-Jun-2016 Stephen McConnell <slm@FreeBSD.org>

- No log bit in IOCStatus and endian-safe changes.

Use MPI2_IOCSTATUS_MASK when checking IOCStatus to mask off the log bit, and
make a few more things endian-safe.

- Fix possible use of invalid pointer.

It was possible to use an invalid pointer to get the target ID value. To fix
this, initialize a local Target ID variable to an invalid value and change that
variable to a valid value only if the pointer to the Target ID is not NULL.

- No need to set the MPSSAS_SHUTDOWN flag because it's never used.

- done_ccb pointer can be used if it is NULL.

To prevent this, move check for done_ccb == NULL to before done_ccb is used in
mpssas_stop_unit_done().

- Disks can go missing until a reboot is done in some cases.

This is due to the DevHandle not being released, which causes the Firmware to
not allow that disk to be re-added.
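
As an illustration of the first item in this list, the sketch below decodes a
little-endian 16-bit IOCStatus field and masks off the log bit before
comparing. The EX_ constants and function names are placeholders chosen for
this example; the values are assumed to match the MPI2 header definitions and
should be checked against mpi2.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; see the MPI2 headers for the real ones. */
#define EX_IOCSTATUS_MASK	0x7FFFu	/* status code bits */
#define EX_IOCSTATUS_LOG_BIT	0x8000u	/* "log info available" flag */
#define EX_IOCSTATUS_SUCCESS	0x0000u

/* Decode a little-endian 16-bit value regardless of host byte order. */
static uint16_t
ex_le16dec(const uint8_t *p)
{
	return ((uint16_t)(p[0] | (p[1] << 8)));
}

/* Return 1 if the reply indicates success, ignoring the log bit. */
static int
ex_reply_ok(const uint8_t *iocstatus_field)
{
	uint16_t status;

	status = ex_le16dec(iocstatus_field);
	return ((status & EX_IOCSTATUS_MASK) == EX_IOCSTATUS_SUCCESS);
}
```

Without the mask, a successful reply that also carries log information would
compare unequal to the success value and be treated as a failure.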

Reviewed by: ken
Approved by: re (gjb), ken, scottl, ambrisko (mentors)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6872


# 453130d9 02-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: minor spelling fixes.

Most affect comments, very few have user-visible effects.


# a92fe027 14-Dec-2015 Alan Somers <asomers@FreeBSD.org>

Don't retry SAS commands in response to protocol errors

sys/dev/mpr/mpr_sas_lsi.c
sys/dev/mps/mps_sas_lsi.c
When mp[rs]sas_get_sata_identify returns
MPI2_IOCSTATUS_SCSI_PROTOCOL_ERROR, don't bother retrying. Protocol
errors aren't likely to be fixed by sleeping.

Without this change, a system that generated many protocol errors due
to signal integrity issues was taking more than an hour to boot, due
to all the retries.
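
A rough sketch of the retry policy described here, under stated assumptions:
the status enum, the callback type and the fake identify functions are
invented for this example and do not correspond to driver symbols. The loop
retries transient failures but stops immediately on a protocol error.

```c
#include <assert.h>

/* Hypothetical result codes for an identify attempt. */
enum ex_status { EX_OK, EX_BUSY, EX_PROTOCOL_ERROR };

typedef enum ex_status (*ex_identify_fn)(void *arg);

/*
 * Call fn up to max_tries times. Success and protocol errors both end
 * the loop at once; only other failures are retried.
 */
static enum ex_status
ex_identify_with_retry(ex_identify_fn fn, void *arg, int max_tries,
    int *tries_out)
{
	enum ex_status st = EX_BUSY;
	int tries = 0;

	while (tries < max_tries) {
		st = fn(arg);
		tries++;
		if (st == EX_OK || st == EX_PROTOCOL_ERROR)
			break;
		/* A real driver would sleep here before the next attempt. */
	}
	*tries_out = tries;
	return (st);
}

/* Demo device that always reports a protocol error. */
static enum ex_status
ex_fake_protocol_error(void *arg)
{
	(*(int *)arg)++;
	return (EX_PROTOCOL_ERROR);
}

/* Demo device that is busy for two attempts and then succeeds. */
static enum ex_status
ex_fake_busy_twice(void *arg)
{
	int *calls = arg;

	(*calls)++;
	return (*calls <= 2 ? EX_BUSY : EX_OK);
}
```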

Reviewed by: slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D4553


# ef065d89 24-Feb-2015 Stephen McConnell <slm@FreeBSD.org>

- Updated all files with 2015 Avago copyright, and updated LSI's copyright
dates.

- Changed all of the PCI device strings from LSI to Avago Technologies (LSI).

- Added a sysctl variable to control how StartStopUnit behavior works. User can
select to spin down disks based on if disk is SSD or HDD.

- Inquiry data is required to tell if a disk will support SSU at shutdown or
not. Due to the addition of mpssas_async, which gets Advanced Info but not
Inquiry data, the setting of supports_SSU was moved to the
mpssas_scsiio_complete function, which snoops for any Inquiry commands. And,
since disks are shutdown as a target and not a LUN, this process was
simplified by basing it on targets and not LUNs.

- Added a sysctl variable that sets the amount of time to retry after sending a
failed SATA ID command. This helps with some bad disks and large disks that
require a lot of time to spin up. Part of this change was to add a callout to
handle timeouts with the SATA ID command. The callout function is called
mpssas_ata_id_timeout(). (Fixes PR 191348)

- Changed the way resets work by allowing I/O to continue to devices that are
not currently under a reset condition. This uses devq's instead of simq's and
makes use of the MPSSAS_TARGET_INRESET flag. This change also adds a function
called mpssas_prepare_tm().

- Some changes were made to reduce code duplication when getting a SAS address
for a SATA disk.

- Fixed some formatting and whitespace.

- Bump version of mps driver to 20.00.00.00-fbsd

PR: 191348
Reviewed by: ken, scottl
Approved by: ken, scottl
MFC after: 2 weeks


# 7571e7f6 30-Jul-2014 Steven Hartland <smh@FreeBSD.org>

Bring in LSI's phase16 - phase18 changes
* Implements Start Stop Unit for SATA direct-attach devices in IR mode to avoid
data corruption.
* Use CAM_DEV_NOT_THERE instead of CAM_SEL_TIMEOUT and CAM_TID_INVALID

Obtained from: LSI
MFC after: 2 weeks


# cd04f04f 12-Sep-2013 Kenneth D. Merry <ken@FreeBSD.org>

Fix an issue that caused Integrated RAID volumes on LSI mps(4) controllers
to not get scanned on boot.

The problem originated in change 253549. With the change to the mps(4)
driver to scan only targets that it knows it has (as opposed to scanning
the entire bus), scanning RAID volumes on boot was omitted.

So, for versions of FreeBSD that have the scanning changes
(__FreeBSD_version 1000039 and higher), scan RAID volumes that are added
whether or not we're booting.

PR: kern/181784
Reported by: Xiguang Wang <kurapica@gmail.com>
Tested by: Dennis Glatting <dg@pki2.com>
Sponsored by: Spectra Logic
Approved by: re (delphij)
MFC After: 3 days


# d9802deb 08-Aug-2013 Scott Long <scottl@FreeBSD.org>

Sometimes a device misbehaves so badly that it disrupts the entire system.
Add a tunable that allows such a device to be excluded from the driver.
The id parameter is the target id that the driver assigns to a given device.

dev.mps.X.exclude_ids=<id>,<id>
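
One way such a comma-separated ID list could be checked is sketched below.
This is an assumption about the general technique, not the driver's parser;
ex_id_excluded is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Return 1 if id appears in a comma-separated list such as "3,12,40". */
static int
ex_id_excluded(const char *list, long id)
{
	const char *p = list;
	char *end;
	long v;

	while (*p != '\0') {
		v = strtol(p, &end, 10);
		if (end == p) {
			/* No number here; skip one character and go on. */
			p++;
			continue;
		}
		if (v == id)
			return (1);
		p = end;
		if (*p == ',')
			p++;
	}
	return (0);
}
```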

Obtained from: Netflix
MFC after: 3 days


# 9b91b192 22-Jul-2013 Kenneth D. Merry <ken@FreeBSD.org>

Merge in phase 14+ -> 16 mps driver fixes from LSI:

---------------------------------------------------------------
System panics during a Port reset with outstanding I/O
---------------------------------------------------------------
It is possible to call mps_mapping_free_memory after this
memory is already freed, causing a panic. Removed this extra
call to mps_mapping_free_memory and call mps_mapping_exit
in place of the mps_mapping_free_memory call so that any
outstanding mapping items can be flushed before memory is
freed.

---------------------------------------------------------------
Correct memory leak during a Port reset with outstanding I/O
---------------------------------------------------------------
In mps_reinit function, the mapping memory was not being
freed before being re-allocated. Added line to call the
memory free function for mapping memory.

---------------------------------------------------------------
Use CAM_SIM_QUEUED flag in Driver IO path.
---------------------------------------------------------------
This flag informs the XPT that successful abort of a CCB
requires an abort ccb to be issued to the SIM. While
processing SCSI IO's, set the CAM_SIM_QUEUED flag in the
status for the IO. When the command completes, clear this
flag.

---------------------------------------------------------------
Check for CAM_REQ_INPROG in I/O path.
---------------------------------------------------------------
Added a check in mpssas_action_scsiio for the In Progress
status for the IO. If this flag is set, the IO has already
been aborted by the upper layer (before CAM_SIM_QUEUED was
set) and there is no need to send the IO. The request will
be completed without error.

---------------------------------------------------------------
Improve "doorbell handshake method" for mps_get_iocfacts
---------------------------------------------------------------
Removed call to get Port Facts since this information is
not used currently.

Added mps_iocfacts_allocate function to allocate memory
that is based on IOC Facts data. Added mps_iocfacts_free
function to free memory that is based on IOC Facts data.
Both of the functions are used when a Diag Reset is performed
or when the driver is attached/detached. This is needed in
case IOC Facts changes after a Diag Reset, which could
happen if FW is upgraded.

Moved call of mps_bases_static_config_pages from the attach
routine to after the IOC is ready to process accesses based
on the new memory allocations (instead of polling through
the Doorbell).

---------------------------------------------------------------
Set TimeStamp in INIT message in millisecond format Set the IOC
---------------------------------------------------------------

---------------------------------------------------------------
Prefer mps_wait_command to mps_request_polled
---------------------------------------------------------------
Instead of using mps_request_polled, call mps_wait_command
whenever possible. Change the mps_wait_command function to
check the current context and either use interrupt context
or poll if required by using the pause or DELAY function.
Added a check after waiting 50mSecs to see if the command
has timed out. This is only done if polling; the msleep
call will automatically time out if the command has taken
too long to complete.

---------------------------------------------------------------
Integrated RAID: Volume Activation Failed error message is
displayed though the volume has been activated.
---------------------------------------------------------------
Instead of failing an IOCTL request that does not have a
large enough buffer to hold the complete reply, copy as
much data from the reply as possible into the user's buffer
and log a message saying that the user's buffer was smaller
than the returned data.
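
The partial-copy behaviour described above amounts to copying the smaller of
the two lengths and noting the truncation. The sketch below shows that idea
in user space with memcpy; the function name and parameters are hypothetical,
and a kernel ioctl path would use its own copy-to-user primitive and logging.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy as much of the reply as fits into the user's buffer. Sets
 * *truncated when the buffer was too small, so the caller can log it.
 * Returns the number of bytes copied.
 */
static size_t
ex_copy_reply(void *user_buf, size_t user_len, const void *reply,
    size_t reply_len, int *truncated)
{
	size_t n;

	n = reply_len < user_len ? reply_len : user_len;
	memcpy(user_buf, reply, n);
	*truncated = (reply_len > user_len);
	return (n);
}
```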

---------------------------------------------------------------
mapping_add_new_device failure due to persistent table FULL
---------------------------------------------------------------
When a new device is added, if it is determined that the
device persistent table is being used and is full, instead
of displaying a message for this condition every time, only
log a message if the MPS_INFO bit is set in the debug_flags.

Submitted by: LSI
MFC after: 1 week


# b01773b0 22-Jul-2013 Kenneth D. Merry <ken@FreeBSD.org>

CAM and mps(4) driver scanning changes.

Add a PIM_NOSCAN flag to the CAM path inquiry CCB. This tells CAM
not to perform a rescan on a bus when it is registered.

We now use this flag in the mps(4) driver. Since it knows what
devices it has attached, it is more efficient for it to just issue
a target rescan on the targets that are attached.

Also, remove the private rescan thread from the mps(4) driver in
favor of the rescan thread already built into CAM. Without this
change, but with the change above, the MPS scanner could run before
or during CAM's initial setup, which would cause duplicate device
reprobes and announcements.

sys/param.h:
Bump __FreeBSD_version to 1000039 for the inclusion of the
PIM_NOSCAN CAM path inquiry flag.

sys/cam/cam_ccb.h:
sys/cam/cam_xpt.c:
Added a PIM_NOSCAN flag. If a SIM sets this in the path
inquiry ccb, then CAM won't rescan the bus in
xpt_bus_register.

sys/dev/mps/mps_sas.c
For versions of FreeBSD that have the PIM_NOSCAN path
inquiry flag, don't freeze the sim queue during scanning,
because CAM won't be scanning this bus. Instead, hold
up the boot. Don't call mpssas_rescan_target in
mpssas_startup_decrement; it's redundant and I don't
know why it was in there.

Set PIM_NOSCAN in path inquiry CCBs.

Remove methods related to the internal rescan daemon.

Always use async events to trigger a probe for EEDP support.
In older versions of FreeBSD where AC_ADVINFO_CHANGED is
not available, use AC_FOUND_DEVICE and issue the
necessary READ CAPACITY manually.

Provide a path to xpt_register_async() so that we only
receive events for our own SCSI domain.

Improve error reporting in cases where setup for EEDP
detection fails.

sys/dev/mps/mps_sas.h:
Remove softc flags and data related to the scanner thread.

sys/dev/mps/mps_sas_lsi.c:
Unconditionally rescan the target whenever a device is added.

Sponsored by: Spectra Logic
MFC after: 1 week


# 1610f95c 18-Jul-2013 Scott Long <scottl@FreeBSD.org>

Overhaul error, information, and debug logging.

Obtained from: Netflix
MFC after: 3 days


# 1d50dd1f 18-Jul-2012 Christian Brueffer <brueffer@FreeBSD.org>

Fix a small memory leak in mpssas_get_sata_identify(). The change has been
submitted upstream as well.

Reviewed by: ken, scottl
Obtained from: DragonFly BSD (change df8658e030226dd015cff9749452666d8fe1e87b)
MFC after: 5 days


# be4aa869 27-Jun-2012 Kenneth D. Merry <ken@FreeBSD.org>

Bring in LSI's latest mps(4) 6Gb SAS and WarpDrive driver, version
14.00.00.01-fbsd.

Their description of the changes is as follows:

1. Copyright contents has been changed in all respective .c
and .h files

2. Support for WRITE12 and READ12 for direct-io (warpdrive only)
has been added.

3. Driver has added checks to see if Drive has READ_CAP_16
support before sending it down to the device.
If SPC3_SID_PROTECT flag is set in the inquiry data, the
device supports protection information, and must support
the 16 byte read capacity command, otherwise continue without
sending read cap 16. This will optimize driver performance,
since it will not send READ_CAP_16 to the drive which does
not have support of READ_CAP_16.

4. With new approach, "MPTIOCTL_RESET_ADAPTER" IOCTL will not
use DELAY() which is busy loop implementation.
It will use <msleep> (Better way to sleep without busy
loop). Also from the HBA reset code path and some other
places, DELAY() is replaced with msleep() or "pause()",
which is based on sleep/wakeup style calls. Driver use
msleep()/pause() instead of DELAY based on CAN_SLEEP/NO_SLEEP
flags to avoid busy loop which is not required all the
time.

a. While driver is getting loaded, driver calls most of the
commands with NO_SLEEP.
b. When Driver is functional and it needs Reinit of HBA,
CAN_SLEEP flag is used.

5. <mpslsi> driver is not Endian safe. It will not work on Big
Endian machines like Sparc and PowerPC platforms because it
assumes it is running on a Little Endian machine.

Driver code is modified such way that it does not assume CPU
arch is Little Endian.
a. All places where Driver interacts from HBA to Host, it
converts Little Endian format to CPU format.
b. All places where Driver interacts from Host to HBA, it
converts CPU format to Little Endian.

6. Find memory leaks in the FreeBSD driver and resolve them,
such as memory leak in targ's luns creation/deletion.
Also added additional checks to see memory allocation
success/fail.

7. Add loginfo prints as debug message, i.e. When FW sends any
loginfo, Driver should print those as debug message.
This will help for debugging purpose.

8. There is possibility to get config request timeout. Current
driver is able to detect config request timeout, but it does
not do anything on config_request timeout. Driver should
call mps_reinit() if any request_poll (which is called as
part of config_request) is time out.

9. cdb length check is required for 32 byte CDB. Add correct mpi
control value for 32 byte CDB as below while submitting SCSI IO
Request to controller.
mpi_control |= 4 << MPI2_SCSIIO_CONTROL_ADDCDBLEN_SHIFT;

10. Check the actual status of Message unit reset
(mps_message_unit_reset). Previously FreeBSD Driver just writes
MPI2_FUNCTION_IOC_MESSAGE_UNIT_RESET and never check the ack
(it just wait for 50 millisecond). So, Driver now check the
status of "MPI2_FUNCTION_IOC_MESSAGE_UNIT_RESET" after writing
it to the FW.

Now it also checking for whether doorbell ack uses msleep with
proper sleep flags, instead of <DELAY>.

11. Previously CAM does not detect Multi-Lun Devices. In order to
detect Multi-Lun Devices by CAM the driver needs following change
set:
a. There is "max_lun" field which Driver need to set based on
hw/fw support. Currently LSI released driver does not set
this field.
b. Default of "max_lun" should not be 0 in OS, but it is
currently set to 0 in CAM layer.
c. Export max_lun capacity to 255

12. Driver will not reset target info after port enable complete and
also do Device removal when Device remove from FW. The detail
description is as follows
a. When Driver receive WD PD add events, it will add all
information in driver local data structure.
b. Only for WD, we have below checks after port enable
completes, where driver clear off all information retrieved
at #1.
if ((sc->WD_available &&
    (sc->WD_hide_expose == MPS_WD_HIDE_ALWAYS)) ||
    (sc->WD_valid_config &&
    (sc->WD_hide_expose == MPS_WD_HIDE_IF_VOLUME))) {
	// clear off target data structure.
}
It is mainly not to attach PDs to OS.

FreeBSD does bus rescan as older Parallel scsi style. So Driver
needs to handle which Drive is visible to OS. That is a reason
we have to clear off targ information for PDs.

Again, above logic was implemented long time ago. Similar concept
we have for non-wd also. For that, LSI have introduced different
logic to hide PDs.

Eventually, because of above gap, when Phy goes offline, we
observe below failure. That is what Driver is not doing complete
removal of device with FW. (which was pointed by Scott)
Apr 5 02:39:24 Freebsd7 kernel: mpslsi0: mpssas_prepare_remove
Apr 5 02:39:24 Freebsd7 kernel: mpssas_prepare_remove 497 : invalid handle 0xe

Now Driver will not reset target info after port enable complete
and also will do Device removal when Device remove from FW.

13. Returning "CAM_SEL_TIMEOUT" instead of "CAM_TID_INVALID"
error code on request to the Target IDs that have no devices
connected at that moment. As if "CAM_TID_INVALID" error code
is returned to the CAM Layer then it results in a huge chain
of errors in verbose kernel messages on boot and every
hot-plug event.
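
Item 3 in the list above (only sending READ CAPACITY(16) when the inquiry data
advertises protection support) can be sketched as a single bit test. The
offsets below assume the SPC-3 standard INQUIRY layout, where the PROTECT bit
is bit 0 of byte 5; the EX_ names and the function are hypothetical and are
not the driver's or CAM's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed location of the PROTECT bit in standard INQUIRY data. */
#define EX_INQ_PROTECT_BYTE	5
#define EX_INQ_PROTECT_BIT	0x01

/*
 * Return 1 if the device advertises protection information, in which
 * case it must support the 16-byte read capacity command. Return 0
 * otherwise, including when the inquiry data is too short to tell.
 */
static int
ex_should_send_read_cap_16(const uint8_t *inq, size_t len)
{
	if (len <= EX_INQ_PROTECT_BYTE)
		return (0);
	return ((inq[EX_INQ_PROTECT_BYTE] & EX_INQ_PROTECT_BIT) != 0);
}
```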

Submitted by: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
MFC after: 3 days


# 653c521f 08-Feb-2012 Kenneth D. Merry <ken@FreeBSD.org>

Bring in a number of mps(4) driver fixes from LSI:

1. Fixed timeout specification for the msleep in mps_wait_command().
Added 30 second timeout for mps_wait_command() calls in mps_user.c.

2. Make sure we call mps_detach_user() from the kldunload path.

3. Raid Hotplug behavior change.

The driver now removes a volume when it goes to a failed state,
so we also need to add volume back to the OS when it goes to
optimal/degraded/online from failed/missing.

Handle raid volume add and remove from the IR_Volume event.
4. Added some more debugging information.

5. Replace xpt_async(AC_LOST_DEVICE, path, NULL) with
mpssas_rescan_target().

This is to work around a panic in CAM that shows up when adding a
drive with a rescan and removing another device from the driver thread
with an AC_LOST_DEVICE async notification.

This problem was encountered in testing with the LSI sas2ircu utility,
which was used to create a RAID volume from physical disks. The driver
has to create the RAID volume target and remove the physical disk
targets, and triggered a panic in the process.

The CAM issue needs to be fully diagnosed and fixed, but this works
around the issue for now.

6. Fix some memory initialization issues in mps_free_command().

7. Resolve the "devq freeze forever" issue. This was caused by the
internal read capacity command issued in the non-head version of the
driver. When the command completed with an error, the driver wasn't
unfreezing the device queue.

The version in head uses the CAM infrastructure for getting the read
capacity information, and therefore doesn't have the same issue.

8. Bump the version to 13.00.00.00-fbsd. (this is very close to LSI's
internal stable driver 13.00.00.00)

Submitted by: Kashyap Desai <Kashyap.Desai@lsi.com>
MFC after: 3 days


# d043c564 26-Jan-2012 Kenneth D. Merry <ken@FreeBSD.org>

Bring in the LSI-supported version of the mps(4) driver.

This involves significant changes to the mps(4) driver, but is not a
complete rewrite.

Some of the changes in this version of the driver:
- Integrated RAID (IR) support.
- Support for WarpDrive controllers.
- Support for SCSI protection information (EEDP).
- Support for TLR (Transport Level Retries), needed for tape drives.
- Improved error recovery code.
- ioctl interface compatible with LSI utilities.

mps.4: Update the mps(4) driver man page somewhat for the driver
changes. The list of supported hardware still needs to be
updated to reflect the full list of supported cards.

conf/files: Add the new driver files.

mps/mpi/*: Updated version of the MPI header files, with a BSD style
copyright.

mps/*: See above for a description of the new driver features.

modules/mps/Makefile:
Add the new mps(4) driver files.

Submitted by: Kashyap Desai <Kashyap.Desai@lsi.com>
Reviewed by: ken
MFC after: 1 week