#
8c4ee0b2 |
|
23-Nov-2023 |
Alexander Motin <mav@FreeBSD.org> |
Use xpt_path_sbuf() in few drivers xpt_path_string() is now a wrapper around xpt_path_sbuf(). Using it to than concatenate result to another sbuf makes no sense. Just call xpt_path_sbuf() directly. MFC after: 1 month
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
59dc489a |
|
20-Jul-2023 |
Warner Losh <imp@FreeBSD.org> |
mpr: Fix minor 'typos' comment moving -> removing (we're removing the device) CAM_REQ_CMO_ERROR -> CAM_REQ_ERR (the former isn't a thing) Sponsored by: Netflix
|
#
ca420b4e |
|
28-Apr-2022 |
Warner Losh <imp@FreeBSD.org> |
mpr/mps: when sending reset on removal, include target in message It's possible for muliple drives to be departing at the same time (if the common power rail the share goes dark, for example). To understand what's going on better, include target and handle in the messages announcing the reset to allow matching with other corresponding events. MFC After: 3 days Sponsored by: Netflix Reviewed by: mav Differential Revision: https://reviews.freebsd.org/D35092
|
#
e35816c1 |
|
25-Jan-2022 |
Warner Losh <imp@FreeBSD.org> |
mpr/mps: Fix a race in diagnostic reset There's a small race in freezing the simq when performing a diagnostic reset. During this time, a transaction can slip through and encounter the target id of 0. If we're still in diagnostic reset when we detect this, return a CAM_DEVICE_NOT_THERE status. Instead, freeze the queue and return a requeue status, similar to what we do when we're resetting a target and a transaction get here. The race is unavoidable due to separate locks for queue and SIM, but easy enough to detect and make harmless. Sponsored by: Netflix Reviewed by: scottl, mav Differential Revision: https://reviews.freebsd.org/D34017
|
#
802f8d4a |
|
24-Jan-2022 |
Warner Losh <imp@FreeBSD.org> |
mpr/mps: Remove write-only flag and callout The discovery callout is initialized and cancelled only, making it write-only. Remove a state flag associated with it being pending as well as two defines that aren't used that are associated with it. Remove MP?SAS_SHUTDOWN flag, which is unused. Sponsored by: Netflix Reviewed by: ken, scottl, mav Differential Revision: https://reviews.freebsd.org/D33925
|
#
61f17c5f |
|
24-Nov-2021 |
Scott Long <scottl@FreeBSD.org> |
Fix "set but not used" warnings in the mpr driver. This fixes a minor bug in error handling.
|
#
a8837c77 |
|
21-Nov-2021 |
Warner Losh <imp@FreeBSD.org> |
mpr: fix freeze / release mismatch in timeout code So, if we're processing a timeout, and we've sent an ABORT to the firmware for that timeout, but not yet received the response from the firmware, AND we get another timeout, we queue the timeout and freeze the queue. However, when we've finally processed them all, we only release the queue once. This causes all I/O to halt as the devq remains frozen forever. Instead, only freeze the queue when we start the process (eg set INRESET on the target). This will allow the release when all the timed out I/Os have finished ABORTing. Sponsored by: Netflix Reviewed by: mav Differential Revision: https://reviews.freebsd.org/D33054
|
#
2bbaed4d |
|
15-Nov-2021 |
Warner Losh <imp@FreeBSD.org> |
mpr: Minor formatting changes to match mps. Minor reformatting nits to make mprsas_scsiio_timeout match mpssas_scsiio_timeout more closely. The differences aren't necessary and are distracting when comparing the routines. No functional changes. Sponsored by: Netflix
|
#
02d81940 |
|
14-Sep-2021 |
Alexander Motin <mav@FreeBSD.org> |
mps/mpr(4): Move xpt_register_async() out of lock. It fixes lock ordere reversal between SIM and device locks. Also remove registration for AC_FOUND_DEVICE, unused for a while now. MFC after: 1 month
|
#
9781c28c |
|
20-Aug-2021 |
Alexander Motin <mav@FreeBSD.org> |
mpr(4): Fix unmatched devq release. Before this change devq was frozen only if some command was sent to the target after reset started, but release was called always. This change freezes the devq immediately, leaving mprsas_action_scsiio() check only to cover race condition due to different lock devq use. This should also avoid unnecessary requeue of the commands, creating additional log noise and confusing some broken apps. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
e3c5965c |
|
20-Aug-2021 |
Alexander Motin <mav@FreeBSD.org> |
mpr(4): Handle mprsas_alloc_tm() errors on device removal. SAS9305-16e with firmware 16.00.01.00 report HighPriorityCredit of only 8, while for comparison some other combinations I have report 100 or even 128. In case of large JBOD detach requirement to send target reset command to each target same time overflows the limit, and without adequate handling makes devices stuck in half-detached state, preventing later re-attach. To handle that in case of allocation error mark the target with new MPRSAS_TARGET_TOREMOVE flag, and retry the removal attempt next time something else free high priority command. With this patch I can successfully detach/attach 102 disk JBOD from/to the SAS9305-16e. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
175ad3d0 |
|
03-Jun-2021 |
Kenneth D. Merry <ken@FreeBSD.org> |
Fix mpr(4) and mps(4) state transitions and a use-after-free panic. When the mpr(4) and mps(4) drivers probe a SATA device, they issue an ATA Identify command (via mp{s,r}sas_get_sata_identify()) before the target is fully setup in the driver. The drivers wait for completion of the identify command, and have a 5 second timeout. If the timeout fires, the command is marked with the SATA_ID_TIMEOUT flag so it can be freed later. That is where the use-after-free problem comes in. Once the ATA Identify times out, the driver sends a target reset, and then frees any identify commands that have timed out. But, once the target reset completes, commands that were queued to the drive are returned to the driver by the controller. At that point, the driver (in mp{s,r}_intr_locked()) looks up the command descriptor for that particular SMID, marks it CM_STATE_BUSY and sends it on for completion handling. The problem at this stage is that the command has already been freed, and put on the free queue, so its state is CM_STATE_FREE. If INVARIANTS are turned on, we get a panic as soon as this command is allocated, because its state is no longer CM_STATE_FREE, but rather CM_STATE_BUSY. So, the solution is to not free ATA Identify commands that get stuck until they actually return from the controller. Hopefully this works correctly on older firmware versions. If not, it could result in commands hanging around indefinitely. But, the alternative is a use-after-free panic or assertion (in the INVARIANTS case). This also tightens up the state transitions between CM_STATE_FREE, CM_STATE_BUSY and CM_STATE_INQUEUE, so that the state transitions happen once, and we have assertions to make sure that commands are in the correct state before transitioning to the next state. Also, for each state assertion, we print out the current state of the command if it is incorrect. mp{s,r}.c: Add a new sysctl variable, dump_reqs_alltypes, that controls the behavior of the dump_reqs sysctl. If dump_reqs_alltypes is non-zero, it will dump all commands, not just the commands that are in the CM_STATE_INQUEUE state. (You can see the commands that are in the queue by using mp{s,r}util debug dumpreqs.) Make sure that the INQUEUE -> BUSY state transition happens in one place, the mp{s,r}_complete_command routine. mp{s,r}_sas.c: Make sure we print the current command type in command state assertions. mp{s,r}_sas_lsi.c: Add a new completion handler, mp{s,r}sas_ata_id_complete. This completion handler will free data allocated for an ATA Identify command and free the command structure. In mp{s,r}_ata_id_timeout, do not set the command state to CM_STATE_BUSY. The command is still in queue in the controller. Since we were blocking waiting for this command to complete, there was no completion handler previously. Set the completion handler, so that whenever the command does come back, it will get freed properly. Do not free ATA Identify commands that have timed out in mp{s,r}sas_add_device(). Wait for them to actually come back from the controller. mp{s,r}var.h: Add a dump_reqs_alltypes variable for the new dump_reqs_alltypes sysctl. Make sure we print the current state for state transition asserts. This was tested in the Spectra Logic test bed (as described in the review), as well Netflix's Open Connect fleet (where panics dropped from a dozen or two a month to zero). Reviewed by: imp@ (who is handling the commit with ken's OK) Sponsored by: Spectra Logic Differential Revision: https://reviews.freebsd.org/D25476
|
#
7608b98c |
|
21-May-2021 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
mpr, mps: clear CCBs allocated on the stack Reviewed By: imp Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D30301
|
#
71900a79 |
|
02-Mar-2021 |
Alfredo Dal'Ava Junior <alfredo@FreeBSD.org> |
mpr: big-endian support This fixes mpr driver on big-endian devices. Tested on powerpc64 and powerpc64le targets using a SAS9300-8i card (LSISAS3008 pci vendor=0x1000 device=0x0097) Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br> Reviewed by: luporl, alfredo, Sreekanth Reddy <sreekanth.reddy@broadcom.com> (by email) Sponsored by: Eldorado Research Institute (eldorado.org.br) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25785
|
#
88364968 |
|
25-Oct-2020 |
Alexander Motin <mav@FreeBSD.org> |
Introduce support of SCSI Command Priority. SAM-3 specification introduced concept of Task Priority, that was renamed to Command Priority in SAM-4, and supported by all modern SCSI transports. It provides 15 levels of relative priorities: 1 - highest, 15 - lowest and 0 - default. SAT specification for SATA devices translates priorities 1-3 into NCQ high priority. This change adds new "priority" field into empty spots of struct ccb_scsiio and struct ccb_accept_tio of CAM and struct ctl_scsiio of CTL. Respective support is added into iscsi(4), isp(4), mpr(4), mps(4) and ocs_fc(4) drivers for both initiator and where applicable target roles. Minimal support was added to CTL to receive the priority value from different frontends, pass it between HA controllers and report in few places. This patch does not add consumers of this functionality, so nothing should really change yet, since the field is still set to 0 (default) on initiator and not actively used on target. Those are to be implemented separately. I've confirmed priority working on WD Red SATA disks connected via mpr(4) and properly transferred to CTL target via iscsi(4), isp(4) and ocs_fc(4). While there, added missing tag_action support to ocs_fc(4) initiator role. MFC after: 1 month Relnotes: yes Sponsored by: iXsystems, Inc.
|
#
577858c8 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
mpr: clean up empty lines in .c and .h files
|
#
f0f20143 |
|
04-Aug-2020 |
Alexander Motin <mav@FreeBSD.org> |
Remove extra memset() left after r342388. This memset() wiped MPI2_FUNCTION_SCSI_TASK_MGMT set by mprsas_alloc_tm(), that broke target reset on device removal, making later re-insertion into the same slot impossible, since firmware was still waiting for the driver to finish with the removed device. MFC after: 1 week Sponsored by: iXsystems, Inc.
|
#
d2a5f081 |
|
27-Jul-2020 |
Mark Johnston <markj@FreeBSD.org> |
mpr(4), mps(4): Stop checking for failures from malloc(M_WAITOK). PR: 240545 Submitted by: Andrew Reiter <arr@watson.org> Reviewed by: imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25766
|
#
a2386b6f |
|
13-Mar-2020 |
Alexander Motin <mav@FreeBSD.org> |
Increase buffer in mprsas_log_command() from 192 to 224 bytes. 192 bytes are not enough to print long commands, such as ATA COMMAND PASS THROUGH(16), that makes debug output difficult to read. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
0d87f3c7 |
|
26-Feb-2020 |
Warner Losh <imp@FreeBSD.org> |
Remove support for all pre FreeBSD 11.0 versions from mpr and mps. Remove a number of workarounds for older versions of FreeBSD. FreeBSD stable/10 was branched over 6 years ago. All of these changes date from about that time or earlier. These workarounds are extensive and get in the way of understanding the current flow in the driver.
|
#
4c1cdd4a |
|
24-Feb-2020 |
Warner Losh <imp@FreeBSD.org> |
Before issing the REMOVE_DEVICE command to the firmware, make sure that all commands have completed. It's not OK to force complete any pending commands before we send the REMOVE_DEVICE. Instead, make sure that all pending commands are complete before sending that. By trying to second guess the firmware here, we run the risk of completing commands twice, which leads to corruption. This removes the forced completion of commands introduced in r218811. So it's a partial backout of that commit, but replaces it with a more rebust mechanism. Either these commands will complete due to the TARGET RESET, or they will timeout and be aborted, but they will all complete. Add assert that all commands are complete to REMOVE_DEVICE completion routine. We attempt to assure this programatically, so we shouldn't have any commands in the queue because we've waited for them all. Any commands that make it into our action routine after we mark the target in removal will complete immediately with an error. When we're removing a target that's not a volume, advertise up the stack that it's actually gone, as opposed to having a transient selection error we should retry. Do this both in the action routine, and when we get a notification of an aborted command. We don't do this for volumes because the driver tries hard not to advertise to the OS a volume has disappeared. Apply these changes to both mpr and mps since they are based on quite similar designs. Discussed with: scottl@ Differential Revision: https://reviews.freebsd.org/D23768
|
#
57aa9163 |
|
24-Nov-2019 |
Warner Losh <imp@FreeBSD.org> |
Fix leak in state machine for commands. When we get a device departed message from the firmware, we send a TARGET_REST to the device to let the firmware know we're done and as part of the recovery process. This will abort all the commands. While the documentation says the IOC is responsible for writing the completion message for all the commands pending with an aborted status, we sometimes have queued commands for the target that haven't been completed so are in the INQUEUE state. So, when we later complete the pending CCB as aborted, these commands are freed and we hit the "state not busy" panic. Elsewhere where we dequeue commands, we move the state to BUSY from INQUEUE. Do that here as well. In talking to Ken, Scott and Justin, they recommended a series of tests to see if this is 100% safe. Those tests are ongoing, but preliminary tests suggest this is safe as we see no duplicate completions when we hit this case at work. We have a machine that has a dodgy powersupply which usually doesn't apply power to a few drives, but sometimes does when the machine is under heavy load so we get a rash of the connect / disconnect messages over half an hour. Without this change, we'd see state not busy panic. With this change, the drives just annoyingly come and go without affecting the rest of the machine, but without a complete error injection test suite, it's hard to know if all edge cases are now covered or not. Discussed with: scottl, ken, gibbs
|
#
8fe7bf06 |
|
08-Jul-2019 |
Warner Losh <imp@FreeBSD.org> |
Fix bugs in recovery path and improve cm tracking Eliminate the TIMEDOUT state. This state really conveyed two different concepts: I timed out during recovery (and my command got put on the recovery queue), and I timed out diring discovery (which doesn't). Separate those two concepts into two flags. Use the TIMEDOUT flag to fail requests as timed out. Use the on queue flag to remove them from the queue. In mps_intr_locked for MPI2_RPY_DESCRIPT_FLAGS_ADDRESS_REPLY message type, when completing commands, ignore the ones that are not in state INQUEUE. They were already completed as part of the recovery process. When we complete them twice, we wind up with entries on the free queue that are marked as busy, trigging asserts. Reviewed by: scottl (earlier version, just for mpr) Differential Revision: https://reviews.freebsd.org/D20785
|
#
46b23587 |
|
26-Dec-2018 |
Kashyap D Desai <kadesai@FreeBSD.org> |
Update copyright information Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc
|
#
89d1c21f |
|
26-Dec-2018 |
Kashyap D Desai <kadesai@FreeBSD.org> |
Added support for NVMe Task Management Following list of changes done in the driver as a part of TM handling on the NVMe drives. Below changes are only applicable on NVMe drives and only when custom NVMe TM handling bit is set to zero by IOC. 1. Issue LUN reset & Target reset TMs with Target reset method field set to Protocol Level reset (0x3), 2. For LUN & target reset TMs use the timeout value as ControllerResetTO value provided by firmware using PCie Device Page 0, 3. If LUN reset fails to terminates the IO then directly escalate to host reset instead of going for target reset TM, 4. For Abort TM use the timeout value as NVMeAbortTO value given by the IOC using Manufacturing Page 11, 5. Log message "PCie Host Reset failed" message up on receiving P Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc
|
#
46b9415f |
|
23-Dec-2018 |
Scott Long <scottl@FreeBSD.org> |
Further refactoring for task management commands. Also fix a related typo from the previous commit.
|
#
3921a9f7 |
|
23-Dec-2018 |
Scott Long <scottl@FreeBSD.org> |
Commands for user-initated device resets should come from the high-priority allocator. Prior to this change, they would leak from the normal allocator.
|
#
b7f1ee79 |
|
23-Dec-2018 |
Scott Long <scottl@FreeBSD.org> |
First step in refactoring and fixing the error recovery and task management code in the mpr and mps drivers. Eliminate duplicated code and fix some comments.
|
#
8277ce2b |
|
21-Dec-2018 |
Conrad Meyer <cem@FreeBSD.org> |
mps(4), mpr(4): Fix lifetime of command buffer for mp?sas_get_sata_identify In the event that the ID command timed out, mps(4)/mpr(4) did not free the command until it could be cancelled. However, it freed the associated buffer (cm_data). Fix the lifetime issue by freeing the associated buffer only after Abort Task or controller reset. Reviewed by: scottl Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18612
|
#
9544e6dc |
|
21-Aug-2018 |
Chuck Tuffli <chuck@FreeBSD.org> |
Make NVMe compatible with the original API The original NVMe API used bit-fields to represent fields in data structures defined by the specification (e.g. the op-code in the command data structure). The implementation targeted x86_64 processors and defined the bit fields for little endian dwords (i.e. 32 bits). This approach does not work as-is for big endian architectures and was changed to use a combination of bit shifts and masks to support PowerPC. Unfortunately, this changed the NVMe API and forces #ifdef's based on the OS revision level in user space code. This change reverts to something that looks like the original API, but it uses bytes instead of bit-fields inside the packed command structure. As a bonus, this works as-is for both big and little endian CPU architectures. Bump __FreeBSD_version to 1200081 due to API change Reviewed by: imp, kbowling, smh, mav Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D16404
|
#
8881681b |
|
23-Mar-2018 |
Kenneth D. Merry <ken@FreeBSD.org> |
Disable T10 Protection Information / EEDP handling for type 2 protection. The mps(4) and mpr(4) drivers and hardware handle T10 Protection Information, which is a system of checksums and guard blocks to protect data while it is being transferred and while it is on disk. It is also known as T10 DIF. For more details, see section 4.22 of the SBC-4 spec. Supporting Type 2 protection requires using 32 byte CDBs, and filling in the fields in those CDBs. We don't yet support that in the da(4) driver. Type 1 and Type 3 protection don't require that, and can be handled by the mps(4)/mpr(4) driver's code and firmware without any additional input from the da(4) driver. If a drive has Type 2 protection enabled (you frequently see this with SAS drives shipped from Dell), don't set the various EEDP fields in the mps(4)/mpr(4) driver command fields. Otherwise, you wind up with errors like this that would otherwise make no sense: (da9:mpr0:0:18:0): READ(10). CDB: 28 00 00 00 00 00 00 02 00 00 (da9:mpr0:0:18:0): CAM status: SCSI Status Error (da9:mpr0:0:18:0): SCSI status: Check Condition (da9:mpr0:0:18:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code) (da9:mpr0:0:18:0): (da9:mpr0:0:18:0): Field Replaceable Unit: 0 (da9:mpr0:0:18:0): Command Specific Info: 0 (da9:mpr0:0:18:0): (da9:mpr0:0:18:0): Descriptor 0x80: f8 21 (da9:mpr0:0:18:0): Descriptor 0x81: 00 00 00 00 00 00 (da9:mpr0:0:18:0): Error 22, Unretryable error In other words, what kind of strange SAS hard drive doesn't support a standard 10 byte SCSI READ command? In this case, one that has Type 2 protection enabled. We can revisit this when we put Type 2 protection support in the da(4) driver, but for now this will help people who put Type 2 formatted drives in a system and wonder what in the world is going on. MFC after: 3 days Sponsored by: Spectra Logic
|
#
5f5baf0e |
|
19-Mar-2018 |
Alexander Motin <mav@FreeBSD.org> |
Update mpr(4) driver from v15 to v18 from Broadcom site. Version 16 is just a number bump, since we already had those changes. Version 17 introduces new AdapterType value, that allows new user-space tools from Broadcom to differentiate adapter generations 3 and 3.5. Version 18 updates headers and adds SAS_DEVICE_DISCOVERY_ERROR reporting. MFC after: 2 weeks
|
#
0d787e9b |
|
22-Feb-2018 |
Wojciech Macek <wma@FreeBSD.org> |
NVMe: Add big-endian support Remove bitfields from defined structures as they are not portable. Instead use shift and mask macros in the driver and nvmecontrol application. NVMe is now working on powerpc64 host. Submitted by: Michal Stanek <mst@semihalf.com> Obtained from: Semihalf Reviewed by: imp, wma Sponsored by: IBM, QCM Technologies Differential revision: https://reviews.freebsd.org/D13916
|
#
f0779b04 |
|
18-Feb-2018 |
Scott Long <scottl@FreeBSD.org> |
Improve command lifecycle debugging and detection of problems. Sponsored by: Netflix
|
#
4f5d6573 |
|
09-Feb-2018 |
Alexander Motin <mav@FreeBSD.org> |
Teach mps(4) and mpr(4) drivers to autotune chain frames. This is a first part of the change. It makes the drivers to calculate the required number of chain frames to satisfy worst case scenarios, but it does not change existing overly strict limits on them. The next step will be to rewrite the allocator to not require megabytes of physically contiguous address space, that may be problematic if done after boot, after doing which the limits can be removed. Until that this code can just correct user set limits, if they are set too high. Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D14261
|
#
62a09ee9 |
|
06-Feb-2018 |
Alexander Motin <mav@FreeBSD.org> |
Fix queue length reporting in mps(4) and mpr(4). Both drivers were found to report CAM bigger queue depth then they really can handle. It made them later under high load with many disks return some of submitted requests back with CAM_REQUEUE_REQ status for later resubmission. Reviewed by: scottl MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D14215
|
#
3c5ac992 |
|
10-Sep-2017 |
Scott Long <scottl@FreeBSD.org> |
Add infrastructure for allocating multiple MSI-X interrupts. Also add more fine-tuned controls for allocating requests and replies. Sponsored by: Netflix
|
#
a4bb51a4 |
|
10-Sep-2017 |
Scott Long <scottl@FreeBSD.org> |
Fix intrhook release in MPR and MPS for EARLY_AP_STARTUP. Reported by: Limelight Sponsored by: Netflix
|
#
2bf620cb |
|
09-Sep-2017 |
Scott Long <scottl@FreeBSD.org> |
Convert some in-line printing of diagnostic into tables. Sponsored by: Netflix
|
#
a7d065b3 |
|
09-Sep-2017 |
Scott Long <scottl@FreeBSD.org> |
Remove the unnecessary use of a temporary string buffer. Sponsored by: Netflix
|
#
6eea4f46 |
|
06-Sep-2017 |
Scott Long <scottl@FreeBSD.org> |
Checkpoint the next phase in debug message cleanup, this time focusing on error recovery messages. Sponsored by: Netflix
|
#
757ff642 |
|
27-Aug-2017 |
Scott Long <scottl@FreeBSD.org> |
Start overhauling debug printing in the MPS and MPR drivers. The focus of this commit it to make initiazation less chatty in the normal case, and more useful and informative when real debugging is turned on. Reviewed by: ken (earlier version) Sponsored by: Netflix
|
#
6d4ffcb4 |
|
10-Aug-2017 |
Kenneth D. Merry <ken@FreeBSD.org> |
Changes to make mps(4) and mpr(4) handle reinit with reallocation. When the mps(4) and mpr(4) drivers need to reinitialize the firmware, they sometimes need to reallocate all of the memory allocated by the driver. The reallocation happens whenever the IOC Facts change. That should only happen after a firmware upgrade. If the reinitialization happens as a result of a timed out command sent to the card, the command that timed out and triggered the reinit may have been freed if iocfacts_allocate() reallocated all memory. If the caller attempts to access the command after that, the kernel will panic because the caller will be dereferencing freed memory. The solution is to set a flag in the softc when we reallocate, and avoid dereferencing the command strucure if we've reallocated. The changes are largely the same in both drivers, since mpr(4) is a derivative of mps(4). o In iocfacts_allocate(), if the IOC Facts have changed and we need to reallocate, set the REALLOCATED flag in the softc. o Change wait_command() to take a struct mps_command ** instead of a struct mps_command *. This allows us to NULL out the caller's command pointer if we have to reinit the controller and the data structures get reallocated. (The REALLOCATED flag will be set in the softc if that has happened.) o In every place that calls wait_command(), make sure we handle the case where the command is NULL after the call. o The mpr(4) driver has mpr_request_polled() which can also reinitialize the card. Also check for reallocation there. Reviewed by: scottl, slm MFC after: 1 week Sponsored by: Spectra Logic
|
#
6c85e33e |
|
25-Jul-2017 |
Scott Long <scottl@FreeBSD.org> |
Quiet a message that sounds far more dire than it really is.
|
#
327f2e6c |
|
25-May-2017 |
Stephen McConnell <slm@FreeBSD.org> |
Fix several problems with mapping code. Reviewed by: ken, scottl, asomers, ambrisko, mav Approved by: ken, mav MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10861
|
#
18982e8f |
|
22-May-2017 |
Stephen McConnell <slm@FreeBSD.org> |
Fix powerpc compiler error. Approved by: ken
|
#
67feec50 |
|
17-May-2017 |
Stephen McConnell <slm@FreeBSD.org> |
Add tri-mode support (SAS/SATA/PCIe). This includes NVMe device support and adds support for the following adapters: SAS 3408 SAS 3416 SAS 3508 SAS 3516 SAS 3616 SAS 3708 SAS 3716 Reviewed by: ken, scottl, asomers, mav Approved by: ken, scottl, mav MFC after: 2 weeks Relnotes: yes Differential Revision: https://reviews.freebsd.org/D10095
|
#
855fe445 |
|
11-May-2017 |
Scott Long <scottl@FreeBSD.org> |
Improve error messages during command timeout for the mpr and mps drivers. Sponsored by: Netflix
|
#
c11c484f |
|
19-Jan-2017 |
Scott Long <scottl@FreeBSD.org> |
Rework the debug print API. Event printing no longer gets special handling. All of the printing from the tables file now has wrappers so that the handling is cleaner and it's possible to print something out (say, during development) without having to fight the global debug flags. This re-org will also make it easier to have the tables be compiled out at build time if desired. Other than fixing some minor bugs, there are no user-visible changes from this change Sponsored by: Netflix, Inc. Differential Revision: D9238
|
#
4195c7de |
|
04-Jan-2017 |
Alan Somers <asomers@FreeBSD.org> |
Always null-terminate ccb_pathinq.(sim_vid|hba_vid|dev_name) The sim_vid, hba_vid, and dev_name fields of struct ccb_pathinq are fixed-length strings. AFAICT the only place they're read is in sbin/camcontrol/camcontrol.c, which assumes they'll be null-terminated. However, the kernel doesn't null-terminate them. A bunch of copy-pasted code uses strncpy to write them, and doesn't guarantee null-termination. For at least 4 drivers (mpr, mps, ciss, and hyperv), the hba_vid field actually overflows. You can see the result by doing "camcontrol negotiate da0 -v". This change null-terminates those fields everywhere they're set in the kernel. It also shortens a few strings to ensure they'll fit within the 16-character field. PR: 215474 Reported by: Coverity CID: 1009997 1010000 1010001 1010002 1010003 1010004 1010005 CID: 1331519 1010006 1215097 1010007 1288967 1010008 1306000 CID: 1211924 1010009 1010010 1010011 1010012 1010013 1010014 CID: 1147190 1010017 1010016 1010018 1216435 1010020 1010021 CID: 1010022 1009666 1018185 1010023 1010025 1010026 1010027 CID: 1010028 1010029 1010030 1010031 1010033 1018186 1018187 CID: 1010035 1010036 1010042 1010041 1010040 1010039 Reviewed by: imp, sephe, slm MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D9037 Differential Revision: https://reviews.freebsd.org/D9038
|
#
fa699bb2 |
|
03-Jan-2017 |
Alan Somers <asomers@FreeBSD.org> |
misc minor fixes in mpr(4) sys/dev/mpr/mpr_sas.c * Fix a potential null pointer dereference (CID 1305731) * Check for overrun of the ccb_scsiio.cdb_io.cdb_bytes buffer (CID 1211934) sys/dev/mpr/mpr_sas_lsi.c * Nullify a dangling pointer in mprsas_get_sata_identify * Fix a memory leak in mprsas_SSU_to_SATA_devices (CID 1211935) Reported by: Coverity (partially) CID: 1305731 1211934 1211935 Reviewed by: slm MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D8880
|
#
694cb8b8 |
|
04-Nov-2016 |
Scott Long <scottl@FreeBSD.org> |
Record the LogInfo field when reporting the IOCStatus. Helps in debugging errors. Submitted by: slm Obtained from: Netflix MFC after: 3 days
|
#
32b0a21e |
|
12-Jul-2016 |
Stephen McConnell <slm@FreeBSD.org> |
Use real values to calculate Max I/O size instead of guessing. Reviewed by: ken, scottl Approved by: ken, scottl, ambrisko (mentors) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7043
|
#
58581c13 |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
Disks can go missing until a reboot is done in some cases. This is due to the DevHandle not being released, which causes the Firmware to not allow that disk to be re-added. Reviewed by: ken, scottl, ambrisko, asomers Approved by: ken, scottl, ambrisko MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D6102
|
#
407073a0 |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
Use callout_reset_sbt() instead of callout_reset() if FreeBSD ver is >= 1000029 Reviewed by: ken, scottl, ambrisko, asomers Approved by: ken, scottl, ambrisko MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D6101
|
#
dd9f4a95 |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
No need to set the MPRSAS_SHUTDOWN flag because it's never used. Approved by: ken, scottl, ambrisko MFC after: 1 week
|
#
88619392 |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
Fix possible use of invalid pointer. It was possible to use an invalid pointer to get the target ID value. To fix this, initialize a local Target ID variable to an invalid value and change that variable to a valid value only if the pointer to the Target ID is not NULL. Reviewed by: ken, scottl, ambrisko, asomers Approved by: ken, scottl, ambrisko MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D6100
|
#
d3f6eabf |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
No log bit in IOCStatus and endian-safe changes. Use MPI2_IOCSTATUS_MASK when checking IOCStatus to mask off the log bit, and make a few more things endian-safe. Reviewed by: ken, scottl, ambrisko, asomers Approved by: ken, scottl, ambrisko MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D6097
|
#
2bbc5fcb |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
Add support for the Broadcom (Avago/LSI) 9305 16 and 24 port HBA's. Reviewed by: ken, scottl, ambrisko, asomers Approved by: ken, scottl, ambrisko MFC after: 1 week Relnotes: yes Differential Revision: https://reviews.freebsd.org/D6098
|
#
7a2a6a1a |
|
09-May-2016 |
Stephen McConnell <slm@FreeBSD.org> |
Several style changes and add copyrights for 2016. Reviewed by: ken, scottl, ambrisko, asomers Approved by: ken, scottl, ambrisko MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D6103
|
#
6adfa7ed |
|
05-May-2016 |
Alan Somers <asomers@FreeBSD.org> |
mpr(4) and mps(4) shouldn't indefinitely retry for "terminated ioc" errors Submitted by: ken Reviewed by: slm MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D6210
|
#
a2c14879 |
|
28-May-2015 |
Stephen McConnell <slm@FreeBSD.org> |
The wrong commit message was given with r283632. This is the correct message. - Updated all files with 2015 Avago copyright, and updated LSI's copyright dates. - Changed all of the PCI device strings from LSI to Avago Technologies (LSI). - Added a sysctl variable to control how StartStopUnit behavior works. User can select to spin down disks based on if disk is SSD or HDD. - Inquiry data is required to tell if a disk will support SSU at shutdown or not. Due to the addition of mpssas_async, which gets Advanced Info but not Inquiry data, the setting of supports_SSU was moved to the mpssas_scsiio_complete function, which snoops for any Inquiry commands. And, since disks are shutdown as a target and not a LUN, this process was simplified by basing it on targets and not LUNs. - Added a sysctl variable that sets the amount of time to retry after sending a failed SATA ID command. This helps with some bad disks and large disks that require a lot of time to spin up. Part of this change was to add a callout to handle timeouts with the SATA ID command. The callout function is called mpssas_ata_id_timeout(). (Fixes PR 191348) - Changed the way resets work by allowing I/O to continue to devices that are not currently under a reset condition. This uses devq's instead of simq's and makes use of the MPSSAS_TARGET_INRESET flag. This change also adds a function called mpssas_prepare_tm(). - Some changes were made to reduce code duplication when getting a SAS address for a SATA disk. - Fixed some formatting and whitespace. - Bump version of mps driver to 9.255.01.00-fbsd PR: 191348 Reviewed by: ken, scottl Approved by: ken, scottl MFC after: 1 week
|
#
c91d307f |
|
28-May-2015 |
Stephen McConnell <slm@FreeBSD.org> |
The wrong commit message was given with r283632. To get the correct commit message synced to the changes in r283632, those changes are now backed out. Another commit will be done that is exactly the same as r283632 except it will have to correct commit message. Approved by: ken, scottl, asomers, gibbs
|
#
2ca937d2 |
|
27-May-2015 |
Stephen McConnell <slm@FreeBSD.org> |
This setting of stop_at_shutdown should have been removed with r279253 Approved by: ken MFC after: 1 week
|
#
6c0541fa |
|
26-Feb-2015 |
Kenneth D. Merry <ken@FreeBSD.org> |
Add FreeBSD stable/10 version checks for the availability of the CDAI_FLAG_NONE advanced information CCB flag. Support for the flag was merged to stable/10 in r279329, and the __FreeBSD_version in stable/10 was bumped to 1001510. Check for that version in the mps(4) and mpr(4) drivers when determining whether to use the flag. Sponsored by: Spectra Logic MFC after: 3 days
|
#
e8577fb4 |
|
18-Feb-2015 |
Kenneth D. Merry <ken@FreeBSD.org> |
Make sure that the flags for the XPT_DEV_ADVINFO CCB are initialized properly. If there is garbage in the flags field, it can sometimes include a set CDAI_FLAG_STORE flag, which may cause either an error or perhaps result in overwriting the field that was intended to be read. sys/cam/cam_ccb.h: Add a new flag to the XPT_DEV_ADVINFO CCB, CDAI_FLAG_NONE, that callers can use to set the flags field when no store is desired. sys/cam/scsi/scsi_enc_ses.c: In ses_setphyspath_callback(), explicitly set the XPT_DEV_ADVINFO flags to CDAI_FLAG_NONE when fetching the physical path information. Instead of ORing in the CDAI_FLAG_STORE flag when storing the physical path, set the flags field to CDAI_FLAG_STORE. sys/cam/scsi/scsi_sa.c: Set the XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE when fetching extended inquiry information. sys/cam/scsi/scsi_da.c: When storing extended READ CAPACITY information, set the XPT_DEV_ADVINFO flags field to CDAI_FLAG_STORE instead of ORing it into a field that isn't initialized. sys/dev/mpr/mpr_sas.c, sys/dev/mps/mps_sas.c: When fetching extended READ CAPACITY information, set the XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE instead of setting it to 0. sbin/camcontrol/camcontrol.c: When fetching a device ID, set the XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE instead of 0. sys/sys/param.h: Bump __FreeBSD_version to 1100061 for the new XPT_DEV_ADVINFO CCB flag, CDAI_FLAG_NONE. Sponsored by: Spectra Logic MFC after: 1 week
|
#
85c9dd9d |
|
21-Nov-2014 |
Steven Hartland <smh@FreeBSD.org> |
Prevent overflow issues in timeout processing Previously, any timeout value for which (timeout * hz) will overflow the signed integer, will give weird results, since callout(9) routines will convert negative values of ticks to '1'. For unsigned integer overflow we will get sufficiently smaller timeout values than expected. Switch from callout_reset, which requires conversion to int based ticks to callout_reset_sbt to avoid this. Also correct isci to correctly resolve ccb timeout. This was based on the original work done by Eygene Ryabinkin <rea@freebsd.org> back in 5 Aug 2011 which used a macro to help avoid the overlow. Differential Revision: https://reviews.freebsd.org/D1157 Reviewed by: mav, davide MFC after: 1 month Sponsored by: Multiplay
|
#
82315915 |
|
08-Oct-2014 |
Alexander Motin <mav@FreeBSD.org> |
Properly report 12Gbps connection rate. Reviewed by: kadesai, slm MFC after: 1 week
|
#
c2a0f07a |
|
24-May-2014 |
Alexander Motin <mav@FreeBSD.org> |
Increase taskqueue thread priority from idle to PRIBIO. Idle priority is not even time-share, so if system is busy in any way, those events may never be executed. Since in some cases system waits for events processed by that thread, that may cause deadlocks.
|
#
a371d6f9 |
|
08-May-2014 |
Kenneth D. Merry <ken@FreeBSD.org> |
Add #ifdefs in the mpr(4) driver so that versions of stable/9 that have implemented the PIM_NOSCAN rescan functionality will have it enabled. This is a no-op for head. Reviewed by: slm Sponsored by: Spectra Logic Corporation MFC after: 3 days
|
#
d2b4e18b |
|
08-May-2014 |
Kenneth D. Merry <ken@FreeBSD.org> |
Fix TLR (Transport Layer Retry) support in the mps(4) and mpr(4) drivers. TLR is necessary for reliable communication with SAS tape drives. This was broken by change 246713 in the mps(4) driver. It changed the cm_data field for SCSI I/O requests to point to the CCB instead of the data buffer. So, instead, look at the CCB's data pointer to determine whether or not we're talking to a tape drive. Also, take the residual into account to make sure that we don't go off the end of the request. MFC after: 3 days Sponsored by: Spectra Logic Corporation
|
#
07aa4de1 |
|
06-May-2014 |
Kenneth D. Merry <ken@FreeBSD.org> |
Fix a problem with async notifications in the mpr(4) driver. This problem only occurs on versions of FreeBSD prior to the recent CAM locking changes. (i.e. stable/9 and older versions of stable/10) This change should be a no-op for head and stable/10. If a path isn't specified, xpt_register_async() will create a fully wildcarded path and acquire a lock (the XPT lock in older versions, and via xpt_path_lock() in newer versions) to call xpt_action() for the XPT_SASYNC_CB CCB. It will then drop the lock and if the requested event includes AC_FOUND_DEVICE or AC_PATH_REGISTERED, it will get the caller up to date with any device arrivals or path registrations. The issue is that before the locking changes, each SIM lock would get acquired in turn during the EDT tree traversal process. If a path is specified for xpt_register_async(), it won't acquire and drop its own lock, but instead expects the caller to hold its own SIM lock. That works for the first part of xpt_register_async(), but causes a recursive lock acquisition once the EDT traversal happens and it comes to the SIM in question. And it isn't possible to call xpt_action() without holding a SIM lock. The locking changes fix this by using the XPT topology lock for EDT traversal, so it is no longer an issue to hold the SIM lock while calling xpt_register_async(). The solution for FreeBSD versions before the locking changes is to request notification of all device arrivals (so we pass a NULL path into xpt_register_async()) and then filter out the arrivals that are not ours. MFC After: 3 days Sponsored by: Spectra Logic Corporation
|
#
c503306d |
|
05-May-2014 |
Kenneth D. Merry <ken@FreeBSD.org> |
Adjust #if statements inside mprsas_send_smpcmd() to more accurately reflect when unmapped I/O support was added. For FreeBSD 10, it arrived just prior to __FreeBSD_version 1000028. For FreeBSD 9, it arrived just prior to __FreeBSD_version 902001. Also, fix compiler warnings in mprsas_send_smpcmd() that happen in the i386 PAE build for non-unmapped I/O builds. These were fixed in mps(4) in revision 241145, but didn't make it into the mpr(4) driver. This change should only affect FreeBSD versions outside the above revisions, and thus doesn't affect head. MFC after: 3 days Sponsored by: Spectra Logic Corporation
|
#
991554f2 |
|
02-May-2014 |
Kenneth D. Merry <ken@FreeBSD.org> |
Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers. This is derived from the mps(4) driver, but it supports only the 12Gb IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108. Some notes about this driver: o The 12Gb hardware can do "FastPath" I/O, and that capability is included in this driver. o WarpDrive functionality has been removed, since it isn't supported in the 12Gb driver interface. o The Scatter/Gather list handling code is significantly different between the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather lists. Thanks to LSI for developing and testing this driver for FreeBSD. share/man/man4/mpr.4: mpr(4) man page. sys/dev/mpr/*: mpr(4) driver files. sys/modules/Makefile, sys/modules/mpr/Makefile: Add a module Makefile for the mpr(4) driver. sys/conf/files: Add the mpr(4) driver. sys/amd64/conf/GENERIC, sys/i386/conf/GENERIC, sys/mips/conf/OCTEON1, sys/sparc64/conf/GENERIC: Add the mpr(4) driver to all config files that currently have the mps(4) driver. sys/ia64/conf/GENERIC: Add the mps(4) and mpr(4) drivers to the ia64 GENERIC config file. sys/i386/conf/XEN: Exclude the mpr module from building here. Submitted by: Steve McConnell <Stephen.McConnell@lsi.com> MFC after: 3 days Tested by: Chris Reeves <chrisr@spectralogic.com> Sponsored by: LSI, Spectra Logic Relnotes: LSI 12Gb SAS driver mpr(4) added
|