#
0098c55e |
|
01-Apr-2024 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Modify the deadline for ata_wait_after_reset() We found that the second parameter of function ata_wait_after_reset() is incorrectly used. We call smp_ata_check_ready_type() to poll the device type until the 30s timeout, so the correct deadline should be (jiffies + 30000). Fixes: 3c2673a09cf1 ("scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset") Co-developed-by: xiabing <xiabing12@h-partners.com> Signed-off-by: xiabing <xiabing12@h-partners.com> Co-developed-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/20240402035513.2024241-3-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f9242f16 |
|
21-Jan-2024 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Remove hisi_hba->timer for v3 hw hisi_hba->timer is not used for v3 hw but there are two places that some operations related to hisi_hba->timer are called by v3 hw: - Deleting the timer in function hisi_sas_v3_hw() which is only for v3 hw; - Deleting the timer in function hisi_sas_controller_reset_prepare() which is common for v1/v2/v3 hw. We can remove the timer in the first case, but for the second scenario we need to remove it only for v3 hw, so check hw->sht which is NULL only for v3 hw before deleting hisi_hba->timer. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-5-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
69097a63 |
|
21-Jan-2024 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Check whether debugfs is enabled before removing or releasing it hisi_sas debugfs remove should be executed only when debugfs is enabled. Check whether debugfs is enabled and then remove it only if enabled. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3f030550 |
|
21-Jan-2024 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Remove redundant checks for automatic debugfs dump In commit 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger"), the memory allocation time of the DFX is changed from device initialization to dump occurs, so .debugfs_itct is not a valid address and do not need to check. The parameter hisi_sas_debugfs_enable is enough to check whether automatic debugfs dump is triggered, so remove redunant checks. Fixes: 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3c4f53b2 |
|
21-Jan-2024 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Fix a deadlock issue related to automatic dump If we issue a disabling PHY command, the device attached with it will go offline, if a 2 bit ECC error occurs at the same time, a hung task may be found: [ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds. [ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208 [ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas] [ 4613.691518] Call trace: [ 4613.694678] __switch_to+0xf8/0x17c [ 4613.698872] __schedule+0x660/0xee0 [ 4613.703063] schedule+0xac/0x240 [ 4613.706994] schedule_timeout+0x500/0x610 [ 4613.711705] __down+0x128/0x36c [ 4613.715548] down+0x240/0x2d0 [ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main] [ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas] [ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas] [ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main] [ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main] [ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas] [ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas] [ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas] [ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas] [ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas] [ 4613.784716] process_one_work+0x248/0x950 [ 4613.789424] worker_thread+0x318/0x934 [ 4613.793878] kthread+0x190/0x200 [ 4613.797810] ret_from_fork+0x10/0x18 [ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds. [ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208 [ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main] [ 4613.841491] Call trace: [ 4613.844647] __switch_to+0xf8/0x17c [ 4613.848852] __schedule+0x660/0xee0 [ 4613.853052] schedule+0xac/0x240 [ 4613.856984] schedule_timeout+0x500/0x610 [ 4613.861695] __down+0x128/0x36c [ 4613.865542] down+0x240/0x2d0 [ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main] [ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main] [ 4613.883019] process_one_work+0x248/0x950 [ 4613.887732] worker_thread+0x318/0x934 [ 4613.892204] kthread+0x190/0x200 [ 4613.896118] ret_from_fork+0x10/0x18 [ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds. [ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208 [ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas] [ 4613.939549] Call trace: [ 4613.942702] __switch_to+0xf8/0x17c [ 4613.946892] __schedule+0x660/0xee0 [ 4613.951083] schedule+0xac/0x240 [ 4613.955015] schedule_timeout+0x500/0x610 [ 4613.959725] wait_for_common+0x200/0x610 [ 4613.964349] wait_for_completion+0x3c/0x5c [ 4613.969146] flush_workqueue+0x198/0x790 [ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas] [ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas] [ 4613.985708] process_one_work+0x248/0x950 [ 4613.990420] worker_thread+0x318/0x934 [ 4613.994868] kthread+0x190/0x200 [ 4613.998800] ret_from_fork+0x10/0x18 This is because when the device goes offline, we obtain the hisi_hba semaphore and send the ABORT_DEV command to the device. However, the internal abort timed out due to the 2 bit ECC error and triggers automatic dump. In addition, since the hisi_hba semaphore has been obtained, the dump cannot be executed and the controller cannot be reset. Therefore, the deadlocks occur on the following circular dependencies: hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ... -> hisi_sas_internal_abort_timeout() -> down(). The deadlock is triggered only when the timeout occurs during device goes offline. To fix this issue, use .rst_ha_timeout to distinguish the scenario where a device goes offline from other scenarios. Fixes: 2ff07b5c6fe9 ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8dd10296 |
|
13-Dec-2023 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Check before using pointer variables In commit 4b329abc9180 ("scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task()"), we move the variables slot to the function head. However, the variable slot may be NULL, we should check it in each branch. Fixes: 4b329abc9180 ("scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task()") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1702525516-51258-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d34ee535 |
|
13-Dec-2023 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Replace with standard error code return value In function hisi_sas_controller_prereset(), -ENOSYS (Function not implemented) should be returned if the driver does not support .soft_reset. Returns -EPERM (Operation not permitted) if HISI_SAS_RESETTING_BIT is already be set. In function _suspend_v3_hw(), returns -EPERM (Operation not permitted) if HISI_SAS_RESETTING_BIT is already be set. Fixes: 4522204ab218 ("scsi: hisi_sas: tidy host controller reset function a bit") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1702525516-51258-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2ff07b5c |
|
12-Sep-2023 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Directly call register snapshot instead of using workqueue Currently, register information dump is performed via workqueue, regardless of the trigger mode (automatic or manual). There is a delay in dumping register through workqueue, the exact register information at trigger time cannot be obtained. Call register snapshot directly instead of through a workqueue. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1694571327-78697-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a76f1b63 |
|
31-Jul-2023 |
Hannes Reinecke <hare@suse.de> |
ata,scsi: cleanup __ata_port_probe() Rename __ata_port_probe() to ata_port_probe() and drop the wrapper ata_sas_async_probe(). Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
#
1136a022 |
|
15-Aug-2023 |
John Garry <john.g.garry@oracle.com> |
scsi: libsas: Delete struct scsi_core Since commit 79855d178557 ("libsas: remove task_collector mode"), struct scsi_core only contains a reference to the shost. struct scsi_core is only used in sas_ha_struct.core, so delete scsi_core and replace with a reference to the shost there. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230815115156.343535-5-john.g.garry@oracle.com Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2f4e20cd |
|
15-Aug-2023 |
John Garry <john.g.garry@oracle.com> |
scsi: libsas: Delete enum sas_phy_type enum sas_phy_type is used for asd_sas_phy.type, which is only ever set, so delete this member and the enum. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230815115156.343535-4-john.g.garry@oracle.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c46a9170 |
|
15-Aug-2023 |
John Garry <john.g.garry@oracle.com> |
scsi: libsas: Delete enum sas_class enum sas_class prob would have been useful if function sas_show_class() was ever implemented, which it wasn't. enum sas_class is used as asd_sas_port.class and asd_sas_phy.class, which are only ever set, so delete these members and the enum. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230815115156.343535-3-john.g.garry@oracle.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b1bc4973 |
|
15-Aug-2023 |
John Garry <john.g.garry@oracle.com> |
scsi: libsas: Delete sas_ha_struct.lldd_module Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ha_struct.lldd_module has only ever been set, so remove it. Struct scsi_host_template already has a reference to the LLD driver module as to stop the driver being removed unexpectedly. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230815115156.343535-2-john.g.garry@oracle.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
29f45ed1 |
|
10-Jul-2023 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Delete unused lock in hisi_sas_port_notify_formed() Currently spinlock hisi_hba->lock is used by both interrupts and threads which requires the use of spin_lock_irqsave()/spin_unlock_irqrestore(). However, some places still use spin_lock()/spin_unlock(). Reviewing the code revealed that it is unnecessary to use hisi_hba->lock in the function hisi_sas_port_notify_formed() which is the only place that uses the spinlock in interrupt context. So delete unused lock in hisi_sas_port_notify_formed(). Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1689045300-44318-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8cd6d0a3 |
|
18-May-2023 |
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> |
scsi: hisi_sas: Convert to platform remove callback returning void The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. hisi_sas_remove() returned zero unconditionally so this was changed to return void. Then it has the right prototype to be used directly as remove callback for the two hisi_sas drivers. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230518202043.261739-1-u.kleine-koenig@pengutronix.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
89954f02 |
|
19-Mar-2023 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Ensure all enabled PHYs up during controller reset For the controller reset operation, hisi_sas_phy_enable() is executed for each enabled local PHY, and refresh the port id of each device based on the latest hisi_sas_phy->port_id after 1 second sleep, hisi_sas_phy->port_id is configured in the interrupt processing function phy_up_v3_hw(). However, in directly attached scenario, for some SATA disks the amount of time for phyup more than 1s sometimes. In this case, incorrect port id may be configured in hisi_sas_refresh_port_id(). As a result, all the internal IOs fail and disk lost, such as follows: [10717.666565] hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=10(sata) [10718.826813] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=63 task=00000000c1ab1c2b dev id=200 addr=5000000000000501 CQ hdr: 0x8000007 0xc8003f 0x0 0x0 Error info: 0x0 0x0 0x0 0x0 [10718.843428] sas: TMF task open reject failed 5000000000000501 [10718.849242] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=64 task=00000000c1ab1c2b dev id=200 addr=5000000000000501 CQ hdr: 0x8000007 0xc80040 0x0 0x0 Error info: 0x0 0x0 0x0 0x0 [10718.865856] sas: TMF task open reject failed 5000000000000501 [10718.871670] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=65 task=00000000c1ab1c2b dev id=200 addr=5000000000000501 CQ hdr: 0x8000007 0xc80041 0x0 0x0 Error info: 0x0 0x0 0x0 0x0 [10718.888284] sas: TMF task open reject failed 5000000000000501 [10718.894093] sas: executing TMF for 5000000000000501 failed after 3 attempts! [10718.901114] hisi_sas_v3_hw 0000:74:02.0: ata disk 5000000000000501 reset failed [10718.908410] hisi_sas_v3_hw 0000:74:02.0: controller reset complete ..... [10773.298633] ata216.00: revalidation failed (errno=-19) [10773.303753] ata216.00: disable device So the time of waitting for PHYs up is 1s which may be not enough. To solve the issue, running hisi_sas_phy_enable() in parallel through async operations and use wait_for_completion_timeout() to wait for PHYs come up instead of directly sleep for 1 second. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1679283265-115066-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
71fb36b5 |
|
19-Mar-2023 |
Xingui Yang <yangxingui@huawei.com> |
scsi: hisi_sas: Grab sas_dev lock when traversing the members of sas_dev.list When freeing slots in function slot_complete_v3_hw(), it is possible that sas_dev.list is being traversed elsewhere, and it may trigger a NULL pointer exception, such as follows: ==>cq thread ==>scsi_eh_6 ==>scsi_error_handler() ==>sas_eh_handle_sas_errors() ==>sas_scsi_find_task() ==>lldd_abort_task() ==>slot_complete_v3_hw() ==>hisi_sas_abort_task() ==>hisi_sas_slot_task_free() ==>dereg_device_v3_hw() ==>list_del_init() ==>list_for_each_entry_safe() [ 7165.434918] sas: Enter sas_scsi_recover_host busy: 32 failed: 32 [ 7165.434926] sas: trying to find task 0x00000000769b5ba5 [ 7165.434927] sas: sas_scsi_find_task: aborting task 0x00000000769b5ba5 [ 7165.434940] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(00000000769b5ba5) aborted [ 7165.434964] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(00000000c9f7aa07) ignored [ 7165.434965] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(00000000e2a1cf01) ignored [ 7165.434968] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 7165.434972] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(0000000022d52d93) ignored [ 7165.434975] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(0000000066a7516c) ignored [ 7165.434976] Mem abort info: [ 7165.434982] ESR = 0x96000004 [ 7165.434991] Exception class = DABT (current EL), IL = 32 bits [ 7165.434992] SET = 0, FnV = 0 [ 7165.434993] EA = 0, S1PTW = 0 [ 7165.434994] Data abort info: [ 7165.434994] ISV = 0, ISS = 0x00000004 [ 7165.434995] CM = 0, WnR = 0 [ 7165.434997] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000f29543f2 [ 7165.434998] [0000000000000000] pgd=0000000000000000 [ 7165.435003] Internal error: Oops: 96000004 [#1] SMP [ 7165.439863] Process scsi_eh_6 (pid: 4109, stack limit = 0x00000000c43818d5) [ 7165.468862] pstate: 00c00009 (nzcv daif +PAN +UAO) [ 7165.473637] pc : dereg_device_v3_hw+0x68/0xa8 [hisi_sas_v3_hw] [ 7165.479443] lr : dereg_device_v3_hw+0x2c/0xa8 [hisi_sas_v3_hw] [ 7165.485247] sp : ffff00001d623bc0 [ 7165.488546] x29: ffff00001d623bc0 x28: ffffa027d03b9508 [ 7165.493835] x27: ffff80278ed50af0 x26: ffffa027dd31e0a8 [ 7165.499123] x25: ffffa027d9b27f88 x24: ffffa027d9b209f8 [ 7165.504411] x23: ffffa027c45b0d60 x22: ffff80278ec07c00 [ 7165.509700] x21: 0000000000000008 x20: ffffa027d9b209f8 [ 7165.514988] x19: ffffa027d9b27f88 x18: ffffffffffffffff [ 7165.520276] x17: 0000000000000000 x16: 0000000000000000 [ 7165.525564] x15: ffff0000091d9708 x14: ffff0000093b7dc8 [ 7165.530852] x13: ffff0000093b7a23 x12: 6e7265746e692067 [ 7165.536140] x11: 0000000000000000 x10: 0000000000000bb0 [ 7165.541429] x9 : ffff00001d6238f0 x8 : ffffa027d877af00 [ 7165.546718] x7 : ffffa027d6329600 x6 : ffff7e809f58ca00 [ 7165.552006] x5 : 0000000000001f8a x4 : 000000000000088e [ 7165.557295] x3 : ffffa027d9b27fa8 x2 : 0000000000000000 [ 7165.562583] x1 : 0000000000000000 x0 : 000000003000188e [ 7165.567872] Call trace: [ 7165.570309] dereg_device_v3_hw+0x68/0xa8 [hisi_sas_v3_hw] [ 7165.575775] hisi_sas_abort_task+0x248/0x358 [hisi_sas_main] [ 7165.581415] sas_eh_handle_sas_errors+0x258/0x8e0 [libsas] [ 7165.586876] sas_scsi_recover_host+0x134/0x458 [libsas] [ 7165.592082] scsi_error_handler+0xb4/0x488 [ 7165.596163] kthread+0x134/0x138 [ 7165.599380] ret_from_fork+0x10/0x18 [ 7165.602940] Code: d5033e9f b9000040 aa0103e2 eb03003f (f9400021) [ 7165.609004] kernel fault(0x1) notification starting on CPU 75 [ 7165.700728] ---[ end trace fc042cbbea224efc ]--- [ 7165.705326] Kernel panic - not syncing: Fatal exception To fix the issue, grab sas_dev lock when traversing the members of sas_dev.list in dereg_device_v3_hw() and hisi_sas_release_tasks() to avoid concurrency of adding and deleting member. When function hisi_sas_release_tasks() calls hisi_sas_do_release_task() to free slot, the lock cannot be grabbed again in hisi_sas_slot_task_free(), then a bool parameter need_lock is added. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1679283265-115066-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b711ef5e |
|
06-Mar-2023 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Sync complete queue for poll queue Currently we sync irq to avoid freeing task before using task in I/O completion. After adding io_uring support, we need to do something similar for poll queues. As the process of CQ entries on poll queue are protected by spinlock cq->lock, we can use spin_lock() + spin_unlock() on cq->lock to make sure that CQ entries are processed to completion and then the complete queue is synced. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1678169355-76215-4-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0e47effa |
|
06-Mar-2023 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Add poll support for v3 hw Add a module parameter to set how many queues are used for iopoll. Also fill the interface mq_poll. For internal I/Os from libsas and libata we use non-iopoll queue (queue 0) to deliver and complete them. But for internal abort I/Os, just don't send them for poll queues. There is still a risk associated as this sends internal abort commands to non-iopoll queues which actually requires sending an internal abort command to every queue. As a result, make the module parameter as "experimental" for now. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1678169355-76215-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f58c8970 |
|
03-Jan-2023 |
Yihang Li <liyihang9@huawei.com> |
scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id Currently the driver sets the port invalid if one phy in the port is not enabled, which may cause issues in expander situation. In directly attached situation, if phy up doesn't occur in time when refreshing port id, the port is incorrectly set to invalid which will also cause disk lost. Therefore set a port invalid only if there are no devices attached to the port. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1672805000-141102-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
037b4805 |
|
03-Jan-2023 |
Xingui Yang <yangxingui@huawei.com> |
scsi: hisi_sas: Use abort task set to reset SAS disks when discovered Currently clear task set is used to abort all commands remaining in the disk when the SAS disk is discovered, and if the disk is discovered by two initiators, other I_T nexuses are also affected. So use abort task set instead and take effect only on the specified I_T nexus. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1672805000-141102-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ea44242b |
|
14-Dec-2022 |
Jason Yan <yanaijie@huawei.com> |
scsi: hisi_sas: Fix tag freeing for reserved tags The reserved tags were put in the lower region of the tagset in commit f7d190a94e35 ("scsi: hisi_sas: Put reserved tags in lower region of tagset"). However, only the allocate function was changed, freeing was not handled. This resulted in a failure to boot: [ 33.467345] hisi_sas_v3_hw 0000:b4:02.0: task exec: failed[-132]! [ 33.473413] sas: Executing internal abort failed 5000000000000603 (-132) [ 33.480088] hisi_sas_v3_hw 0000:b4:02.0: I_T nexus reset: internal abort (-132) [ 33.657336] hisi_sas_v3_hw 0000:b4:02.0: task exec: failed[-132]! [ 33.663403] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x40) [ 35.787344] hisi_sas_v3_hw 0000:b4:04.0: task exec: failed[-132]! [ 35.793411] sas: Executing internal abort failed 5000000000000703 (-132) [ 35.800084] hisi_sas_v3_hw 0000:b4:04.0: I_T nexus reset: internal abort (-132) [ 35.977335] hisi_sas_v3_hw 0000:b4:04.0: task exec: failed[-132]! [ 35.983403] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x40) [ 35.989643] ata10.00: revalidation failed (errno=-5) Fixes: f7d190a94e35 ("scsi: hisi_sas: Put reserved tags in lower region of tagset") Cc: John Garry <john.g.garry@oracle.com> Cc: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Acked-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3c2673a0 |
|
18-Nov-2022 |
Jie Zhan <zhanjie9@hisilicon.com> |
scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset SATA devices on an expander may be removed and not be found again when I_T nexus reset and revalidation are processed simultaneously. The issue comes from: - Revalidation can remove SATA devices in link reset, e.g. in hisi_sas_clear_nexus_ha(). - However, hisi_sas_debug_I_T_nexus_reset() polls the state of a SATA device on an expander after sending link_reset, where it calls: hisi_sas_debug_I_T_nexus_reset sas_ata_wait_after_reset ata_wait_after_reset ata_wait_ready smp_ata_check_ready sas_ex_phy_discover sas_ex_phy_discover_helper sas_set_ex_phy The ex_phy's change count is updated in sas_set_ex_phy(), so SATA devices after a link reset may not be found later through revalidation. A similar issue was reported in: commit 0f3fce5cc77e ("[SCSI] libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready") commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling"). To address this issue, in hisi_sas_debug_I_T_nexus_reset(), we now call smp_ata_check_ready_type() that only polls the device type while not updating the ex_phy's data of libsas. Fixes: 71453bd9d1bf ("scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-5-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
94a3555d |
|
18-Nov-2022 |
Jie Zhan <zhanjie9@hisilicon.com> |
scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset" This reverts commit f5f2a2716055ad8c0c4ff83e51d667646c6c5d8a. This is now unnecessary to solve the SATA devices missing issue in hisi_sas_clear_nexus_ha(). Hence, we should not ignore bcast events during sas_eh_handle_sas_errors() in case of missing bcast events, unless a justified need is found and a mechanism to defer (but not ignore) bcast events in sas_eh_handle_sas_errors() is provided. Also, in hisi_sas_clear_nexus_ha(), there is nothing further to handle in "out: " other than return, so that part can be reverted. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-3-zhanjie9@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7e613be7 |
|
18-Nov-2022 |
Jie Zhan <zhanjie9@hisilicon.com> |
scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()" This reverts commit 11ff0c98fca35df16c84d4eee52008faecaf10a6. Draining or flushing events in hisi_sas_rescan_topology() can hang the driver, typically with phy up or phy down events being processed, i.e. sas_porte_bytes_dmaed() or sas_phye_loss_of_signal(). Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-2-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f7d190a9 |
|
18-Oct-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Put reserved tags in lower region of tagset To be consistent with blk-mq, put the reserved tags in the lower region of the tagset. Eventually we hope to get rid of all this reserved tag management. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-4-git-send-email-john.garry@huawei.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
295fd233 |
|
18-Oct-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Use sas_task_find_rq() Use sas_task_find_rq() to lookup the request per task for its driver tag. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f0ed7bd5 |
|
28-Sep-2022 |
Jason Yan <yanaijie@huawei.com> |
scsi: hisi_sas: Use sas_find_attathed_phy_id() instead of open coding it The attached phy finding is open coded. Replace it with sas_find_attached_phy_id(). To keep things consistent, the return value of hisi_sas_dev_found() is also changed to -ENODEV after calling sas_find_attathed_phy_id() failed. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-6-yanaijie@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
930d97da |
|
17-Oct-2022 |
Xingui Yang <yangxingui@huawei.com> |
scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw When CQ header dw3 SATA_DISK_ERR is set it means this SATA disk is in error state and the current IPTT is invalid. An invalid IPTT does not correspond to any slot. In this scenario, new I/Os that delivered to disk will be rejected by the controller and all I/Os remaining in the disk should be aborted, which we add here with the sas_ata_device_link_abort() call. In hisi_sas_abort_task() we don't want to issue a soft reset as it may cause info to be lost in the target disk for the ATA EH autopsy. In this case, just release resources - the disk won't return other I/Os normally after NCQ Error, so this is safe. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-4-git-send-email-john.garry@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4b329abc |
|
17-Oct-2022 |
Xingui Yang <yangxingui@huawei.com> |
scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task() Each branch currently defines a slot variable independently, and it is neater to move it to the function head. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-3-git-send-email-john.garry@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f5f2a271 |
|
05-Sep-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset Remote devices may go missing from the per-device nexus reset part of the HA nexus, i.e after the controller reset. This is because libsas may find the devices to be gone as the phy may be temporarily down when processing the bcast event generated from the nexus reset. Filter out bcast events during this time to stop the devices being lost. Link: https://lore.kernel.org/r/1662378529-101489-6-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e9b6bada |
|
05-Sep-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Add helper to process bcast events Add a helper for bcast processing to reduce duplication. Link: https://lore.kernel.org/r/1662378529-101489-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
11ff0c98 |
|
05-Sep-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology() In resetting the controller, SATA devices may be lost. The issue is that when we insert the bcast events to rescan the topology in hisi_sas_rescan_topology(), when we subsequently nexus reset the SATA devices in hisi_sas_async_I_T_nexus_reset(), there is a small timing window in which the remote phy is down and we process the bcast event (meaning that libsas judges that the disk is lost). Ensure that all bcast events are processed prior to the nexus reset to close this window. Link: https://lore.kernel.org/r/1662378529-101489-4-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bc555115 |
|
05-Sep-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Clear HISI_SAS_HW_FAULT_BIT earlier Once the controller HW has been reset then we can unset flag HISI_SAS_HW_FAULT_BIT. In clearing this flag earlier we can now successfully execute commands in hisi_sas_controller_reset_done(), like bcast processing. Link: https://lore.kernel.org/r/1662378529-101489-3-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f0902095 |
|
14-Jul-2022 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Relocate DMA unmap of SMP task Currently SMP tasks are DMA unmapped only when cq of SMP I/O is returned normally. If the cq of SMP I/O is returned with exception actually SMP TAS is never unmapped. Relocate DMA unmap of SMP task to fix the issue. Link: https://lore.kernel.org/r/1657823002-139010-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bc22f9c0 |
|
14-Jul-2022 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Remove unnecessary variable to hold DMA map elements Use slot->n_elem to store the return value of dma_map_sg() for SSP and SMP IOs, and remove unnecessary variable n_elem_req. Link: https://lore.kernel.org/r/1657823002-139010-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6c6ac8b7 |
|
17-May-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Fix memory ordering in hisi_sas_task_deliver() The memories for the slot should be observed to be written prior to observing the slot as ready. Prior to commit 26fc0ea74fcb ("scsi: libsas: Drop SAS_TASK_AT_INITIATOR"), we had a spin_lock() + spin_unlock() immediately before marking the slot as ready. The spin_unlock() - with release semantics - caused the slot memory to be observed to be written. Now that the spin_lock() + spin_unlock() is gone, use a smp_wmb(). Link: https://lore.kernel.org/r/1652774661-12935-1-git-send-email-john.garry@huawei.com Fixes: 26fc0ea74fcb ("scsi: libsas: Drop SAS_TASK_AT_INITIATOR") Reported-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Yihang Li <liyihang6@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e9dedc13 |
|
12-May-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Fix rescan after deleting a disk Removing an ATA device via sysfs means that the device may not be found through re-scanning: root@ubuntu:/home/john# lsscsi [0:0:0:0] disk SanDisk LT0200MO P404 /dev/sda [0:0:1:0] disk ATA HGST HUS724040AL A8B0 /dev/sdb [0:0:8:0] enclosu 12G SAS Expander RevB - root@ubuntu:/home/john# echo 1 > /sys/block/sdb/device/delete root@ubuntu:/home/john# echo "- - -" > /sys/class/scsi_host/host0/scan root@ubuntu:/home/john# lsscsi [0:0:0:0] disk SanDisk LT0200MO P404 /dev/sda [0:0:8:0] enclosu 12G SAS Expander RevB - root@ubuntu:/home/john# The problem is that the rescan of the device may conflict with the device in being re-initialized, as follows: - In the rescan we call hisi_sas_slave_alloc() in store_scan() -> sas_user_scan() -> [__]scsi_scan_target() -> scsi_probe_and_add_lunc() -> scsi_alloc_sdev() -> hisi_sas_slave_alloc() -> hisi_sas_init_device() In hisi_sas_init_device() we issue an IT nexus reset for ATA devices - That IT nexus causes the remote PHY to go down and this triggers a bcast event - In parallel libsas processes the bcast event, finds that the phy is down and marks the device as gone The hard reset issued in hisi_sas_init_device() is unncessary - as described in the code comment - so remove it. Also set dev status as HISI_SAS_DEV_NORMAL as the hisi_sas_init_device() call. Link: https://lore.kernel.org/r/1652354134-171343-4-git-send-email-john.garry@huawei.com Fixes: 36c6b7613ef1 ("scsi: hisi_sas: Initialise devices in .slave_alloc callback") Tested-by: Yihang Li <liyihang6@hisilicon.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
71453bd9 |
|
12-May-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset We have seen errors like this when a SATA device is probed: [524.566298] hisi_sas_v3_hw 0000L74:02.0: erroneous completion iptt=4096 ... [524.582827] sas: TMF task open reject failed 500e004aaaaaaaa00 Since commit 21c7e972475e ("scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure"), we issue an ATA softreset to disks after a phy reset to ensure that they are in sound working order. If the softreset is issued before the remote phy has come back up then the softreset will fail (errors as above). Remedy this by waiting for the phy to come back up after the reset. Link: https://lore.kernel.org/r/1652354134-171343-3-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
066f4c31 |
|
17-Mar-2022 |
Dan Carpenter <dan.carpenter@oracle.com> |
scsi: hisi_sas: Remove stray fallthrough annotation This case statement doesn't fall through any more so remove the fallthrough annotation. Link: https://lore.kernel.org/r/20220317075214.GC25237@kili Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
095478a6 |
|
11-Mar-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Use libsas internal abort support Use the common libsas internal abort functionality. In addition, this driver has special handling for internal abort timeouts - specifically whether to reset the controller in that instance, so extend the API for that. Timeout is now increased to 20 * Hz from 6 * Hz. We also retry for failure now, but this should not make a difference. Link: https://lore.kernel.org/r/1647001432-239276-5-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
512623de |
|
24-Feb-2022 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Change hisi_sas_control_phy() phyup timeout The time of phyup not only depends on the controller but also the type of disk connected. As an example, from experience, for some SATA disks the amount of time from reset/power-on to receive the D2H FIS for phyup can take upto and more than 10s sometimes. According to the specification of some SATA disks such as ST14000NM0018, the max time from power-on to ready is 30s. Based on this the current timeout of phyup at 2s which is not enough. So set the value as HISI_SAS_WAIT_PHYUP_TIMEOUT (30s) in hisi_sas_control_phy(). For v3 hw there is a pre-existing workaround for a HW bug, being that we issue a link reset when the OOB occurs but the phyup does not. The current phyup timeout is HISI_SAS_WAIT_PHYUP_TIMEOUT. So if this does occur from when issuing a phy enable or similar via hisi_sas_control_phy(), the subsequent HW workaround linkreset processing calls hisi_sas_control_phy(), but this will pend the original phy reset timing out, so it is safe. Link: https://lore.kernel.org/r/1645703489-87194-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3f2e252e |
|
22-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_execute_ata_cmd() Add a function to execute an ATA command using the TMF code, and use in the hisi_sas driver. That driver needs to be able to issue the command on a specific phy, so add an interface for that. With that, hisi_sas_exec_internal_tmf_task() may be deleted. Link: https://lore.kernel.org/r/1645534259-27068-19-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4fea759e |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_abort_task() Add a generic implementation of abort task TMF handler, and use in LLDDs. With that, some LLDDs custom TMF functions can now be deleted. Link: https://lore.kernel.org/r/1645112566-115804-18-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
72f8810e |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_query_task() Add a generic implementation of query task TMF handler, and use in LLDDs. Link: https://lore.kernel.org/r/1645112566-115804-17-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
29d77690 |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_lu_reset() Add a generic implementation of LU reset TMF handler, and use in LLDDs. Link: https://lore.kernel.org/r/1645112566-115804-16-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e8585452 |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_clear_task_set() Add a generic implementation of clear task set TMF handler, and use in LLDDs. Link: https://lore.kernel.org/r/1645112566-115804-15-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
69b80a0e |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_abort_task_set() Add a generic implementation of abort task set TMF handler, and use in LLDDs. Link: https://lore.kernel.org/r/1645112566-115804-14-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
693e66a0 |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add TMF handler aborted callback The hisi_sas and pm8001 TMF handlers have some special processing for when the TMF is aborted, so add a callback and fill it in for those drivers. Link: https://lore.kernel.org/r/1645112566-115804-13-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
96e54376 |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add sas_task.tmf Add a pointer to a sas_tmf_task to the sas_task struct, as this will be used when the common LLDD TMF code is factored out. Also set it for the LLDDs to store per-sas_task TMF info. Link: https://lore.kernel.org/r/1645112566-115804-9-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bbfe82cd |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Add struct sas_tmf_task Some of the LLDDs which use libsas have their own definition of a struct to hold TMF info, so add a common struct for libsas. Also add an interim force phy id field for hisi_sas driver, which will be removed once the STP "TMF" code is factored out. Even though some LLDDs (pm8001) use a u32 for the tag, u16 will be adequate, as that named driver only uses tags in range [0, 1024). Link: https://lore.kernel.org/r/1645112566-115804-8-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
da19eaba |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Delete unused I_T_NEXUS_RESET_PHYUP_TIMEOUT There is no user, so delete it. Link: https://lore.kernel.org/r/1645112566-115804-6-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
25882c82 |
|
17-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Delete lldd_clear_aca callback This callback is never called, so remove support. Link: https://lore.kernel.org/r/1645112566-115804-4-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
26fc0ea7 |
|
10-Feb-2022 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Drop SAS_TASK_AT_INITIATOR This flag is now only ever set, so delete it. This also avoids a use-after-free in the pm8001 queue path, as reported in the following: https://lore.kernel.org/linux-scsi/c3cb7228-254e-9584-182b-007ac5e6fe0a@huawei.com/T/#m28c94c6d3ff582ec4a9fa54819180740e8bd4cfb https://lore.kernel.org/linux-scsi/0cc0c435-b4f2-9c76-258d-865ba50a29dd@huawei.com/ [mkp: checkpatch + two SAS_TASK_AT_INITIATOR references] Link: https://lore.kernel.org/r/1644489804-85730-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c763ec4c |
|
31-Jan-2022 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Fix setting of hisi_sas_slot.is_internal The hisi_sas_slot.is_internal member is not set properly for ATA commands which the driver sends directly. A TMF struct pointer is normally used as a test to set this, but it is NULL for those commands. It's not ideal, but pass an empty TMF struct to set that member properly. Link: https://lore.kernel.org/r/1643627607-138785-1-git-send-email-john.garry@huawei.com Fixes: dc313f6b125b ("scsi: hisi_sas: Factor out task prep and delivery code") Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8001fa24 |
|
15-Jan-2022 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
scsi: hisi_sas: Remove useless DMA-32 fallback configuration As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/1bf2d3660178b0e6f172e5208bc0bd68d31d9268.1642237482.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5d9224fb |
|
04-Jan-2022 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Remove unused variable and check in hisi_sas_send_ata_reset_each_phy() In commit 29e2bac87421 ("scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list"), we use asd_sas_port->phy_mask instead of accessing asd_sas_port->phy_list, and it is enough to use asd_sas_port->phy_mask to check the state of phy, so remove the unused check and variable. Link: https://lore.kernel.org/r/1641300126-53574-1-git-send-email-chenxiang66@hisilicon.com Fixes: 29e2bac87421 ("scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list") Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Colin King <colin.i.king@gmail.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ae9b69e8 |
|
20-Dec-2021 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Keep controller active between ISR of phyup and the event being processed It is possible that controller may become suspended between processing a phyup interrupt and the event being processed by libsas. As such, we can't ensure the controller is active when processing the phyup event - this may cause the phyup event to be lost or other issues. To avoid any possible issues, add pm_runtime_get_noresume() in phyup interrupt handler and pm_runtime_put_sync() in the work handler exit to ensure that we stay always active. Since we only want to call pm_runtime_get_noresume() for v3 hw, signal this will a new event, HISI_PHYE_PHY_UP_PM. Link: https://lore.kernel.org/r/1639999298-244569-14-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
29e2bac8 |
|
20-Dec-2021 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list Most places that use asd_sas_port->phy_list are protected by spinlock asd_sas_port->phy_list_lock, however there are still some places which miss grabbing the lock. Add it in function hisi_sas_refresh_port_id() when accessing asd_sas_port->phy_list. This carries a risk that list mutates while at the same time dropping the lock in function hisi_sas_send_ata_reset_each_phy(). Read asd_sas_port->phy_mask instead of accessing asd_sas_port->phy_list to avoid this risk. Link: https://lore.kernel.org/r/1639999298-244569-6-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6cc73908 |
|
20-Dec-2021 |
John Garry <john.garry@huawei.com> |
scsi: Revert "scsi: hisi_sas: Filter out new PHY up events during suspend" This reverts commit b14a37e011d829404c29a5ae17849d7efb034893. In that commit, we had to filter out phy-up events during suspend, as it work cause a deadlock between processing the phyup event and the resume HA function try to drain the HA event workqueue to complete the resume process. Now that we no longer try to drain the HA event queue during the HA resume processor, the deadlock would not occur, so remove the special handling for it. Link: https://lore.kernel.org/r/1639999298-244569-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
37310bad |
|
15-Dec-2021 |
Qi Liu <liuqi115@huawei.com> |
scsi: hisi_sas: Fix phyup timeout on FPGA The OOB interrupt and phyup interrupt handlers may run out-of-order in high CPU usage scenarios. Since the hisi_sas_phy.timer is added in hisi_sas_phy_oob_ready() and disarmed in phy_up_v3_hw(), this out-of-order execution will cause hisi_sas_phy.timer timeout to trigger. To solve, protect hisi_sas_phy.timer and .attached with a lock, and ensure that the timer won't be added after phyup handler completes. Link: https://lore.kernel.org/r/1639579061-179473-8-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
16775db6 |
|
15-Dec-2021 |
Qi Liu <liuqi115@huawei.com> |
scsi: hisi_sas: Prevent parallel FLR and controller reset If we issue a controller reset command during executing a FLR a hung task may be found: Call trace: __switch_to+0x158/0x1cc __schedule+0x2e8/0x85c schedule+0x7c/0x110 schedule_timeout+0x190/0x1cc __down+0x7c/0xd4 down+0x5c/0x7c hisi_sas_task_exec+0x510/0x680 [hisi_sas_main] hisi_sas_queue_command+0x24/0x30 [hisi_sas_main] smp_execute_task_sg+0xf4/0x23c [libsas] sas_smp_phy_control+0x110/0x1e0 [libsas] transport_sas_phy_reset+0xc8/0x190 [libsas] phy_reset_work+0x2c/0x40 [libsas] process_one_work+0x1dc/0x48c worker_thread+0x15c/0x464 kthread+0x160/0x170 ret_from_fork+0x10/0x18 This is a race condition which occurs when the FLR completes first. Here the host HISI_SAS_RESETTING_BIT flag out gets of sync as HISI_SAS_RESETTING_BIT is not always cleared with the hisi_hba.sem held, so now only set/unset HISI_SAS_RESETTING_BIT under hisi_hba.sem . Link: https://lore.kernel.org/r/1639579061-179473-7-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
20c63493 |
|
15-Dec-2021 |
Qi Liu <liuqi115@huawei.com> |
scsi: hisi_sas: Prevent parallel controller reset and control phy command A user may issue a control phy command from sysfs at any time, even if the controller is resetting. If a phy is disabled by hardreset/linkreset command before calling get_phys_state() in the reset path, the saved phy state may be incorrect. To avoid incorrectly recording the phy state, use hisi_hba.sem to ensure that the controller reset may not run at the same time as when the phy control function is running. Link: https://lore.kernel.org/r/1639579061-179473-6-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
dc313f6b |
|
15-Dec-2021 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Factor out task prep and delivery code The task prep code is the same between the normal path (in hisi_sas_task_prep()) and the internal abort path, so factor is out into a common function. Link: https://lore.kernel.org/r/1639579061-179473-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
08c61b5d |
|
15-Dec-2021 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Pass abort structure for internal abort To help factor out code in future, it's useful to know if we're executing an internal abort, so pass a pointer to the structure. The idea is that a NULL pointer means not an internal abort. Link: https://lore.kernel.org/r/1639579061-179473-4-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
934385a4 |
|
15-Dec-2021 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Make internal abort have no task proto For an internal abort, the task does not have a protocol, so set to none. This will make it easier to differentiate internal abort tasks in future. Link: https://lore.kernel.org/r/1639579061-179473-3-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0e462085 |
|
15-Dec-2021 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly Currently we start delivery of commands to the DQ after returning from hisi_sas_task_exec() with success. Let's just start delivery directly in that function without having to check if some local variable is set. Link: https://lore.kernel.org/r/1639579061-179473-2-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4d6942e2 |
|
26-Nov-2021 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
scsi: hisi_sas: Use non-atomic bitmap functions when possible All uses of the 'hisi_hba->slot_index_tags' bitmap are protected with the 'hisi_hba->lock' spinlock. Prefer the non-atomic '__[set|clear]_bit()' functions to save a few cycles. Link: https://lore.kernel.org/r/8ee33e463523db080e6a2c06f332e47abb69359b.1637961191.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d43efddf |
|
26-Nov-2021 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
scsi: hisi_sas: Remove some useless code in hisi_sas_alloc() The 'hisi_hba->slot_index_tags' bitmap is allocated with bitmap_zalloc() so it is already cleared. There is no need to clear it another time, one bit at a time. Remove the corresponding useless code. Link: https://lore.kernel.org/r/41c86e7e3e05a13bd586d8ee1b81296140b7a6eb.1637961191.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
54585ec6 |
|
26-Nov-2021 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
scsi: hisi_sas: Use devm_bitmap_zalloc() when applicable 'hisi_hba->slot_index_tags' is a bitmap. Use 'devm_bitmap_zalloc()' to simplify code, improve the semantic, and avoid some open-coded arithmetic in allocator arguments. Link: https://lore.kernel.org/r/4afa3f71e66c941c660627c7f5b0223b51968ebb.1637961191.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
21c7e972 |
|
12-Oct-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure If the softreset fails in the I_T reset, libsas will then continue to issue a controller reset to try to recover. However a faulty disk may cause the softreset to fail, and resetting the controller will not help this scenario. Indeed, we will just continue the cycle of error handle handling to try to recover. So if the softreset fails upon certain conditions, just disable the phy associated with the disk. The user needs to handle this problem. Link: https://lore.kernel.org/r/1634041588-74824-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
046ab7d0 |
|
12-Oct-2021 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Wait for phyup in hisi_sas_control_phy() When issuing a hardreset/linkreset/phy_set_linkrate from sysfs, the phy will be disabled and re-enabled for the directly attached scenario. It takes some time for the phy to come back up after re-enabling the phy. If the controller becomes suspended while waiting for the phy to come back, the phy up may be lost (along with the disk). To solve this problem, wait for the phy up to occur with a timeout. Indeed this is already done in hisi_sas_debug_I_T_nexus_reset() for local phys, so just relocate the functionality to hisi_sas_control_phy(). Since the HA workqueue is drained when suspending the controller, and the phy control function is called from the same workqueue, we can guarantee that the controller will not be suspended during this period. Link: https://lore.kernel.org/r/1634041588-74824-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
36c6b761 |
|
12-Oct-2021 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Initialise devices in .slave_alloc callback Perform driver-specific SCSI device initialization in the designated SCSI midlayer callback instead of relying on the libsas "device found" callback. The SCSI midlayer .slave_alloc interface is called prior to sending any I/O to the device. Link: https://lore.kernel.org/r/1634041588-74824-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
080b4f97 |
|
24-Aug-2021 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Replace del_timer() calls with del_timer_sync() Some usage of del_timer() in the driver is potentially unsafe. When running the sas_task->slow_task timer in hisi_sas_exec_internal_tmf_task(), execution may be blocked in function hisi_sas_task_exec(); so it is possible that the timer is running when the callback to disable the timer is running. This could be dangerous, as we immediately release resources which the timer callback uses after disabling the timer. The same situation may be found at other sites, such as _hisi_sas_internal_task_abort(). Change calls to del_timer() to del_timer_sync() as necessary, to ensure any timer has finished when disabling. Also remove calls to timer_pending() prior to del_timer() as it is not necessary. Link: https://lore.kernel.org/r/1629799260-120116-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b5a9fa20 |
|
24-Aug-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Rename HISI_SAS_{RESET -> RESETTING}_BIT HISI_SAS_RESET_BIT means that the controller is being reset, and so the name is a bit vague. Rename it to HISI_SAS_RESETTING_BIT. Link: https://lore.kernel.org/r/1629799260-120116-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1effbfac |
|
09-Aug-2021 |
Bart Van Assche <bvanassche@acm.org> |
scsi: hisi_sas: Use scsi_cmd_to_rq() instead of scsi_cmnd.request Prepare for removal of the request pointer by using scsi_cmd_to_rq() instead. This patch does not change any functionality. Link: https://lore.kernel.org/r/20210809230355.8186-22-bvanassche@acm.org Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e8a4d0da |
|
07-Jun-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Speed up error handling when internal abort timeout occurs If an internal task abort timeout occurs, the controller has developed a fault, and needs to be reset to be recovered. When this occurs during error handling, the current policy is to allow error handling to continue, and the inevitable nexus ha reset will handle the required reset. However various steps of error handling need to taken before this happens. These also involve some level of HW interaction, which will also fail with various timeouts. Speed up this process by recording a HW fault bit for an internal abort timeout - when this is set, just automatically error any HW interaction, and essentially go straight to clear nexus ha (to reset the controller). Link: https://lore.kernel.org/r/1623058179-80434-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
63ece9eb |
|
07-Jun-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Reset controller for internal abort timeout If an internal task abort timeout occurs, the controller has developed a fault, and needs to be reset to be recovered. However if a timeout occurs during SCSI error handling, issuing a controller reset immediately may conflict with the error handling. To handle internal abort in these two scenarios, only queue the reset when not in an error handling function. In the case of a timeout during error handling, do nothing and rely on the inevitable ha nexus reset to reset the controller. Link: https://lore.kernel.org/r/1623058179-80434-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2f12a499 |
|
07-Jun-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Include HZ in timer macros Include HZ in timer macros to make the code more concise. Link: https://lore.kernel.org/r/1623058179-80434-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0f757339 |
|
07-Jun-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Run I_T nexus resets in parallel for clear nexus reset For a clear nexus reset operation, the I_T nexus resets are executed serially for each device. For devices attached through an expander, this may take 2s per device; so, in total, could take a long time. Reduce the total time by running the I_T nexus resets in parallel through async operations. Link: https://lore.kernel.org/r/1623058179-80434-3-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
366da0da |
|
07-Jun-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Put a limit of link reset retries If an OOB event is received but the phy still fails to come up, a link reset will be issued repeatedly at an interval of 20s until the phy comes up. Set a limit for link reset issue retries to avoid printing the timeout message endlessly. Link: https://lore.kernel.org/r/1623058179-80434-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f4df167a |
|
06-Apr-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Print SATA device SAS address for soft reset failure Add (pseudo) SAS address for ATA software reset failure log to assist in debugging. Link: https://lore.kernel.org/r/1617709711-195853-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2c74cb1f |
|
06-Apr-2021 |
Jianqin Xie <xiejianqin@hisilicon.com> |
scsi: hisi_sas: Directly snapshot registers when executing a reset The debugfs snapshot should be executed before the reset occurs to ensure that the register contents are saved properly. As such, it is incorrect to queue the debugfs dump when running a reset as the reset will occur prior to the snapshot work item is handler. Therefore, directly snapshot registers in the reset work handler. Link: https://lore.kernel.org/r/1617709711-195853-5-git-send-email-john.garry@huawei.com Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com> Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f4676665 |
|
06-Apr-2021 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Call sas_unregister_ha() to roll back if .hw_init() fails Function sas_unregister_ha() needs to be called to roll back if hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or hisi_sas_v3_probe(). Make that change. Link: https://lore.kernel.org/r/1617709711-195853-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1dbe61bf |
|
26-Jan-2021 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Enable debugfs support by default Add a config option to enable debugfs support by default. And if debugfs support is enabled by default, dump count default value is increased to 50 as generally users want something bigger than the current default in that situation. Link: https://lore.kernel.org/r/1611659068-131975-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
69bfa5fd |
|
26-Jan-2021 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Don't check .nr_hw_queues in hisi_sas_task_prep() Now that v2 and v3 hw expose their HW queues (and so shost.nr_hw_queues is set), remove the conditional checks in hisi_sas_task_prep(). This change would affect v1 HW performance (as it does not expose HW queues), but nobody uses it and support may be dropped soon. Link: https://lore.kernel.org/r/1611659068-131975-3-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
872a90b5 |
|
18-Jan-2021 |
Ahmed S. Darwish <a.darwish@linutronix.de> |
scsi: hisi_sas: Switch back to original libsas event notifiers libsas event notifiers required an extension where gfp_t flags must be explicitly passed. For bisectability, a temporary _gfp() variant of such functions were added. All call sites then got converted use the _gfp() variants and explicitly pass GFP context. Having no callers left, the original libsas notifiers were then modified to accept gfp_t flags by default. Switch back to the original libas API, while still passing GFP context. The libsas _gfp() variants will be removed afterwards. Link: https://lore.kernel.org/r/20210118100955.1761652-14-a.darwish@linutronix.de Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
26c7efc3 |
|
18-Jan-2021 |
Ahmed S. Darwish <a.darwish@linutronix.de> |
scsi: hisi_sas: Pass gfp_t flags to libsas event notifiers Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. Below are the context analysis for modified functions: => hisi_sas_bytes_dmaed(): Since it is invoked from both process and atomic contexts, let its callers pass the gfp_t flags: * hisi_sas_main.c: ------------------ hisi_sas_phyup_work(): workqueue context -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_controller_reset_done(): has an msleep() -> hisi_sas_rescan_topology() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_debug_I_T_nexus_reset(): calls wait_for_completion_timeout() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) * hisi_sas_v1_hw.c: ------------------- int_abnormal_v1_hw(): irq handler -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) * hisi_sas_v[23]_hw.c: ---------------------- int_phy_updown_v[23]_hw(): irq handler -> phy_down_v[23]_hw() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) => int_bcast_v1_hw() and phy_bcast_v3_hw(): Both are invoked exclusively from irq handlers. Pass GFP_ATOMIC. Link: https://lore.kernel.org/r/20210118100955.1761652-12-a.darwish@linutronix.de Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
121181f3 |
|
18-Jan-2021 |
John Garry <john.garry@huawei.com> |
scsi: libsas: Remove notifier indirection LLDDs report events to libsas with .notify_port_event and .notify_phy_event callbacks. These callbacks are fixed and so there is no reason why the functions cannot be called directly, so do that. This neatens the code slightly, makes it more obvious, and reduces function pointer usage, which is generally a good thing. Downside is that there are 2x more symbol exports. [a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables] Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
74a29219 |
|
02-Dec-2020 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Expose HW queues for v2 hw As a performance enhancement, make the completion queue interrupts managed. In addition, in commit bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline"), CPU hotplug for MQ devices using managed interrupts is made safe. So expose HW queues to blk-mq to take advantage of this. Flag Scsi_host.host_tagset is also set to ensure that the HBA is not sent more commands than it can handle. However the driver still does not use request tag for IPTT as there are many HW bugs means that special rules apply for IPTT allocation. Link: https://lore.kernel.org/r/1606905417-183214-6-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
359db633 |
|
07-Dec-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Select a suitable queue for internal I/Os For when managed interrupts are used (and shost->nr_hw_queues is set), a fixed queue - set per-device - is still used for internal I/Os. If all the CPUs mapped to that queue are offlined, then the completions for that queue are not serviced and any internal I/Os will time out. Fix by selecting a queue for internal I/Os from the queue mapped from the current CPU in this scenario. This is still not ideal as it does not deal with CPU hotplug for inflight internal I/Os, and needs proper support from [0]. [0] https://lore.kernel.org/linux-scsi/20200703130122.111448-1-hare@suse.de/T/#m7d77d049b18f33a24ef206af69ebb66d07440556 Link: https://lore.kernel.org/r/1607347855-59091-1-git-send-email-john.garry@huawei.com Fixes: 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
18577cdc |
|
26-Nov-2020 |
Ahmed S. Darwish <a.darwish@linutronix.de> |
scsi: hisi_sas: Remove preemptible() hisi_sas_task_exec() uses preemptible() to see if it's safe to block. This does not work for CONFIG_PREEMPT_COUNT=n kernels in which preemptible() always returns 0. The problem is masked when enabling some of the common Kconfig.debug options (like CONFIG_DEBUG_ATOMIC_SLEEP), as they implicitly enable the preemption counter. In general, driver leaf functions should not make logic decisions based on the context they're called from. The caller should be the entity responsible for explicitly indicating context. Since hisi_sas_task_exec() already has a gfp_t flags parameter, use it as the explicit context marker. Link: https://lore.kernel.org/r/20201126132952.2287996-3-bigeasy@linutronix.de Fixes: 214e702d4b70 ("scsi: hisi_sas: Adjust task reject period during host reset") Fixes: 550c0d89d52d ("scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()") Cc: Xiaofei Tan <tanxiaofei@huawei.com> Cc: Xiang Chen <chenxiang66@hisilicon.com> Cc: John Garry <john.garry@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
623a4b6d |
|
24-Nov-2020 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Move debugfs code to v3 hw driver Relocate all the debugfs code for DFX to v3 hw since no other versions support it. Link: https://lore.kernel.org/r/1606207594-196362-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
fab09aae |
|
15-Oct-2020 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Stop using queue #0 always for v2 hw In commit 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ"), the dispatch function was changed to choose the delivery queue based on the request tag HW queue index. This heavily degrades performance for v2 hw, since the HW queues are not exposed there, and, as such, HW queue #0 is used for every command. Revert to previous behaviour for when nr_hw_queues is not set, that being to choose the HW queue based on target device index. Link: https://lore.kernel.org/r/1602750425-240341-1-git-send-email-john.garry@huawei.com Fixes: 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
69f4ec1e |
|
02-Oct-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Recover PHY state according to the status before reset Currently the PHY state is set according to the state of the PHYs after reset. This is invalid as the PHYs are already re-initialized. Set PHY state according to the state before the reset instead of after. Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b14a37e0 |
|
02-Oct-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Filter out new PHY up events during suspend Currently sas_resume_ha() is called while resuming the controller to wait for all suspended PHYs to come up and all the libsas events to be completed. There is a scenario which will cause task hung: For direct attach with two disks connected with two PHYs, disable phy0 before suspending the disk on phy1 and the controller, then enable phy0 and resume the controller, and task hung occurs as follows: [ 591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0] [ 593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined [ 593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume [ 593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata) [ 593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200) [ 593.148350] sas: DOING DISCOVERY on port 0, pid:948 [ 593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found [ 593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0 [ 593.165663] sas: ata7: end_device-2:0: dev error handler [ 593.165730] sas: ata2: end_device-2:1: dev error handler [ 593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2 [ 593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down [ 593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133 [ 593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32) [ 593.514295] ata7.00: configured for UDMA/133 [ 593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 [ 593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005 serial:S45NNA0M712225 [ 593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0 [ 593.543674] device_link_add 324 [ 593.546801] device_link_add 352 [ 593.549930] device_link_add 406 [ 593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0 [ 593.559208] device_link_add 444 [ 593.562335] device_link_add 455 [ 593.565517] scsi 2:0:2:0: Direct-Access ATA SAMSUNG MZ7LH960 404Q PQ: 0 ANSI: 5 [ 620.057464] phy-2:1: resume timeout [ 738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds. [ 738.848295] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.862155] kworker/u256:0 D 0 8 2 0x00000028 [ 738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker [ 738.873693] Call trace: [ 738.876133] __switch_to+0xf4/0x148 [ 738.879613] __schedule+0x270/0x5d8 [ 738.883091] schedule+0x78/0x110 [ 738.886307] schedule_timeout+0x1ac/0x280 [ 738.890299] wait_for_completion+0x94/0x138 [ 738.894472] flush_workqueue+0x114/0x438 [ 738.898377] sas_porte_bytes_dmaed+0x400/0x500 [ 738.902801] sas_port_event_worker+0x28/0x40 [ 738.907053] process_one_work+0x1e8/0x360 [ 738.911046] worker_thread+0x44/0x478 [ 738.914698] kthread+0x150/0x158 [ 738.917915] ret_from_fork+0x10/0x1c [ 738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds. [ 738.928550] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.942408] kworker/u256:1 D 0 948 2 0x00000028 [ 738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain [ 738.953766] Call trace: [ 738.956203] __switch_to+0xf4/0x148 [ 738.959678] __schedule+0x270/0x5d8 [ 738.963152] schedule+0x78/0x110 [ 738.966368] rpm_resume+0xcc/0x550 [ 738.969757] __pm_runtime_resume+0x3c/0x88 [ 738.973836] rpm_get_suppliers+0x50/0x148 [ 738.977829] __pm_runtime_set_status+0x124/0x2f0 [ 738.982427] scsi_sysfs_add_sdev+0x1a0/0x2a8 [ 738.986679] scsi_probe_and_add_lun+0x888/0xab0 [ 738.991190] __scsi_scan_target+0xec/0x520 [ 738.995268] scsi_scan_target+0x11c/0x128 [ 738.999261] sas_rphy_add+0x15c/0x1e8 [ 739.002907] sas_probe_devices+0xe4/0x150 [ 739.006899] sas_discover_domain+0x33c/0x588 [ 739.011150] process_one_work+0x1e8/0x360 [ 739.015143] worker_thread+0x44/0x478 [ 739.018789] kthread+0x150/0x158 [ 739.022003] ret_from_fork+0x10/0x1c ... If an extra phy0 up happens during resume of the SAS controller, it will emit a new libsas event (event PORTE_BYTES_DMAED and event DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to resume supplier (host controller). For runtime PM core, if device is in the resuming state, the later resume request of the device will wait for previous resume request to complete synchronously. At that point in time the state of the controller is still resuming as it waits for all libsas events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked as the state of the controller is resuming which causes a deadlock. To avoid the issue, filter out new PHY up events while the controller is suspended. Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8d98416a |
|
19-Aug-2020 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Switch v3 hw to MQ Now that the block layer provides a shared tag, we can switch the driver to expose all HW queues. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
26f84f9b |
|
01-Sep-2020 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Code style cleanup Remove extra blank lines and add spaces around operators. Link: https://lore.kernel.org/r/1598958790-232272-9-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b601577d |
|
01-Sep-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Add missing newlines Newline is missing from some printk() statements. Add them. Link: https://lore.kernel.org/r/1598958790-232272-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
981cc23e |
|
01-Sep-2020 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add BIST support for fixed code pattern Through the new debugfs interface the user can select fixed code patterns. Add two new interfaces fixed_code and fixed_code1. Link: https://lore.kernel.org/r/1598958790-232272-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2c4d5823 |
|
01-Sep-2020 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add BIST support for phy FFE Add BIST support for phy FFE (Feed forward equalizer) setting. The user can configure FFE through the new debugfs interface. FFE is a parameter used for link layer control. It will affect the link quality between the SAS controller and the backplane. In the BIST test, the FFE interface is provided to assist board testers in optimizing link parameters. The modification of the FFE parameter will affect the test after BIST or the normal running of the board. The user should save the initial FFE values and restore them after BIST test is complete. Link: https://lore.kernel.org/r/1598958790-232272-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
847e8355 |
|
01-Sep-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Avoid accessing to SSP task for SMP I/Os hisi_sas_slot_task_free() attempts to dereference SSP task for non-ATA tasks. If the task is SMP, the code may reference the wrong structure although this may not cause any problems. To avoid this, only access to SSP task when slot->n_elem_dif is not 0 which indicates this is an SSP task. Link: https://lore.kernel.org/r/1598958790-232272-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
df561f66 |
|
23-Aug-2020 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
#
e16b9ed6 |
|
15-May-2020 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Do not reset phy timer to wait for stray phy up We found out that after phy up, the hardware reports another oob interrupt but did not follow a phy up interrupt: oob ready -> phy up -> DEV found -> oob read -> wait phy up -> timeout We run link reset when wait phy up timeout, and it send a normal disk into reset processing. So we made some circumvention action in the code, so that this abnormal oob interrupt will not start the timer to wait for phy up. Link: https://lore.kernel.org/r/1589552025-165012-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
11e67320 |
|
20-Jan-2020 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask In future we will want to use hisi_sas_cq.pci_irq_mask for non-pci interrupt masks, so rename to be more general. Link: https://lore.kernel.org/r/1579522957-4393-7-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3cd2f3c3 |
|
20-Jan-2020 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Modify the file permissions of trigger_dump to write only The trigger_dump file is only used to manually trigger the dump, and did not provide a read callback function for it, so its file permission setting to 600 is wrong,and should be changed to 200. Link: https://lore.kernel.org/r/1579522957-4393-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e9dc5e11 |
|
20-Jan-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock After changing tasklet to workqueue or threaded irq, some critical resources are only used on threads (not in interrupt or bottom half of interrupt), so replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock to protect those critical resources. Link: https://lore.kernel.org/r/1579522957-4393-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
81f338e9 |
|
20-Jan-2020 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: use threaded irq to process CQ interrupts Currently IRQ_EFFECTIVE_AFF_MASK is enabled for ARM_GIC and ARM_GIC3, so it only allows a single target CPU in the affinity mask to process interrupts and also interrupt thread, and the performance of using threaded irq is almost the same as tasklet. But if the config is not enabled, the interrupt thread will be allowed all the CPUs in the affinity mask. At that situation it improves the performance (about 20%). Note: IRQ_EFFECTIVE_AFF_MASK is configured differently for different architecture chip, and it seems to be better to make it be configured easily. Link: https://lore.kernel.org/r/1579522957-4393-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
964231aa |
|
12-Nov-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Stop converting a bool into a bool The !! operator on a bool is pointless, so remove an example in hisi_sas_rescan_topology(). Link: https://lore.kernel.org/r/1573551059-107873-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8c39673d |
|
12-Nov-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Check sas_port before using it Need to check the structure sas_port before using it. Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f873b661 |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Record the phy down event in debugfs The number of phy down reflects the quality of the link between SAS controller and disk. In order to allow the user to confirm the link quality of the system, we record the number of phy down for each phy. The user can check the current phy down count by reading the debugfs file corresponding to the specific phy, or clear the phy down count by writing 0 to the debugfs file. Link: https://lore.kernel.org/r/1571926105-74636-19-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cabe7c10 |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Delete the debugfs folder of hisi_sas when the probe fails Although if the debugfs initialization fails, we will delete the debugfs folder of hisi_sas, but we did not consider the scenario where debugfs was successfully initialized, but the probe failed for other reasons. We found out that hisi_sas folder is still remain after the probe failed. When probe fail, we should delete debugfs folder to avoid the above issue. Link: https://lore.kernel.org/r/1571926105-74636-18-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8f643298 |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add ability to have multiple debugfs dumps We use the module parameter debugfs_dump_count to manage the upper limit of the memory block for multiple dumps. Link: https://lore.kernel.org/r/1571926105-74636-17-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
905ab01f |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add module parameter for debugfs dump count We still only use dump index #0 however. Link: https://lore.kernel.org/r/1571926105-74636-16-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a70e33ea |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Allocate memory for multiple dumps of debugfs We add multiple dumps for debugfs, but only allocate memory this time and only dump #0. Link: https://lore.kernel.org/r/1571926105-74636-15-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
357e4fc7 |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for ITCT cache Create a file structure which was used to save the memory address for ITCT cache at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-14-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b714dd8f |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for IOST cache Create a file structure which was used to save the memory address for IOST cache at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-13-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0161d55f |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for ITCT Create a file structure which was used to save the memory address for ITCT at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-12-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e15f2e2d |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for IOST Create a file structure which was used to save the memory address for IOST at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-11-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1f66e1fd |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for port Create a file structure which was used to save the memory address and phy pointer for port at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-10-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c6116398 |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for registers Create a file structure which was used to save the memory address and hisi_hba pointer for REGS at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-9-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1b54c4db |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for DQ Create a file structure which was used to save the memory address and DQ pointer for DQ at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-8-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
35ea630b |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs file structure for CQ Create a file structure which was used to save the memory address and CQ pointer for CQ at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d28ed83b |
|
24-Oct-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add timestamp for a debugfs dump It's useful to know when the dump occurred, so add a timestamp file for this. Link: https://lore.kernel.org/r/1571926105-74636-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
550c0d89 |
|
24-Oct-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec() For IOs from upper layer, preemption may be disabled as it may be called by function __blk_mq_delay_run_hw_queue which will call get_cpu() (it disables preemption). So if flags HISI_SAS_REJECT_CMD_BIT is set in function hisi_sas_task_exec(), it may disable preempt twice after down() and up() which will cause following call trace: BUG: scheduling while atomic: fio/60373/0x00000002 Call trace: dump_backtrace+0x0/0x150 show_stack+0x24/0x30 dump_stack+0xa0/0xc4 __schedule_bug+0x68/0x88 __schedule+0x4b8/0x548 schedule+0x40/0xd0 schedule_timeout+0x200/0x378 __down+0x78/0xc8 down+0x54/0x70 hisi_sas_task_exec.isra.10+0x598/0x8d8 [hisi_sas_main] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main] sas_queuecommand+0x168/0x1b0 [libsas] scsi_queue_rq+0x2ac/0x980 blk_mq_dispatch_rq_list+0xb0/0x550 blk_mq_do_dispatch_sched+0x6c/0x110 blk_mq_sched_dispatch_requests+0x114/0x1d8 __blk_mq_run_hw_queue+0xb8/0x130 __blk_mq_delay_run_hw_queue+0x1c0/0x220 blk_mq_run_hw_queue+0xb0/0x128 blk_mq_sched_insert_requests+0xdc/0x208 blk_mq_flush_plug_list+0x1b4/0x3a0 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x3c/0x50 blkdev_direct_IO+0x404/0x550 generic_file_read_iter+0x9c/0x848 blkdev_read_iter+0x50/0x78 aio_read+0xc8/0x170 io_submit_one+0x1fc/0x8d8 __arm64_sys_io_submit+0xdc/0x280 el0_svc_common.constprop.0+0xe0/0x1e0 el0_svc_handler+0x34/0x90 el0_svc+0x10/0x14 ... To solve the issue, check preemptible() to avoid disabling preempt multiple when flag HISI_SAS_REJECT_CMD_BIT is set. Link: https://lore.kernel.org/r/1571926105-74636-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8fa9a7bd |
|
24-Oct-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: use wait_for_completion_timeout() when clearing ITCT When injecting 2bit ecc errors, it will cause confusion inside SAS controller which needs host reset to recover it. If a device is gone at the same times inject 2bit ecc errors, we may not receive the ITCT interrupt so it will wait for completion in clear_itct_v3_hw() all the time. And host reset will also not occur because it can't require hisi_hba->sem, so the system will be suspended. To solve the issue, use wait_for_completion_timeout() instead of wait_for_completion(), and also don't mark the gone device as SAS_PHY_UNUSED when device gone. Link: https://lore.kernel.org/r/1571926105-74636-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
35160421 |
|
24-Oct-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Don't create debugfs dump folder twice Due to a merge error, we attempt to create 2x debugfs dump folders, which fails: [ 861.101914] debugfs: Directory 'dump' with parent '0000:74:02.0' already present! This breaks the dump function. To fix, remove the superfluous attempt to create the folder. Fixes: 7ec7082c57ec ("scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation") Link: https://lore.kernel.org/r/1571926105-74636-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b1000fcc |
|
16-Sep-2019 |
Colin Ian King <colin.king@canonical.com> |
scsi: hisi_sas: fix spelling mistake "digial" -> "digital" There is a spelling mistake in literal string. Fix it. Link: https://lore.kernel.org/r/20190916091706.32268-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4b6b1bb6 |
|
22-Sep-2019 |
YueHaibing <yuehaibing@huawei.com> |
scsi: hisi_sas: Make three functions static Fix sparse warnings: drivers/scsi/hisi_sas/hisi_sas_main.c:3686:6: warning: symbol 'hisi_sas_debugfs_release' was not declared. Should it be static? drivers/scsi/hisi_sas/hisi_sas_main.c:3708:5: warning: symbol 'hisi_sas_debugfs_alloc' was not declared. Should it be static? drivers/scsi/hisi_sas/hisi_sas_main.c:3799:6: warning: symbol 'hisi_sas_debugfs_bist_init' was not declared. Should it be static? Link: https://lore.kernel.org/r/20190923054035.19036-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e74006ed |
|
06-Sep-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Fix the conflict between device gone and host reset When device gone, it will check whether it is during reset, if not, it will send internal task abort. Before internal task abort returned, reset begins, and it will check whether SAS_PHY_UNUSED is set, if not, it will call hisi_sas_init_device(), but at that time domain_device may already be freed or part of it is freed, so it may referenece null pointer in hisi_sas_init_device(). It may occur as follows: thread0 thread1 hisi_sas_dev_gone() check whether in RESET(no) internal task abort reset prep soft_reset ... (part of reset_done) internal task abort failed release resource anyway clear_itct device->lldd_dev=NULL hisi_sas_reset_init_all_device check sas_dev->dev_type is SAS_PHY_UNUSED and !device set dev_type SAS_PHY_UNUSED sas_free_device hisi_sas_init_device ... Semaphore hisi_hba.sema is used to sync the processes of device gone and host reset. To solve the issue, expand the scope that semaphore protects and let them never occur together. And also some places will check whether domain_device is NULL to judge whether the device is gone. So when device gone, need to clear sas_dev->sas_device. Link: https://lore.kernel.org/r/1567774537-20003-14-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
97b151e7 |
|
06-Sep-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Add BIST support for phy loopback Add BIST (built in self test) support for phy loopback. Through the new debugfs interface, the user can configure loopback mode/linkrate/phy id/code mode before enabling it. And also user can enable/disable BIST function. Link: https://lore.kernel.org/r/1567774537-20003-13-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7ec7082c |
|
06-Sep-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation We extract the code of memory allocate and construct an new function for it. We think it's convenient for subsequent optimization. Link: https://lore.kernel.org/r/1567774537-20003-12-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4bc05809 |
|
06-Sep-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Remove some unused function arguments Some function arguments are unused, so remove them. Also move the timeout print in for wait_cmds_complete_timeout_vX_hw() callsites into that same function. Link: https://lore.kernel.org/r/1567774537-20003-11-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
435a05cf |
|
06-Sep-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Assign NCQ tag for all NCQ commands Currently the NCQ tag is only assigned for FPDMA READ and FPDMA WRITE commands, and for other NCQ commands (such as FPDMA SEND), their NCQ tags are set in the delivery command to 0. So for all the NCQ commands, we also need to assign normal NCQ tag for them, so drop the command type check in hisi_sas_get_ncq_tag() [drop hisi_sas_get_ncq_tag() altogether actually], and always use the ATA command NCQ tag when appropriate. Link: https://lore.kernel.org/r/1567774537-20003-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b45e05aa |
|
06-Sep-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device When init device for SAS disks, it will send TMF IO to clear disks. At that time TMF IO is broken by some operations such as injecting controller reset from HW RAs event, the TMF IO will be timeout, and at last device will be gone. Print is as followed: hisi_sas_v3_hw 0000:74:02.0: dev[240:1] found ... hisi_sas_v3_hw 0000:74:02.0: controller resetting... hisi_sas_v3_hw 0000:74:02.0: phyup: phy7 link_rate=10(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy0 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy2 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy3 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy6 link_rate=10(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy5 link_rate=11 hisi_sas_v3_hw 0000:74:02.0: phyup: phy4 link_rate=11 hisi_sas_v3_hw 0000:74:02.0: controller reset complete hisi_sas_v3_hw 0000:74:02.0: abort tmf: TMF task timeout and not done hisi_sas_v3_hw 0000:74:02.0: dev[240:1] is gone sas: driver on host 0000:74:02.0 cannot handle device 5000c500a75a860d, error:5 To improve the reliability, retry TMF IO max of 3 times for SAS disks which is the same as softreset does. Link: https://lore.kernel.org/r/1567774537-20003-6-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
76dd768b |
|
06-Sep-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails At expander environment, we delay after issue phy reset to wait for hardware to handle phy reset. But if sas_smp_phy_control() fails, the delay is unnecessary so remove it. Link: https://lore.kernel.org/r/1567774537-20003-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c2bae4f7 |
|
06-Sep-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled At hisi_sas_debug_I_T_nexus_reset(), we call sas_phy_reset() to reset a phy. But if the phy is disabled, sas_phy_reset() will directly return -ENODEV without issue a phy reset request. If so, We can directly return -ENODEV to libsas before issue a phy reset. Link: https://lore.kernel.org/r/1567774537-20003-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
af01b2b9 |
|
06-Sep-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset() When calling sas_phy_reset(), we need to specify whether the reset type is hard reset or link reset - use true/false for clarity. Link: https://lore.kernel.org/r/1567774537-20003-3-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7105e68a |
|
06-Sep-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: add debugfs auto-trigger for internal abort time out This trigger is add at _hisi_sas_internal_task_abort() Link: https://lore.kernel.org/r/1567774537-20003-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c0c1a71e |
|
04-Sep-2019 |
YueHaibing <yuehaibing@huawei.com> |
scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code Use devm_platform_ioremap_resource() to simplify the code a bit. This is detected by coccinelle. Link: https://lore.kernel.org/r/20190904130256.24704-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a5ac1f5d |
|
05-Aug-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Consolidate internal abort calls in LU reset operation In hisi_sas_lu_reset(), we call internal abort for SAS and SATA device codepaths -> consolidate into a single call. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e7513f66 |
|
05-Aug-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: replace "%p" with "%pK" The format specifier "%p" can leak kernel address, and use "%pK" instead. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a07b4876 |
|
05-Aug-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Remove some unnecessary code Remove some unnecessary code, including: - Explicit zeroing of memory allocated for dmam_alloc_coherent() - Some duplicated code - Some redundant masking Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7bf18e84 |
|
05-Aug-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Modify return type of debugfs functions For functions which always return 0, which is never checked, make to return void. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5f6c32d7 |
|
05-Aug-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Drop SMP resp frame DMA mapping The SMP frame response is written to the command table and not the SMP response pointer from libsas, so don't bother DMA mapping (and unmapping) the SMP response from libsas. Suggested-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
599aefc8 |
|
05-Aug-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Make slot buf minimum allocation of PAGE_SIZE For a system with PAGE_SIZE of 16K or 64K, the size every time we want to alloc may be small like 4K, but for function dmam_alloc_coherent(), the least size it allocates is PAGE_SIZE, so it will waste much memory for the situation. To solve the issue, limit the minimum allocation size of slot buf to PAGE_SIZE. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d380f555 |
|
05-Aug-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Don't bother clearing status buffer IU in task prep For struct hisi_sas_status_buffer, it contains struct hisi_sas_err_record and iu[1024]. The struct iu[1024] will be filled fully by the response of disks, so it is not need to initialize them to 0, but for the struct hisi_sas_err_record, SAS controller only fill some fields of hisi_sas_err_record according to hw designer, so it should be initialised to 0. After the change, cpu utilization percentage of memset() is changed from 1.7% to 0.12%. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
445ee2de |
|
05-Aug-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset() Fix a possible out-of-bounds access in hisi_sas_debug_I_T_nexus_reset(). Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b0b3e429 |
|
05-Aug-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Snapshot AXI and RAS register at debugfs The AXI and RAS register values should also should be snapshot at debugfs. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bbe0a7b3 |
|
05-Aug-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Snapshot HW cache of IOST and ITCT at debugfs The value of IOST/ITCT is updated to cache first, and then synchronize to DDR periodically. So the value in IOST/ITCT cache is the latest data and it's important for debugging. So, the HW cache of IOST and ITCT should be snapshot at debugfs. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bee0cf25 |
|
05-Aug-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Fix pointer usage error in show debugfs IOST/ITCT Fix how the pointer is set in hisi_sas_debugfs_iost_show() and hisi_sas_debugfs_itct_show(). Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
897cc769 |
|
05-Aug-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Drop hisi_sas_hw.get_free_slot In commit 1273d65f29045 ("scsi: hisi_sas: change queue depth from 512 to 4096"), the depth of each queue is the same as the max IPTT in the system. As such, as long as we have an IPTT allocated, we will have enough space on any delivery queue. All .get_free_slot functions were checking for space on the queue by reading the DQ read pointer. Drop this, and also raise the code into common code, as there is nothing hw specific remaining. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
93352abc |
|
05-Aug-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Make max IPTT count equal for all hw revisions There is a small optimisation to be had by making the max IPTT the same for all hw revisions, that being we can drop the check for read and write pointer being the same in the get free slot function. Change v1 hw to have max IPTT of 4096 - same as v2 and v3 hw - and drop hisi_sas_hw.max_command_entries. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
924a3541 |
|
10-Jun-2019 |
John Garry <john.garry@huawei.com> |
scsi: libsas: aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander() Many times in libsas, and in LLDDs which use libsas, the check for an expander device is re-implemented or open coded. Use dev_is_expander() instead. We rename this from sas_dev_type_is_expander() to not spill so many lines in referencing. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6c86e046 |
|
29-May-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Delete PHY timers when rmmod or probe failed When removing the driver or when probe fails, we need to delete the PHY timers. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2874c5fd |
|
27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
01d4e3a2 |
|
11-Apr-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Some misc tidy-up Do some minor tidy-up. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
246ea3c0 |
|
11-Apr-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Don't fail IT nexus reset for Open Reject timeout Currently we call hisi_sas_softreset_ata_disk() in hisi_sas_I_T_nexus_reset(). If this fails for open reject reason, there is no reason to fail the IT nexus reset, so only fail for TMF_RESP_FUNC_FAILED. Some other strings spilled over multiple lines are reunited. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a3115700 |
|
11-Apr-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Don't hard reset disk during controller reset In the function of hisi_sas_init_device(), we added ops->hardreset() to clear affiliation of STP target port or handle [STP pending] state. Function hisi_sas_init_device() will be called when a device is found or during controller reset. At controller reset, we call hisi_sas_init_device() to re-init the disks, so calling hardreset() is unnecessary and it also will cause some delay at controller reset. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
18a54b32 |
|
11-Apr-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Adjust the printk format of functions hisi_sas_init_device() In function hisi_sas_init_device(), the log is as follows when error for hardreset: hisi_sas_v3_hw 0000:74:02.0: SATA disk hardreset fail: 0xffffffed Actually if hardreset failed, its return value is negative, so change the print format from %x to %d. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c63b88cc |
|
11-Apr-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Fix for setting the PHY linkrate when disconnected In commit efdcad62e7b8 ("scsi: hisi_sas: Set PHY linkrate when disconnected"), we use the sas_phy_data.enable flag to track whether the PHY was enabled or not, so that we know if we should set the PHY negotiated linkrate at SAS_LINK_RATE_UNKNOWN or SAS_PHY_DISABLED. However, it is not proper to use sas_phy_data.enable, since it is only set when libsas attempts to set the PHY disabled/enabled; hence, it may not even have an initial value. As a solution to this problem, introduce hisi_sas_phy.enable to track whether the PHY is enabled or not, so that we can set the negotiated linkrate properly when the PHY comes down. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
447f78c0 |
|
11-Apr-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Remedy inconsistent PHY down state in software Currently there are two scenarioes which may cause PHY state of hardware (which is 0) is inconsistent with the state held in software: - Unplug SAS wire before get_phys_state when SAS controller reset, then the interrupts of phy down are ignored, phy state is 0 before reset, and it also gets 0 after reset, so phy down doesn't occur even if unplugged SAS wire; - For v3 hw later version, it will close bus when 2 bit ECC error occurs. So if unplug SAS wire at that time, interrupts of phy down also not occur. So at last it will cause host reset. It also get phy state 0 before and after reset, the same issue occurs. To solve it, use hisi_sas_phy_down() directly in rescan topology function. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a97fa586 |
|
11-Apr-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add host reset interface for test Add host reset interface to make it easier for testing the host reset feature. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0e83fc61 |
|
20-Mar-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add softreset in hisi_sas_I_T_nexus_reset() We found out that for v2 hw, a SATA disk can not be written to after the system comes up. In commit ffb1c820b8b6 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()"), we introduced a path where we may issue an internal abort for a SATA device, but without following it with a softreset. We need to always follow an internal abort with a software reset, as per HW programming flow, so add this. Fixes: ffb1c820b8b6 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()") Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
57dbb2b2 |
|
28-Feb-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port If we exchange SAS expander from one SAS controller to other SAS controller without powering it down, the STP target port will maintain previous affiliation and reject all subsequent connection requests from other STP initiator ports with OPEN_REJECT (STP RESOURCES BUSY). To solve this issue, send HARD RESET to clear the previous affiliation of STP target port according to SPL (chapter 6.19.4). We (re-)introduce dev status flag to know if to sleep in NEXUS reset code or not for remote PHYs. The idea is that if the device is being initialised, we don't require the delay, and caller would wait for link to be established, cf. sas_ata_hard_reset(). Co-developed-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
efdcad62 |
|
28-Feb-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Set PHY linkrate when disconnected When the PHY comes down, we currently do not set the negotiated linkrate: root@(none)$ pwd /sys/class/sas_phy/phy-0:0 root@(none)$ more enable 1 root@(none)$ more negotiated_linkrate 12.0 Gbit root@(none)$ echo 0 > enable root@(none)$ more negotiated_linkrate 12.0 Gbit root@(none)$ This patch fixes the driver code to set it properly when the PHY comes down. If the PHY had been enabled, then set unknown; otherwise, flag as disabled. The logical place to set the negotiated linkrate for this scenario is PHY down routine, which is called from the PHY down ISR. However, it is not possible to know if the PHY comes down due to PHY disable or loss of link, as sas_phy.enabled member is not set until after the transport disable routine is complete, which races with the PHY down ISR. As an imperfect solution, use sas_phy_data.enable as the flag to know if the PHY is down due to disable. It's imperfect, as sas_phy_data is internal to libsas. I can't see another way without adding a new field to hisi_sas_phy and managing it, or changing SCSI SAS transport. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
47905957 |
|
28-Feb-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO For internal IO and SMP IO, there is a time-out timer for them. In the timer handler, it checks whether IO is done according to the flag task->task_state_lock. There is an issue which may cause system suspended: internal IO or SMP IO is sent, but at that time because of hardware exception (such as inject 2Bit ECC error), so IO is not completed and also not timeout. But, at that time, the SAS controller reset occurs to recover system. It will release the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO timeout, it will never complete the completion of IO and wait for ever. [ 729.123632] Call trace: [ 729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8 [ 729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc [ 729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c [ 729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc [ 729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0 [ 729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34 [ 729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main] [ 729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main] [ 729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8 [ 729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398 [ 729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0 [ 729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138 [ 729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18 To solve the issue, callback function task_done of those IOs need to be called when on SAS controller reset. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d9a00459 |
|
18-Feb-2019 |
Hannes Reinecke <hare@suse.de> |
scsi: hisi_sas: fix calls to dma_set_mask_and_coherent() The change to use dma_set_mask_and_coherent() incorrectly made a second call with the 32 bit DMA mask value when the call with the 64 bit DMA mask value succeeded. [mkp: fixed commit message] Fixes: e4db40e7a1a2 ("scsi: hisi_sas: use dma_set_mask_and_coherent") Cc: <stable@vger.kernel.org> Suggested-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4a8bec88 |
|
06-Feb-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Do some more tidy-up Do some very minor tidy-up, for things like needlessly initing variable and not leaving whitespace before quote endings. Originally-from: Xiang Chen <chenxiang66@hisilicon.com> Originally-from: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4fefe5bb |
|
06-Feb-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Use pci_irq_get_affinity() for v3 hw as experimental For auto-control irq affinity mode, choose the dq to deliver IO according to the current CPU. Then it decreases the performance regression that fio and CQ interrupts are processed on different node. For user control irq affinity mode, keep it as before. To realize it, also need to distinguish the usage of dq lock and sas_dev lock. We mark as experimental due to ongoing discussion on managed MSI IRQ during hotplug: https://marc.info/?l=linux-scsi&m=154876335707751&w=2 We're almost at the point where we can expose multiple queues to the upper layer for SCSI MQ, but we need to sort out the per-HBA tags performance issue. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
795f25a3 |
|
06-Feb-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Issue internal abort on all relevant queues To support queue mapped to a CPU, it needs to be ensured that issuing an internal abort is safe, in that it is guaranteed that an internal abort is processed for a single IO or a device after all the relevant command(s) which it is attempting to abort have been processed by the controller. Currently we only deliver commands for any device on a single queue to solve this problem, as we know that commands issued on the same queue will be processed in order, and we will not have a scenario where the internal abort is racing against a command(s) which it is trying to abort. To enqueue commands on queue mapped to a CPU, choosing a queue for an command is based on the associated queue for the current CPU, so this is not safe for internal abort since it would definitely not be guaranteed that commands for the command devices are issued on the same queue. To solve this issue, we take a bludgeoning approach, and issue a separate internal abort on any queue(s) relevant to the command or device, in that we will be guaranteed that at least one of these internal aborts will be received last in the controller. So, for aborting a single command, we can just force the internal abort to be issued on the same queue as the command which we are trying to abort. For aborting all commands associated with a device, we issue a separate internal abort on all relevant queues. Issuing multiple internal aborts in this fashion would have not side affect. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7c5e1363 |
|
06-Feb-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add manual trigger for debugfs dump Add an interface to manually trigger a debugfs dump. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b3cce125 |
|
06-Feb-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Add support for DIX feature for v3 hw This patch adds support for DIX to v3 hw driver. For this, we build upon support for DIF, most significantly is adding new DMA map and unmap paths. Some pre-existing macro precedence issues are also tidied. They were detected by checkpatch --strict. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ede2afb9 |
|
25-Jan-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Add missing seq_printf() call in hisi_sas_show_row_32() This call must have been missed when I reworked the debugfs feature for upstreaming, so add it back. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
26889e5e |
|
25-Jan-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Some misc tidy-up Sparse detected some problems in the driver, so tidy them up. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d1548e9c |
|
25-Jan-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Correct memory allocation size for DQ debugfs Some sizes we allocate for debugfs structure are incorrect, so fix them. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b6c9b15e |
|
25-Jan-2019 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Fix losing directly attached disk when hot-plug Hot-plugging SAS wire of direct hard disk backplane may cause disk lost. We have done this test with several types of SATA disk from different venders, and only two models from Seagate has this problem, ST4000NM0035-1V4107 and ST3000VM002-1ET166. The root cause is that the disk doesn't send D2H frame after OOB finished. SAS controller will issue phyup interrupt only when D2H frame is received, otherwise, will be waiting there all the time. When this issue happen, we can find the disk again with link reset. To fix this issue, we setup an timer after OOB finished. If the PHY is not up in 20s, do link reset. Notes: the 20s is an experience value. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
eb44e4d7 |
|
25-Jan-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Reject setting programmed minimum linkrate > 1.5G The SAS controller cannot support a programmed minimum linkrate of > 1.5G (it will always negotiate to 1.5G at least), so just reject it. This solves a strange situation where the PHY negotiated linkrate may be less than the programmed minimum linkrate. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ae68b566 |
|
25-Jan-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Remove unused parameter of function hisi_sas_alloc() In function hisi_sas_alloc(), parameter shost is not used, so remove it. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ffb1c820 |
|
25-Jan-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset() When issing a hardreset to a SATA device when running IO, it is possible that abnormal CQs of the device are returned. Then enter error handler, it doesn't enter function hisi_sas_abort_task() as there is no timeout IO, and it doesn't set device as HISI_SAS_DEV_EH. So when hardreset by libata later, it actually doesn't issue hardreset as there is a check to judge whether device is in error. For this situation, actually need to hardreset the device to recover. So remove the check of sas_dev status in hisi_sas_I_T_nexus_reset(). Before we add the check to avoid the endless loop of reset for directly-attached SATA device at probe time, actually we flutter it for it, so it is not necessary to add the check now. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
569eddcf |
|
25-Jan-2019 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: send primitive NOTIFY to SSP situation only Send primitive NOTIFY to SSP situation only, or it causes underflow issue when sending IO. Also rename hisi_sas_hw.sl_notify() to hisi_sas_hw. sl_notify_ssp(). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5979f33b |
|
25-Jan-2019 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs ITCT file and add file operations This patch creates debugfs file for ITCT and adds file operations. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5b0eeac4 |
|
25-Jan-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Fix type casting and missing static qualifier in debugfs code Sparse can detect some type casting issues in the debugfs code, so fix it up. Also a missing static qualifier is added to hisi_sas_debugfs_to_reg_name(). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c2c7e740 |
|
25-Jan-2019 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: No need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1afb4b85 |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs IOST file and add file operations This patch create debugfs file for IOST and add file operations. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
148e379f |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs DQ file and add file operations This patch create debugfs file for DQ and add file operations Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
971afae7 |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs CQ file and add file operations This patch create debugfs file for CQ and add file operations. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
61a6ebf3 |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Add debugfs for port registers This patch create debugfs file for port register and add file operations. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
caefac19 |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Debugfs global register create file and add file operations This patch create debugfs file for global register and add file operations. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
49159a5e |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Take debugfs snapshot for all regs This patch takes snapshot for global regs, port regs, CQ, DQ, IOST, ITCT. Add code for snapshot trig and generate dump directory. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
eb1c2b72 |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Alloc debugfs snapshot buffer memory for all registers This patch allocates snapshot memory for global reg, port regs, CQ, DQ, IOST, ITCT. When we fail to allocate memory for some registers, we free the memory and set hisi_sas_debugfs_enable as 0 to stop loading debugfs from running. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ef63464b |
|
19-Dec-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Create root and device debugfs directories This patch creates root directory at hisi_sas_init() and generates device directory when we probe device driver. And we remove the root directory at hisi_sas_exit(), but recursively delete device directory when we remove device driver. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6db831f4 |
|
06-Dec-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Make sg_tablesize consistent value Sht->sg_tablesize is set in the driver, and it will be assigned to shost->sg_tablesize in SCSI mid-layer. So it is not necessary to assign shost->sg_table one more time in the driver. In addition to the change, change each scsi_host_template.sg_tablesize to HISI_SAS_SGE_PAGE_CNT instead of SG_ALL. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6e1b731b |
|
06-Dec-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Relocate some code to reduce complexity Relocate the codes related to dma_map/unmap in hisi_sas_task_prep() to reduce complexity, with a view to add DIF/DIX support. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
735bcc77 |
|
06-Dec-2018 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Fix warnings detected by sparse This patchset fixes some warnings detected by the sparse tool, like these: drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: warning: incorrect type in assignment (different base types) drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: expected unsigned short [unsigned] [assigned] [usertype] tag_of_task_to_be_managed drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: got restricted __le16 [usertype] <noident> drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: warning: incorrect type in assignment (different base types) drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: expected unsigned short [unsigned] [assigned] [usertype] tag_of_task_to_be_managed drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: got restricted __le16 [usertype] <noident> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
745b6847 |
|
09-Nov-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Relocate some codes to avoid an unused check In function hisi_sas_task_prep(), we check asd_sas_port, but in function hisi_sas_task_exec(), we already refer to asd_sas_port by using function dev_to_hisi_hba() implicitly. So to avoid this possible invalid dereference, relocate the check to function hisi_sas_task_prep(). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c3566f9a |
|
09-Nov-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Create separate host attributes per HBA Currently all the three HBA (v1/v2/v3 HW) share the same host attributes. To support each HBA having separate attributes in future, create per-HBA attributes. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f4445bb9 |
|
18-Oct-2018 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
scsi: hisi_sas: Fix NULL pointer dereference There is a NULL pointer dereference in case *slot* happens to be NULL at lines 1053 and 1878: struct hisi_sas_cq *cq = &hisi_hba->cq[slot->dlvry_queue]; Notice that *slot* is being NULL checked at lines 1057 and 1881: if (slot), which implies it may be NULL. Fix this by placing the declaration and definition of variable cq, which contains the pointer dereference slot->dlvry_queue, after slot has been properly NULL checked. Addresses-Coverity-ID: 1474515 ("Dereference before null check") Addresses-Coverity-ID: 1474520 ("Dereference before null check") Fixes: 584f53fe5f52 ("scsi: hisi_sas: Fix the race between IO completion and timeout for SMP/internal IO") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
784b46b7 |
|
24-Sep-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Use block layer tag instead for IPTT Currently we use the IPTT defined in LLDD to identify IOs. Actually for IOs which are from the block layer, they have tags to identify them. So for those IOs, use tag of the block layer directly, and for IOs which is not from the block layer (such as internal IOs from libsas/LLDD), reserve 96 IPTTs for them. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
584f53fe |
|
24-Sep-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Fix the race between IO completion and timeout for SMP/internal IO If SMP/internal IO times out, we will possibly free the task immediately. However if the IO actually completes at the same time, the IO completion may refer to task which has been freed. So to solve the issue, flush the tasklet to finish IO completion before free'ing slot/task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1668e3b6 |
|
24-Sep-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Move evaluation of hisi_hba in hisi_sas_task_prep() In evaluating hisi_hba, the sas_port may be NULL, so for safety relocate the the check to value possible NULL deference. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5a54691f |
|
24-Sep-2018 |
Luo Jiaxing <luojiaxing@huawei.com> |
scsi: hisi_sas: Feed back linkrate(max/min) when re-attached At directly attached situation, if the user modifies the sysfs interface of maximum_linkrate and minimum_linkrate to renegotiate the linkrate between SAS controller and target, the value of both files mentioned above should have change to user setting after renegotiate is over, but it remains unchanged. To fix this bug, maximum_linkrate and minimum_linkrate will be directly fed back to relevant sas_phy structure. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
640208a1 |
|
24-Sep-2018 |
Jason Yan <yanaijie@huawei.com> |
scsi: libsas: make the lldd_port_deformed method optional Now LLDDs have to implement lldd_port_deformed method otherwise NULL dereference will happen. Make it optional and remove the dummy implementation in hisi_sas. Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> CC: Hannes Reinecke <hare@suse.com> Acked-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1c09b663 |
|
18-Jul-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: add memory barrier in task delivery function In task start delivery function, we need to add a memory barrier to prevent re-ordering of reading memory by hardware. Because the slot data is set in task prepare function and it could be running in another CPU. This patch adds an memory barrier after s->ready is read in the task start delivery function, and uses WRITE_ONCE() in the places where s->ready is set to ensure that the compiler does not re-order. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6cca51ee |
|
18-Jul-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Tidy hisi_sas_task_prep() To decrease the usage of spinlock during delivery IO, relocate some code in hisi_sas_task_prep(). Also an invalid comment is removed. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4522204a |
|
18-Jul-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: tidy host controller reset function a bit This patch tidies host controller reset function by putting some code to two new functions, and exports these two functions out, so that they could be used by FLR feature to be realised. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4e32b2f4 |
|
18-Jul-2018 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Drop hisi_sas_slot_abort() For some time now we have not used hisi_sas_slot_abort() to handle erroring slots, apart from in archaic v1 hw. As such, remove this function and associated code. For v1 hw, move error handling to same scheme as other hw revisions, where we allow erroring commands to timeout. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ce70c2e6 |
|
31-May-2018 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Add missing PHY spinlock init The init is missed for hisi_sas_phy spinlock, so add it. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2ba5afb6 |
|
31-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Pre-allocate slot DMA buffers Currently the driver spends much time allocating and freeing the slot DMA buffer for command delivery/completion. To boost the performance, pre-allocate the buffers for all IPTT. The downside of this approach is that we are reallocating all buffer memory upfront, so hog memory which we may not need. However, the current method - DMA buffer pool - also caches all buffers and does not free them until the pool is destroyed, so is not exactly efficient either. On top of this, since the slot DMA buffer is slightly bigger than a 4K page, we need to allocate 2x4K pages per buffer (for 4K page kernel), which is quite wasteful. For 64K page size this is not such an issue. So, for the 4K page case, in order to make memory usage more efficient, pre-allocating larger blocks of DMA memory for the buffers can be more efficient. To make DMA memory usage most efficient, we would choose a single contiguous DMA memory block, but this could use up all the DMA memory in the system (when CMA enabled and no IOMMU), or we may just not be able to allocate a DMA buffer large enough when no CMA or IOMMU. To decide the block size we use the LCM (least common multiple) of the buffer size and the page size. We roundup(64) to ensure the LCM is not too large, even though a little memory may be wasted per block. So, with this, the total memory requirement is about is about 17MB for 4096 max IPTT. Previously (for 4K pages case), it would be 32MB (for all slots allocated). With this change, the relative increase of IOPS for bs=4K read when PAGE_SIZE=4K and PAGE_SIZE=64K is as follows: IODEPTH 4K PAGE_SIZE 64K PAGE_SIZE 32 56% 47% 64 53% 44% 128 64% 43% 256 67% 45% Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f2ae8d04 |
|
31-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Release all remaining resources in clear nexus ha In host reset, we use TMF or soft-reset to re-init device, and if success, we will release all LLDD resources of this device. If the init fails - maybe because the device was removed or link has not come up - then do not release the LLDD resources, but rather rely on SCSI EH to handle the timeout for these resources later on. But if clear nexus ha calls host reset, which is the last effort of SCSI EH, we should release all LLDD remain resources. Because SCSI EH will release all tasks after clear nexus ha. Before release, we do I_T nexus reset to try to clear target remain IOs. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ed99e1d9 |
|
31-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Add a flag to filter PHY events during reset During reset, we don't want PHY events reported to libsas for PHYs which were previously attached prior to reset. So check hisi_hba->flags for HISI_SAS_RESET_BIT to filter PHY events during reset. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
214e702d |
|
31-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Adjust task reject period during host reset After soft_reset() for host reset, we should not be allowed to send commands to the HW before the PHYs have come up and the port ids have been refreshed. Prior to this point, any commands cannot be successfully completed. This exclusion is achieved by grabbing the host reset semaphore. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d2fc401e |
|
31-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Fix the conflict between dev gone and host reset There is a possible conflict when a device is removed and host reset occurs concurrently. The reason is that then the device is notified as gone, we try to clear the ITCT, which is notified via an interrupt. The dev gone function pends on this event with a completion, which is completed when the ITCT interrupt occurs. But host reset will disable all interrupts, the wait_for_completion() may wait indefinitely. This patch adds an semaphore to synchronise this two processes. The semaphore is taken by the host reset as the basis of synchronising. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4e63ac82 |
|
31-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Use dmam_alloc_coherent() This patch replaces the usage of dma_alloc_coherent() with the managed version, dmam_alloc_coherent(), hereby reducing replicated code. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by; John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3e1fb1b8 |
|
21-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Mark PHY as in reset for nexus reset When issuing a nexus reset for directly attached device, we want to ignore the PHY down events so libsas will not deform and reform the port. In the case that the attached SAS changes for the reset, libsas will deform and form a port. For scenario that the PHY does not come up after a timeout period, then report the PHY down to libsas. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d87e72fb |
|
21-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Fix return value when get_free_slot() failed It is an step of executing task to get free slot. If the step fails, we will cleanup LLDD resources and should return failure to upper layer or internal caller to abort task execution of this time. But in the current code, the caller of get_free_slot() doesn't return failure when get_free_slot() failed. This patch is to fix it. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
31709548 |
|
21-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Terminate STP reject quickly for v2 hw For v2 hw, STP link from target is rejected after host reset because of a SoC bug. The STP reject will be terminated after we have sent IO from each PHY of a port. This is not an problem before, as we don't need to setup STP link from target immediately after host reset. But now, it is. Because we want to send soft-reset immediately after host reset. In order to terminate STP reject quickly, this patch send ATA reset command through each PHY of a port. Notes: ATA reset command don't need target's response. Besides, we do abort dev for each device before terminating STP reject. This is a quirk of v2 hw. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
78bd2b4f |
|
21-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot In future scenarios we will want to use the TMF struct for more task types than SSP. As such, we can add struct hisi_sas_tmf_task directly into struct hisi_sas_slot, and this will mean we can remove the TMF parameters from the task prep functions. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a865ae14 |
|
21-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Try wait commands before before controller reset We may reset the controller in many scenarios, such as SCSI EH and HW errors. There should be no IO which returns from target when SCSI EH is active. But for other scenarios, there may be. It is not necessary to make such IOs fail. This patch adds an function of trying to wait for any commands, or IO, to complete before host reset. If no more CQ returned from host controller in 100ms, we assume no more IO can return, and then stop waiting. We wait 5s at most. The HW has a register CQE_SEND_CNT to indicate the total number of CQs that has been reported to driver. We can use this register and it is reliable to resd this register in such scenarios that require host reset. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6175abde |
|
21-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: Init disks after controller reset After the controller is reset, it is possible that the disks attached still have outstanding IO to complete. Thus, when the PHYs come back up after controller reset, it is possible that these IOs complete at some unknown point later. We want to ensure that all IOs are complete after the controller reset so that all associated IPTT and other resources can be recycled safely. To achieve this, re-init the disks by TMF or softreset (in case of ATA devices). If the init fails - maybe because the device was removed or link has not come up - then do not release the device resources, but rather rely on SCSI EH to handle the timeout for these resources later on. This patch also does some cleanup to hisi_sas_init_disk(), including removing superfluous cases in the switch statement. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
235bfc7f |
|
21-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Create a scsi_host_template per HW module When a SCSI host is registered, the SCSI mid-layer takes a reference to a module in Scsi_host.hostt.module. In doing this, we are prevented from removing the driver module for the host in dangerous scenario, like when a disk is mounted. Currently there is only one scsi_host_template (sht) for all HW versions, and this is the main.c module. So this means that we can possibly remove the HW module in this dangerous scenario, as SCSI mid-layer is only referencing the main.c module. To fix this, create a sht per module, referencing that same module to create the Scsi host. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d5a60dfd |
|
21-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Reset disks when discovered When a disk is discovered, it may be in an error state, or there may be residual commands remaining in the disk. To ensure any disk is in good state after discovery, reset via TMF (for SAS disk) or softreset (for a SATA disk). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1b865185 |
|
21-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Change common allocation mode of device id To reduce possibility of hitting unknown SoC bugs and aid debugging and test, change allocation mode of device id from last used device id instead of lowest available index. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
fa3be0f2 |
|
21-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: change slot index allocation mode Currently we find the lowest available empty bit in the IPTT bitmap to allocate the IPTT for a command. To reduce possibility of hitting unknown SoC bugs and also aid in the debugging of those same bugs, change the allocation mode. The next allocation method is to use the next free slot adjacent to the most recently allocated slot, in a round-robin fashion. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
757db2da |
|
21-May-2018 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate() There is much common code and functionality between the HW versions to set the PHY linkrate. As such, this patch factors out the common code into a generic function hisi_sas_phy_set_linkrate(). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
eb217359 |
|
26-May-2018 |
Wei Yongjun <weiyongjun1@huawei.com> |
scsi: hisi_sas: fix a typo in hisi_sas_task_prep() Fix a typo in hisi_sas_task_prep(). Fixes: 7eee4b921822 ("scsi: hisi_sas: relocate smp sg map") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2f6bca20 |
|
09-May-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: add check of device in hisi_sas_task_exec() Currently we don't check that device is not gone before dereferencing its elements in the function hisi_sas_task_exec() (specifically, the DQ pointer). This patch fixes this issue by filling in the DQ pointer in hisi_sas_task_prep() after we check that the device pointer is still safe to reference. [mkp: typo] Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e85d93b2 |
|
09-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Use device lock to protect slot alloc/free The IPTT of a slot is unique, and we currently use hisi_hba lock to protect it. Now slot is managed on hisi_sas_device.list, so use DQ lock to protect for allocating and freeing the slot. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
fa222db0 |
|
09-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Don't lock DQ for complete task sending Currently we lock the DQ to protect whole delivery process. So this stops us building slots for the same queue in parallel, and can affect performance. To optimise it, only lock the DQ during special periods, specifically when allocating a slot from the DQ and when delivering a slot to the HW. This approach is now safe, thanks to the previous patches to ensure that we always deliver a slot to the HW once allocated. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3de0026d |
|
09-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: allocate slot buffer earlier Currently we allocate the slot's memory buffer after allocating the DQ slot. To aid DQ lockout reduction, and allow slots to be built in parallel, move this step (which can fail) prior to allocating the slot. Also a stray spin_unlock_irqrestore() is removed from internal task exec function. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a2b3820b |
|
09-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: make return type of prep functions void Since the task prep functions now should not fail, adjust the return types to void. In addition, some checks in the task prep functions are relocated to the main module; this is specifically the check for the number of elements in an sg list exceeded the HW SGE limit. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7eee4b92 |
|
09-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: relocate smp sg map Currently we use DQ lock to protect delivery of DQ entry one by one. To optimise to allow more than one slot to be built for a single DQ in parallel, we need to remove the DQ lock when preparing slots, prior to delivery. To achieve this, we rearrange the slot build order to ensure that once we allocate a slot for a task, we do cannot fail to deliver the task. In this patch, we rearrange the slot building for SMP tasks to ensure that sg mapping part (which can fail) happens before we allocate the slot in the DQ. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c2c1d9de |
|
02-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: update PHY linkrate after a controller reset After the controller is reset, we currently may not honour the PHY max linkrate set via sysfs, in that after a reset we always revert to max linkrate of 12Gbps, ignoring the value set via sysfs. This patch modifies to policy to set the programmed PHY linkrate, honouring the max linkrate programmed via sysfs. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6f7c32d6 |
|
02-May-2018 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: stop controller timer for reset We should only have the timer enabled after PHY up after controller reset, so disable prior to reset. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c6ef8954 |
|
02-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: check sas_dev gone earlier in hisi_sas_abort_task() It is possible to dereference a NULL-pointer in hisi_sas_abort_task() in special scenario when the device has been removed. If an SMP task times-out, it will call hisi_sas_abort_task() to recover. And currently there is a check in hisi_sas_abort_task() to avoid the situation of processing the abort for the removed device. However we have an ordering problem, in that we may reference a task for the removed device before checking if the device has been removed. Fix this by only referencing the sas_dev after we know it is still present. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cd938e53 |
|
02-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: check host frozen before calling "done" function When the host is frozen in SCSI EH state, at any point after the LLDD sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free the task; see sas_scsi_find_task(). This puts the LLDD in a difficult position, in that once it sets SAS_TASK_STATE_DONE for the task state it should not reference the sas_task again. But the LLDD needs will check the sas_task indirectly in calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done() (to check if the host is frozen state actually). And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after task->task_done() is called (as the sas_task is free'd at this point). This situation would seem to be a problem made by libsas. To work around, check in the LLDD whether the host is in frozen state to ensure it is ok to call task->task_done() function. If in the frozen state, we rely on SCSI EH and libsas to free the sas_task directly. We do not do this for the following IO types: - SMP - they are managed in libsas directly, outside SCSI EH - Any internally originated IO, for similar reason Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b81b6cce |
|
02-May-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice If the SCSI host enters EH, any pending IO will be processed by SCSI EH. However it is possible that SCSI EH will try to abort the IO and also at the same time the IO completes in the driver. In this situation there is a small chance of freeing the sas_task twice. Then if another IO re-uses freed sas_task before the second time of free'ing sas_task, it is possible to free incorrect sas_task. To avoid this situation, add some checks to increase reliability. The sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually protect the LLDD and libsas freeing the task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c90a0bea |
|
23-Mar-2018 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: remove some unneeded structure members This patch removes unneeded structure elements: - hisi_sas_phy.dev_sas_addr: only ever written - Also remove associated function which writes it, hisi_sas_init_add(). - hisi_sas_device.attached_phy: only ever written - Also remove code to set it in hisi_sas_dev_found() Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3ff0f0b6 |
|
23-Mar-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: consolidate command check in hisi_sas_get_ata_protocol() Currently we check the fis->command value in 2 locations in hisi_sas_get_ata_protocol() switch statement. Fix this by consolidating the check for fis->command value to 1 location only. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4f4e21b8 |
|
23-Mar-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: use dma_zalloc_coherent() This is a warning coming from Coccinelle, and need to use new interface dma_zalloc_coherent() instead of dma_alloc_coherent()/memset(). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5df41af4 |
|
23-Mar-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: delete timer when removing hisi_sas driver Delete timer for v1 and v3 hw when removing hisi_sas driver. Signed-off-by: Xiang chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
edafeef4 |
|
07-Mar-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Code cleanup and minor bug fixes The patch does some code cleanup and fixes some small bugs: - Correct return status of phy_up_v3_hw() and phy_bcast_v3_hw() - Add static for function phy_get_max_linkrate_v3_hw() - Change exception return status when no reset method - Change magic value to ts->stat in slot_complete_vx_hw() - Remove unnecessary check for dev_is_sata() - Fix some issues of alignment and indents (Authored by Xiaofei Tan in another patch, but added here to be practical) Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6bf6db51 |
|
07-Mar-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: fix return value of hisi_sas_task_prep() It is an implicit regulation that error code that function returned should be negative. But hisi_sas_task_prep() doesn't follow this. This may cause problems in the upper layer code. For example, in sas_expander.c of libsas, smp_execute_task_sg() may return the number of bytes of underrun. It will be conflicted with the scenaio lldd_execute_task() return an positive error code. This patch change the return value from SAS_PHY_DOWN to -ECOMM in hisi_sas_task_prep(). Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
36996a1e |
|
07-Mar-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req The structure element hisi_sas_devices.running_req to count how many commands are active is in effect only ever written in the code, so remove it. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bb9abc4a |
|
07-Mar-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: increase timer expire of internal abort task The current 110ms expiry time is not long enough for the internal abort task. The reason is that the internal abort task could be blocked in HW if the HW is retrying to set up link. The internal abort task will be executed only when the retry process finished. The maximum time is 5s for the retry of setting up link. So, the timer expire should be more than 5s. This patch increases it from 110ms to 6s. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
eba8c20c |
|
07-Mar-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: fix the issue of link rate inconsistency In sysfs, there are two files about minimum linkrate, and also two files for maximum linkrate. Take maximum linkrate example, maximum_linkrate_hw is read-only and indicated by the register HARD_PHY_LINKRATE, and maximum_linkrate is read-write and corresponding to the register PROG_PHY_LINK_RATE. But in the function phy_up_v*_hw(), we get *_linkrate value from HARD_PHY_LINKRATE. It is not right. This patch is to fix this issue. Unreferenced PHY-interrupt enum is also removed for v3 hw. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0d762b3a |
|
17-Jan-2018 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix a bug in hisi_sas_dev_gone() When device gone, NULL pointer can be accessed in free_device callback if during SAS controller reset as we clear structure sas_dev prior. Actually we can only set dev_type as SAS_PHY_UNUSED and not clear structure sas_dev as all the members of structure sas_dev will be re-initialized after device found. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6379c560 |
|
17-Jan-2018 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: directly attached disk LED feature for v2 hw This patch implements LED feature of directly attached disk for v2 hw. As libsas has provided an interface lldd_write_gpio() for this feature, we just need realise the interface following SPGIO API. We use an CPLD to finish the hardware part of this feature, and the base address of CPLD should be configured through ACPI or DT tables. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1e15feac |
|
11-Jan-2018 |
Wei Yongjun <weiyongjun1@huawei.com> |
scsi: hisi_sas: make local symbol host_attrs static Fixes the following sparse warning: drivers/scsi/hisi_sas/hisi_sas_main.c:1691:25: warning: symbol 'host_attrs' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
468f4b8d |
|
28-Dec-2017 |
chenxiang <chenxiang66@hisilicon.com> |
scsi: hisi_sas: Change frame type for SET MAX commands According to ATA protocol, SET MAX commands belong to different frame types. So judge features field of SET MAX commands to decide which frame type they belongs to. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8eea9dd8 |
|
08-Dec-2017 |
Jason Yan <yanaijie@huawei.com> |
scsi: libsas: make the event threshold configurable Add a sysfs attr that LLDD can configure it for every host. We made an example in hisi_sas. Other LLDDs using libsas can implement it if they want. Suggested-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Acked-by: John Garry <john.garry@huawei.com> #for hisi_sas part Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4d0951ee |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add v3 hw suspend and resume For v3 hw SAS, it supports configuring power state from D0 to D3 for entering Low Power status and power state from D3 to D0 for quit Low Power status. When power state from D0 to D3, HW will send FLR to clear the registers of ECAM and BAR space, and when power state from D3 to D0, it will clear the registers of ECAM space only. So when suspend, need to do like controller reset (including disable interrupts/DQ/PHY/BUS), and also release slots after FLR. When resume, re-config the registers of BAR space. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
336bd78b |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: re-add the lldd_port_deformed() In function sas_suspend_devices(), it requires callback lldd_port_deformed callback to be implemented if lldd_port_deformed is implemented. So add a stub for lldd_port_deformed. Callback lldd_port_deformed was not required as the port deformation is done elsewhere in the LLDD. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
9960a24a |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix SAS_QUEUE_FULL problem while running IO This patch fix SAS_QUEUE_FULL problem. The test situation is close port while running IO. In sas_eh_handle_sas_errors(), SCSI EH will free sas_task of the device if lldd_I_T_nexus_reset() return TMF_RESP_FUNC_COMPLETE or -ENODEV. But in our SAS driver, we only free slots of the device when the return value is TMF_RESP_FUNC_COMPLETE. So if the return value is -ENODEV, the slot resource will not free any more. As an solution, we should also free slots of the device in lldd_I_T_nexus_reset() if the return value is -ENODEV. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2a038131 |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: add internal abort dev in some places We should do internal abort dev before TMF_ABORT_TASK_SET and TMF_LU_RESET. Because we may only have done internal abort for single IO in the earlier part of SCSI EH process. Even the internal abort to the single IO, we also don't know whether it is successful. Besides, we should release slots of the device in hisi_sas_abort_task_set() if the abort is successful. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
813709f2 |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: judge result of internal abort Normally, hardware should ensure that internal abort timeout will never happen. If happen, it would be an SoC failure. What's more, HW will not process any other commands if an internal abort hasn't return CQ, and they will time out also. So, we should judge the result of internal abort in SCSI EH, if it is failed, we should give up to do TMF/softreset and return failure to the upper layer directly. This patch do following things to achieve this: 1. When internal abort timeout happened, we set return value to -EIO in hisi_sas_internal_task_abort(). 2. If prep_abort() is not support, let hisi_sas_internal_task_abort() return TMF_RESP_FUNC_FAILED. 3. If hisi_sas_internal_task_abort() return an negative number, it can be thought that it not executed properly or internal abort timeout. Then we won't do behind TMF or softreset, and return failure directly. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
057c3d1f |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: do link reset for some CHL_INT2 ints We should do link reset of PHY when identify timeout or STP link timeout. They are internal events of SOC and are notified to driver through interrupts of CHL_INT2. Besides, we should add an delay work to do link reset as it needs sleep. So, this patch add an new PHY event HISI_PHYE_LINK_RESET for this. Notes: v2 HW doesn't report the event of STP link timeout. So, we only need to handle event of identify timeout for v2 HW. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e537b62b |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: use an general way to delay PHY work Use an general way to do delay work for a PHY. Then it will be easier to add new delayed work for a PHY in future. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f1c88211 |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add some print to enhance debugging Add some print at some places such as error info and cq of exception IO, device found etc, and also adjust some log levels. All this to assist debugging ability. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e402acdb |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: add an mechanism to do reset work synchronously Sometimes it is required to know when the controller reset has completed and also if it has completed successfully. For such places, we call hisi_sas_controller_reset() directly before. That may lead to multiple calls to this function. This patch create a per-reset structure which contains a completion structure and status flag to know when the reset completes and also the status. It is also in hisi_hba.wq to do reset work. As all host reset works are done in hisi_hba.wq, we don't worry multiple calls to hisi_sas_controller_reset(). Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f8e45ec2 |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: modify hisi_sas_dev_gone() for reset Do a couple of changes for when HISI_SAS_RESET_BIT is set for HBA: - Clearing ITCT is not necessary - Remove internal abort as it will fail during reset Flag sas_dev->dev_type is kept as SAS_PHY_UNUSED. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
fb51e7a8 |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: some optimizations of host controller reset This patch do following optimizations to host controller reset: 1. Unblock scsi requests before rescanning topology, as SCSI command need be used if new device is found during rescanning topology. 2. Remove drain_workqueue(hisi_hba->wq) and drain_workqueue(shost->work_q), as there is no need to ensure that all PHYs event are done before exiting host reset. 3. Improve message print level of host reset. Host reset is an important and very few occurrence event. We should know its progress even when not debugging. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a669bdbf |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: optimise port id refresh function Currently refreshing the PHY port id after reset is done in the rescan topology function, which is quite late in the reset process. It could be moved earlier in the process, as the port id can be refreshed once the PHYs become ready. In addition to this, we should set the hisi_sas_dev port id to 0xff (invalid port id) if all PHYs of this port remain down for the same device. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0258141a |
|
08-Dec-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: relocate clearing ITCT and freeing device In certain scenarios we may just want to clear the ITCT for a device, and not free other resources like the SATA bitmap using in v2 hw. To facilitate this, this patch relocates the code of clearing ITCT from free_device() to a new hw interface clear_itct(). Then for some hw, we should not realise free_device() if there's nothing left to do for it. [mkp: typo] Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
dc1e4730 |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix dma_unmap_sg() parameter For function dma_unmap_sg(), the <nents> parameter should be number of elements in the scatterlist prior to the mapping, not after the mapping. Fix this usage. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
39bade0c |
|
08-Dec-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: initialize dq spinlock before use It is required to initialize the dq spinlock before use, which was not being done, so fix it. This issue can be detected when CONFIG_DEBUG_SPINLOCK is enabled. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
841b86f3 |
|
23-Oct-2017 |
Kees Cook <keescook@chromium.org> |
treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts With all callbacks converted, and the timer callback prototype switched over, the TIMER_FUNC_TYPE cast is no longer needed, so remove it. Conversion was done with the following scripts: perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \ $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u) perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \ $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u) The now unused macros are also dropped from include/linux/timer.h. Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
77570eed |
|
22-Aug-2017 |
Kees Cook <keescook@chromium.org> |
scsi: sas: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. This requires adding a pointer to hold the timer's target task, as there isn't a link back from slow_task. Cc: John Garry <john.garry@huawei.com> Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Jack Wang <jinpu.wang@profitbricks.com> Cc: lindar_liu@usish.com Cc: Jens Axboe <axboe@fb.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Cc: Baoyou Xie <baoyou.xie@linaro.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: John Garry <john.garry@huawei.com> # for hisi_sas part Tested-by: John Garry <john.garry@huawei.com> # basic sanity test for hisi_sas Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
|
#
571295f8 |
|
24-Oct-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: complete all tasklets prior to host reset The CQ event is handled in tasklet context, and it could be delayed if the system loading is high. It is possible to run into some problems when executing a host reset when cq_tasklet_vx_hw() is being executed. So, prior to host reset, execute tasklet_kill() to ensure that all CQ tasklets are complete. Besides, as the function hisi_sas_wait_tasklets_done() is added to do tasklet_kill(), this patch refactors some code where tasklet_kill() is used. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b4241f0f |
|
24-Oct-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: add hisi_hba.rst_work init for v3 hw Add init code of hisi_hba->rst_work for v3 hw. Because v3 hw also need it to recover controller when some hw errors occurs. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6ba0fbc3 |
|
24-Oct-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: fix the risk of freeing slot twice The function hisi_sas_slot_task_free() is used to free the slot and do tidy-up of LLDD resources. The LLDD generally should know the state of a slot and decide when to free it, and it should only be done once. For some scenarios, we really don't know the state, like when TMF timeout. In this case, we check task->lldd_task before calling hisi_sas_slot_task_free(). However, we may miss some scenarios when we should also check task->lldd_task, and it is not SMP safe to check task->lldd_task as we don't protect it within spin lock. This patch is to fix this risk of freeing slot twice, as follows: 1. Check task->lldd_task in the hisi_sas_slot_task_free(), and give up freeing of this time if task->lldd_task is NULL. 2. Set slot->buf to NULL after it is freed. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
378c233b |
|
24-Oct-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: fix NULL check in SMP abort task path This patch adds a NULL check of task->lldd_task before freeing the slot in SMP path. This is to guard against the scenario of the slot being freed during the from the preceding internal abort. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1eb8eeac |
|
24-Oct-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: us start_phy in PHY_FUNC_LINK_RESET When a PHY_FUNC_LINK_RESET is issued, we need to fill the transport identify_frame to SAS controller before the PHYs are enabled. Without this, we may find that if a PHY which belonged to a wideport before the reset may generate a new port id. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3297ded1 |
|
24-Oct-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix SATA breakpoint memory size Currently the size of memory we allocate for SATA breakpoint buffer is incorrect. The breakpoint memory size should be as follows: 32 (NCQ tags) * 128 * 2048 (max #devs) = 8MB Currently we only allocate 0.5MB, but get away with it as we never have SATA device index > 128 typically. To conserve precious DMA memory (8MB may not be even available), limit the number of devices per HBA to 1024, which means 4MB of memory required for SATA breakpoint. The 1024 device limit applied to all HW versions. For v3 hw, we need to configure this value. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
9feaf909 |
|
24-Oct-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: grab hisi_hba.lock when processing slots When adding/removing slots from device list, we need to lock this operation with hisi_hba lock for safety. This patch adds missing instances of this. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
302e0901 |
|
24-Oct-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: use spin_lock_irqsave() for hisi_hba.lock We used spin_lock() to grab hisi_hba.lock in two places where spin_lock_irqsave() should be used, as hisi_hba.lock can be taken in interrupt context. This patch is to fix this. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f692a677 |
|
24-Oct-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix internal abort slot timeout bug When an internal abort times out in hisi_sas_internal_task_abort(), goto the exit label in and not go through the other task status checks. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7dad1691 |
|
26-Sep-2017 |
Colin Ian King <colin.king@canonical.com> |
scsi: libsas: remove unused variable sas_ha Remove unused variable sas_ha to clean up build warning "unused variable sas_ha [-Wunused-variable]" Fixes: 042ebd293b86 ("scsi: libsas: kill useless ha_event and do some cleanup") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
042ebd29 |
|
06-Sep-2017 |
Jason Yan <yanaijie@huawei.com> |
scsi: libsas: kill useless ha_event and do some cleanup The ha_event now has only one event HAE_RESET, and this event does nothing. Kill it and do some cleanup. This is a preparation for enhance libsas hotplug feature in the next patches. Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cc199e78 |
|
25-Aug-2017 |
Hannes Reinecke <hare@suse.de> |
scsi: libsas: move bus_reset_handler() to target_reset_handler() The bus reset handler is calling I_T Nexus reset, which logically is a target reset as it need to specify both the initiator and the target. So move it to target reset. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
30b67de3 |
|
10-Aug-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: remove driver versioning The driver version is not updated with changes to the driver, so it has no value, so just get rid of it. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
76aae5f6 |
|
10-Aug-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: replace kfree with scsi_host_put Instances of kfree(shost) should be replaced with scsi_host_put(). In addition, a missing scsi_host_put() is added for error path in hisi_sas_shost_alloc_pci() and v3 driver removal. Signed-off-by: Pan Bian <bianpan2016@163.com> # For main.c changes Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a25d0d3d |
|
10-Aug-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add reset handler for v3 hw Use ACPI "_RST" method to reset the controller, since FLR is not supported. Function hisi_sas_stop_phys() is introduced to remove some code duplication. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
031da09c |
|
10-Aug-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add status and command buffer for internal abort For v3 hw, internal abort function required status and command buffer to be set, so add necessary code for this. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c3fe8a2b |
|
10-Aug-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: support zone management commands Add two ATA commands, ATA_CMD_ZAC_MGMT_IN and ATA_CMD_ZAC_MGMT_OUT in hisi_sas_get_ata_protocol(), to support SATA SMR disk. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cef4e1ab |
|
10-Aug-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: remove repeated device config in v2 hw This patch removes some repeated configurations: (1) The device id of the device is already set in the alloc function, so we don't need to modify in free device function. (2) Field dev_type and dev_status are configured in hisi_sas_dev_gone(), so there is no need for repeated config in free_device_v3_hw. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c52108c6 |
|
10-Aug-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: add v2 hw DFX feature Add DFX feature for v2 hw. We are adding support for the following errors: - loss_of_dword_sync_count - invalid_dword_count - phy_reset_problem_count - running_disparity_error_count Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
917d3bda |
|
10-Aug-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: fix reset and port ID refresh issues This patch provides fixes for the following issues: 1. Fix issue of controller reset required to send commands. For reset process, it may be required to send commands to the controller, but not during soft reset. So add HISI_SAS_NOT_ACCEPT_CMD_BIT to prevent executing a task during this period. 2. Send a broadcast event in rescan topology to detect any topology changes during reset. 3. Previously it was not ensured that libsas has processed the PHY up and down events after reset. Potentially this could cause an issue that we still process the PHY event after reset. So resolve this by flushing shot workqueue in LLDD reset. 4. Port ID requires refresh after reset. The port ID generated after reset is not guaranteed to be the same as before reset, so it needs to be refreshed for each device's ITCT. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f557e32c |
|
29-Jun-2017 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: optimise DMA slot memory Currently we allocate 3 sets of DMA memories from separate pools for each slot. This is inefficient in terms of memory usage (buffers are less than 1 page in size, so we lose due to alignment), and also time spent in doing 3 allocations + de-allocations per slot, instead of 1. To optimise, combine the 3 DMA buffers into a single buffer from a single pool. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d30ff263 |
|
14-Jun-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: modify internal abort dev flow for v3 hw There is a change for abort dev for v3 hw: add registers to configure unaborted iptt for a device, and then inform this to logic. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e21fe3a5 |
|
14-Jun-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add initialisation for v3 pci-based controller Add the code to initialise the controller which is based on pci device in hisi_sas_v3_hw.c The core controller routines are still in hisi_sas_main.c; some common initialisation functions are also exported from hisi_sas_main.c For pci-based controller, the device properties, like phy count and sas address are read from the firmware, same as platform device-based controller. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0fa24c19 |
|
14-Jun-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: create hisi_sas_get_fw_info() Move the functionality to retrieve the fw info into a dedicated device type-agnostic function, hisi_sas_get_fw_info(). The reasoning is that this function will be required for future pci-based platforms. Also add some debug logs for failure. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
11b75249 |
|
14-Jun-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add pci_dev in hisi_hba struct Since hip08 SAS controller is based on pci device, add hisi_hba.pci_dev for hip08 (will be v3), and also rename hisi_hba.pdev to .platform_dev for clarity. In addition, for common code which wants to reference the controller device struct, add hisi_hba.dev, and change the common code to use it. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
318913c6 |
|
14-Jun-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: relocate get_ncq_tag_v2_hw() Relocate get_ncq_tag_v2_hw() to a common location, as future hw versions will require it. Also rename with "hisi_sas_" prefix for consistency. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
75904077 |
|
14-Jun-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: relocate sata_done_v2_hw() Relocate get_ata_protocol() to a common location, as future hw versions will require it. Also rename with "hisi_sas_" prefix for consistency. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6c7bb8a1 |
|
14-Jun-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: relocate get_ata_protocol() Relocate get_ata_protocol() to a common location, as future hw versions will require it. Also rename with "hisi_sas_" prefix for consistency. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b1a49412 |
|
14-Jun-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: optimise the usage of hisi_hba.lock Currently hisi_hba.lock is locked to deliver and receive a command to/from any hw queue. This causes much contention at high data-rates. To boost performance, lock on a per queue basis for sending and receiving commands to/from hw. Certain critical regions still need to be locked in the delivery and completion stages with hisi_hba.lock. New element hisi_sas_device.dq is added to store the delivery queue for a device, so it does not need to be needlessly re-calculated for every task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ad604832 |
|
14-Jun-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: define hisi_sas_device.device_id as int Currently hisi_sas_device.device_id is a u64. This can create a problem in selecting the queue for a device, in that this code does a 64b division on device id. For some 32b systems, 64b division is slow and the lib reference must be explicitly included. The device id does not need to be 64b in size, so, as a solution, just make as an int. Also, struct hisi_sas_device elements are re-ordered to improve packing efficiency. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f64a6988 |
|
14-Jun-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix timeout check in hisi_sas_internal_task_abort() We need to check for timeout before task status, or the task will be mistook as completed internal abort command. Also add protection for sas_task.task_state_flags in hisi_sas_tmf_timedout(). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
eb045e04 |
|
22-May-2017 |
Gustavo A. R. Silva <garsilva@embeddedor.com> |
scsi: hisi_sas: add null check before indirect pointer dereference Add null check before indirectly dereferencing pointer task->lldd_task in statement u32 tag = slot->idx; Addresses-Coverity-ID: 1373843 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c5ce0abe |
|
21-Apr-2017 |
Johannes Thumshirn <jthumshirn@suse.de> |
scsi: sas: move scsi_remove_host call into sas_remove_host Move scsi_remove_host call into sas_remove_host and remove it from SAS HBA drivers, so we don't mess up the ordering. This solves an issue with double deleting sysfs entries that was introduced by the change of sysfs behaviour from commit bcdde7e221a8 ("sysfs: make __sysfs_remove_dir() recursive"). [mkp: addressed checkpatch complaints] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Suggested-by: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Jinpu Wang <jinpu.wang@profitbricks.com> Cc: John Garry <john.garry@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jinpu Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d3c4dd4e |
|
10-Apr-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: fix NULL deference when TMF timeouts If a TMF timeouts (maybe due to unlikely scenario of an expander being unplugged when TMF for remote device is active), when we eventually try to free the slot, we crash as we dereference the slot's task, which has already been released. As a fix, add checks in the slot release code for a NULL task. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0844a3ff |
|
10-Apr-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add v2 hw internal abort timeout workaround This patch is a workaround for a SoC bug where an internal abort command may timeout. In v2 hw, the channel should become idle in order to finish abort process. If the target side has been sending HOLD, host side channel failed to complete the frame to send, and can not enter the idle state. Then internal abort command will timeout. As this issue is only in v2 hw, we deal with it in the hw layer. Our workaround solution is: If abort is not finished within a certain period of time, we will check HOLD status. If HOLD has been sending, we will send break command. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6073b771 |
|
22-Mar-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: use dev_is_sata to identify SATA or SAS disk When SMP IO is sent, sas_protocol_ata couldn't judge whether the disk is SATA or SAS disk. So use dev_is_sata to identify SATA or SAS disk. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
14d3f397 |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: check hisi_sas_lu_reset() error message Unless we actually get some sort of failure in hisi_sas_lu_reset(), don't print a message. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ccbfe5a0 |
|
22-Mar-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: release SMP slot in lldd_abort_task When an SMP task timeouts, it will call lldd_abort_task to release the associated slot, and then will release the sas_task. Currently in lldd_abort_task, if we fail to internally abort IO, then the slot of SMP IO is not released, but sas_task will still be later released, so the slot's sas_task is NULL, which will cause NULL pointer when hisi_sas_slot_task_free happens later. To resolve, check the return value of internal abort, and release the slot if it failed. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8b05ad6a |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add hisi_sas_clear_nexus_ha() Add function for upper-layer to reset controller when all else fails. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6fcdda80 |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: remove task free'ing for timeouts When a TMF or internal abort times-out, do not free slot. We expect this to be done upon later escalated error handling. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
54c9dd2d |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: fix some sas_task.task_state_lock locking Some more locking needs to be added/modified for when read-modify-writing sas_task.task_state_flags. Note: since we can attempt to grab this lock in interrupt context we should use irq variant of spin_lock. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6131243a |
|
22-Mar-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: free slots after hardreset After hardreset, we clear up IOs of remote disks, so we need to free those slots in LLDD. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
055945df |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: hardreset for SATA disk in LU reset When issuing an LU reset for a SATA target, issue an internal abort and a hard reset. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c35279f2 |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: modify hisi_sas_abort_task() for SSP Currently an internal abort is executed regardless of the result of the TMF. We should also check the result of the internal abort to see if we should free the slot. So change the status code STAT_IO_COMPLETE to TMF_RESP_FUNC_SUCC, meaning the slot has been successfully aborted. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b4c67a6c |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: only reset link for PHY_FUNC_LINK_RESET We currently do a hard reset for a link reset. Change this to do a link reset only. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ddabca21 |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: error hisi_sas_task_prep() when port down When sas_port is NULL, then return SAS_PHY_DOWN. In addition, when the sas_dev is gone then explicitly return SAS_PHY_DOWN. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
405314df |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: remove hisi_sas_port_deformed() Currently when a root PHY is deformed from a asd_sas_port we try to release the slots in the LLDD, and fail. Regardless, it is not right to release this early. This patch removes the deformed function. As it was before, port deformation is still done in hisi_sas_phy_down(). It would be nice to actually remove the hisi_sas_port_{de}formed() pair, however we cannot as we need to know the asd_sas_port index libsas has associated with an asd_sas_phy. The hw does actually generate a port id for a PHY, but this seems to a random number, so ignored for this purpose. This patch also changes the code to link slots to the hisi_sas_device, and not hisi_sas_port. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7c594f04 |
|
22-Mar-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add softreset function for SATA disk Add softreset to clear IO after internal abort device for SATA disk. The SATA error handling for the controller is based on device internal abort and softreset function. The controller does not support internal abort for single IO, so we need to execute internal abort for device. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
396b8044 |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: move PHY init to hisi_sas_scan_start() Relocate the PHY init code from LLDD hw init path to hisi_sas_scan_start(). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
06ec0fb9 |
|
22-Mar-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add controller reset There are some scenarios that we need to warm-reset to reset registers of SAS controller. During reset we disable interrupts/DQs/PHYs, and after reset we re-init the hardware and rescan the topology to see if anything changed. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2e244f0f |
|
22-Mar-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add to_hisi_sas_port() Introduce function to get hisi_sas_port from asd_sas_port. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
13c59906 |
|
20-Jan-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: decrease running_req in hisi_sas_slot_task_free() There is an issue that hisi_sas_dev.running_req is not decremented properly for internal abort and TMF. To resolve, only decrease running_req in hisi_sas_slot_task_free() Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0757f041 |
|
20-Jan-2017 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix probe ordering problem There is a potential probe issue in how we trigger the hw initialisation. Although we use 1s timer to delay hw initialisation, there is still a potential that sas_register_ha() is not be finished before we start the PHY init from hw->hw_init(). To avoid this issue, initialise the hw after sas_register_ha() in the same probe context. Note: it is not necessary to use 1s timer now (modified v2 hw only). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
297d7302 |
|
20-Jan-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: downgrade internal abort exit print Downgrade the exit print in hisi_sas_internal_task_abort() to dbg level, as info is not required. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
87e287c1 |
|
20-Jan-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: downgrade refclk message The message to inform that the controller has no refclk is currently at warning level, which is unnecessary, so downgrade to debug. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
da7b66e7 |
|
03-Jan-2017 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: lock sensitive region in hisi_sas_slot_abort() When we call hisi_sas_slot_task_free() we should grab the hisi_hba.lock, as hisi_sas_slot_task_free() accesses common hisi_hba elements. Function hisi_sas_slot_abort() is missing this, so add it. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d37a0082 |
|
29-Nov-2016 |
Xiaofei Tan <tanxiaofei@huawei.com> |
scsi: hisi_sas: fix free'ing in probe and remove This patch addresses 4 problems in the module probe/remove: - When hisi_sas_shost_alloc() fails after we alloc shost memory, we should free shost memory before the function returns. - When hisi_sas_probe() fails after we alloc the HBA memories, we should also free the HBA memories. - We should free shost memory at the end of hisi_sas_remove(). - sha->core.shost is set twice, so remove extra set. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Quentin Lambert <lambert.quentin@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2ae75787 |
|
07-Nov-2016 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: add PHY set linkrate support for v1 and v2 hw Add the function to set PHY min and max linkrate through sysfs interface. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
f696cc32 |
|
07-Nov-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: use atomic64_t for hisi_sas_device.running_req Sometimes the value of hisi_sas_device.running_req would go negative unless we have the check for running_req >= 0 before trying to decrement. This is because using running_req is not thread-safe. As such, the value for running_req may be actually incorrect, so use atomic64_t instead. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
997ee43c |
|
07-Nov-2016 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: modify return value of hisi_sas_query_task() sas_scsi_find_task() only deals with return value TMF_RESP_FUNC_FAILED/TMF_RESP_FUNC_SUCC/TMF_RESP_FUNC_COMPLETE of query task. So for LLDD errors just return TMF_RESP_FUNC_FAILED. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d2d7e7a0 |
|
07-Nov-2016 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: replace WARN_ON() with dev_warn() for internal abort Replace WARN_ON() with dev_warn() print when internal abort fails. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1d7e9469 |
|
07-Nov-2016 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: fix port form bug in hisi_sas_port_notify_formed() When we form a wideport, we should use hardware PHY port_id instead of sas_phy->id. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
c70f1fb7 |
|
07-Nov-2016 |
Xiang Chen <chenxiang66@hisilicon.com> |
scsi: hisi_sas: alloc queue id of slot according to device id Currently slots are allocated from queues in a round-robin fashion. This causes a problem for internal commands in device mode. For this mode, we should ensure that the internal abort command is the last command seen in the host for that device. We can only ensure this when we place the internal abort command after the preceding commands for device that in the same queue, as there is no order in which the host will select a queue to execute the next command. This queue restriction makes supporting scsi mq more tricky in the future, but should not be a blocker. Note: Even though v1 hw does not support internal abort, the allocation method is chosen to be the same for consistency. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3bc45af8 |
|
04-Oct-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: Add v2 hw support for different refclk The hip06 D03 and hip07 D05 boards have different reference clock frequencies for the SAS controller. Register PHY_CTRL needs to be programmed differently according to this frequency, so add support for this. The default register setting in PHY_CTRL is for 50MHz, so only update this register when the refclk frequency is 66MHz. For ACPI we expect the _RST handler to set the correct value for PHY_CTRL (we're forced to take different approach for DT and ACPI as ACPI does not support fixed-clock device). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a6f2c7ff |
|
06-Sep-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: set dma mask before allocate DMA memory The device DMA mask was being set after the bulk of the DMA allocations in the driver init, so potentially DMA allocates fail. To resolve, relocate before allocating the DMA memory when initialising the driver. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
09fe9ecb |
|
06-Sep-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: fix a potential warning for sata disk ejection If hisi_sas_task_prep() fails for a SATA device due to PHY down, we return a failure to libata and also call task_done(), which will cause ata_qc_complete() to be called twice: - first call from hisi_sas_task_prep(), which will clear flag ATA_QCFLAG_ACTIVE - ata_qc_complete() called from libata The warning call trace is as follows: [ 117.070206] [<ffff0000084f59b0>] __ata_qc_complete+0xf4/0x11c [ 117.070208] [<ffff0000084f5b58>] ata_qc_complete+0x180/0x200 [ 117.070210] [<ffff0000084f5dd0>] ata_qc_issue+0x110/0x354 [ 117.070212] [<ffff0000084f6254>] ata_exec_internal_sg+0x240/0x4d0 [ 117.070214] [<ffff0000084f6544>] ata_exec_internal+0x60/0xa0 [ 117.070217] [<ffff000008501580>] ata_read_log_page+0x188/0x1b4 [ 117.070218] [<ffff0000085017dc>] ata_eh_analyze_ncq_error+0xa8/0x274 [ 117.070220] [<ffff000008501a3c>] ata_eh_link_autopsy+0x94/0x8c8 [ 117.070222] [<ffff0000085022a4>] ata_eh_autopsy+0x34/0xe8 [ 117.070223] [<ffff00000850540c>] ata_do_eh+0x28/0xc0 [ 117.070225] [<ffff0000085054e0>] ata_std_error_handler+0x3c/0x84 [ 117.070227] [<ffff000008505140>] ata_scsi_port_error_handler+0x480/0x674 [ 117.070230] [<ffff0000084e3020>] async_sas_ata_eh+0x44/0x78 [ 117.070231] [<ffff0000080d6b8c>] async_run_entry_fn+0x40/0x104 [ 117.070234] [<ffff0000080ce518>] process_one_work+0x128/0x2f0 [ 117.070235] [<ffff0000080ce738>] worker_thread+0x58/0x434 [ 117.070237] [<ffff0000080d416c>] kthread+0xd4/0xe8 [ 117.070240] [<ffff000008084e10>] ret_from_fork+0x10/0x40 The issue is resolved by simply returning a failure status code to the upper layer. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
433f5696 |
|
06-Sep-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: use safe BITS_PER_BYTE for slot tag size calculation The memory calculation for the tags bitmap should use BITS_PER_BYTE macro instead of coincidental same value of sizeof(unsigned long). Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
59ba49f9 |
|
06-Sep-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: only zero slot memory when reused Currently the slot memory is zeroed when it is freed and also when it is reused, like in hisi_sas_task_prep(). Optimise by avoiding the redundant zeroing in the free. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4fde02ad |
|
06-Sep-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: save delivery queue write pointer Optimise by saving an avoidable read in the get_free_slot function. The delivery queue write pointer will only be updated by software, so don't bother re-reading what was already written in the previous call to start_delivery function. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4ffde482 |
|
24-Aug-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add TMF success check When a tmf is issued, various response codes can be returned from the target. For a query tmf the response may be TMF_RESP_FUNC_COMPLETE or TMF_RESP_FUNC_SUCC. Add a condition for TMF_RESP_FUNC_SUCC to hisi_sas_exec_internal_tmf_task(). This affects query tmf, as the result is success the returned value was for failure. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
9859f24e |
|
24-Aug-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: fail tmf task prep when port detached When the port is detached we cannot execute a TMF, as there can be no device attached to the port. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
dc8a49ca |
|
24-Aug-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add internal abort to hisi_sas_abort_task() Execute an internal abort for executing a task abort. This is for case of the command still being present in host when abort is executed. For a SATA internal abort, we set abort for all tasks associated with the device. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
40f2702b |
|
24-Aug-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add internal abort in hisi_sas_dev_gone() Execute an internal abort for that device when it is removed, so that commands for that device are not processed. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
441c2740 |
|
24-Aug-2016 |
John Garry <john.garry@huawei.com> |
scsi: hisi_sas: add internal abort main code Add main code for internal abort functionality. The internal abort features allows the host controller to abort commands which are still active in the controller but have not yet been sent to the slave device. Typically a command only spends a relatively short time in the controller when compared to the amount of the time after it is sent to the slave device. Two modes of internal abort are supported: - device - individual command For device, when the internal abort is issued all commands in the host for that device are aborted. For a single command, only that command is aborted if it is still in the host. In HW the internal abort command is executed similar to any other sort of command, like SSP. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
685b6d6e |
|
15-Apr-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: add device and slot alloc hw methods Add methods to use HW specific versions of functions to allocate slot and device. HW specific methods are permitted to workaround device id vs IPTT collision issue in v2 hw. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
31eec8a6 |
|
25-Feb-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: add hisi_sas_slave_configure() In high-datarate aging tests, it is found that the SCSI framework can periodically issue lu resets as some commands timeout. Response TASK SET FULL and SAS_QUEUE_FULL may be returned many times for the same command, causing the timeouts. The SAS_QUEUE_FULL errors come from TRANS_TX_CREDIT_TIMEOUT_ERR, TRANS_TX_CLOSE_NORMAL_ERR, and TRANS_TX_ERR_FRAME_TXED errors. They do not mean that the queue is full in the host, but rather it is equivalent to meaning the queue is full for the sdev. To overcome this, the queue depth for the sdev is reduced to 64 (from 256, set in sas_slave_configure()). Normally error code SAS_QUEUE_FULL will result in the sdev queue depth falling, but it falls too slowly during high-datarate tests and commands timeout before it has fallen to an adequete level from original value. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cac9b2a2 |
|
25-Feb-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: add hisi_sas_slot_abort() Add a function to abort a slot (task) in the target device and then cleanup and complete the task. The function is called from work queue context as it cannot be called from the context where it is triggered (interrupt). Flag hisi_sas_slot.abort is added as the flag used in the slot error handler to indicate whether the slot needs to be aborted in the sdev prior to cleanup and finish. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
1af1b808 |
|
25-Feb-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: change tmf func complete check In hisi_sas_exec_internal_tmf_task(), the check for SAM_STAT_GOOD is replaced with TMF_RESP_FUNC_COMPLETE, which is a genuine tmf response code. SAM_STAT_GOOD and TMF_RESP_FUNC_COMPLETE have the same value, so this is why it worked before. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4d558c77 |
|
03-Feb-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: use Unified Device Properties API The hisi_sas driver is required to support both device tree and ACPI. The scanning of the device properties now uses the Unified Device Properties API, which serves both OF and ACPI. Since syscon is not supported by ACPI, syscon is only used in the driver when device tree is used. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6f2ff1a1 |
|
25-Jan-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: add v2 path to send ATA command Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a8d547bd |
|
25-Jan-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: set max commands as configurable Since v2 hardware permits different numbers of commands to v1, set this as configurable in hisi_sas_hw. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
98bf39fc |
|
25-Jan-2016 |
John Garry <john.garry@huawei.com> |
hisi_sas: relocate DEV_IS_EXPANDER Relocate DEV_IS_EXPANDER to hisi_sas.h as it will be required for v2 hw support. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
9c9d18e7 |
|
09-Dec-2015 |
Dan Carpenter <dan.carpenter@oracle.com> |
hisi_sas: fix error codes in hisi_sas_task_prep() There were a couple cases where the error codes weren't set and also I changed the success return to "return 0;" which is the same as "return rc;" but more explicit. Fixes: 42e7a69368a5 ('hisi_sas: Add ssp command functio') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8c77dca0 |
|
19-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Remove dependency on of_irq_count Originally the driver would use of_irq_count to calculate how much memory is required for storing the interrupt names, since the number of interrupt sources for the controller is variable. Since of_irq_count cannot be used by the driver, use fixed names. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e4189d53 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add control phy handler Add method for lldd_control_phy. Currently link rate control and spinup hold is unsupported. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0efff300 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add tmf methods Add function methods for tmf's. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
701f75ec |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add scan finished and start Add functions for scsi host template scan_finished and scan_start methods. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
66ee999b |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add smp protocol support Add support for smp function, which allows devices attached by expander to be controlled. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
184a4635 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add abnormal irq handler Add abnormal irq handler. This handler is concerned with phy down event. Also add port formed and port deformed handlers. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
abda97c2 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add dev_found and dev_gone Add functions to deal with lldd_dev_found and lldd_dev_gone. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
27a3f229 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add cq interrupt handler Add cq interrupt handler and also slot error handler function. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
42e7a693 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add ssp command function Add path to send ssp command to HW. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
66139921 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add path from phyup irq to SAS framework Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8ff1d571 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add v1 hardware initialisation code Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
fa42d80d |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add timer and spinlock init Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
976867e6 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add phy and port init Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
af740dbe |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add hisi sas device type Include initialisation. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7e9080e1 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add hisi_hba workqueue Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
50cb916f |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Set dev DMA mask Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5d74242e |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add phy SAS ADDR initialization The SAS address for the HBA comes from the device tree. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
9101a079 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add cq structure initialization Each completion queue has a structure. This is mainly for passing to irq handler so we know which queue the irq occured on. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
257efd1f |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add slot init code Add functionality to init slot indexing. Slot indexing is for the host to track which slots (or tags) are free and which are used. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
89d53322 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add hisi_sas_remove This patch also includes relevant memory/pool freeing and sas/scsi host removal. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6be6de18 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Allocate memories and create pools Allocate DMA and non-DMA memories for the controller. Also create DMA pools. These include: - Delivery queues - Completion queues - Command status buffer - Command table - ITCT (For device context) - Host slot info - IO status - Breakpoint - host slot indexing - SG data - FIS - interrupts names Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e26b2f40 |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Scan device tree Scan the device tree for all properties. Also do this: - do ioremap for SAS registers - allocate memory for interrupt names Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
7eb7869f |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add scsi host registration Add functionality to register device as a scsi host. The SAS domain transport ops are empty at this point. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e8899fad |
|
17-Nov-2015 |
John Garry <john.garry@huawei.com> |
hisi_sas: Add initial bare main driver This patch adds the initial bare main driver for the HiSilicon SAS HBA. This only introduces the changes to build and load the main driver module. The complete driver consists of the core main module and also a module platform driver for driving the hw. The HBA is a platform device. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|