#
addf1372 |
|
10-Sep-2020 |
Julian Wiedmann <jwi@linux.ibm.com> |
scsi: zfcp: Use list_first_entry_or_null() in zfcp_erp_thread() Use the right helper to avoid poking around in the list's internals. Link: https://lore.kernel.org/r/ed669555c73aab95b29444c10066f492c0c43391.1599765652.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
b43cdb5a |
|
03-Jul-2020 |
Julian Wiedmann <jwi@linux.ibm.com> |
scsi: zfcp: Clean up zfcp_erp_action_ready() We already maintain a pointer to act->adapter. Use it consistently to avoid any confusion about whose ->erp_ready_head and ->erp_ready_wq we are accessing. Link: https://lore.kernel.org/r/d1bb04322f240dee32f4c4a551bc93bc736f4b01.1593780621.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
936e6b85 |
|
23-Jun-2020 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout. In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()]. If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery. Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()]. This fixes the following panic due to kernel page fault in IRQ context: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d0dff2ac |
|
08-May-2020 |
Benjamin Block <bblock@linux.ibm.com> |
scsi: zfcp: Move allocation of the shost object to after xconf- and xport-data At the moment we allocate and register the Scsi_Host object corresponding to a zfcp adapter (FCP device) very early in the life cycle of the adapter - even before we fully discover and initialize the underlying firmware/hardware. This had the advantage that we could already use the Scsi_Host object, and fill in all its information during said discover and initialize. Due to commit 737eb78e82d5 ("block: Delay default elevator initialization") (first released in v5.4), we noticed a regression that would prevent us from using any storage volume if zfcp is configured with support for DIF or DIX (zfcp.dif=1 || zfcp.dix=1). Doing so would result in an illegal memory access as soon as the first request is sent with such an configuration. As example for a crash resulting from this: scsi host0: scsi_eh_0: sleeping scsi host0: zfcp qdio: 0.0.1900 ZFCP on SC 4bd using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36 Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:0000000035c7c007 R3:00000001effcc007 S:00000001effd1000 P:000000000000003d Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: ... CPU: 1 PID: 783 Comm: kworker/u760:5 Kdump: loaded Not tainted 5.6.0-rc2-bb-next+ #1 Hardware name: ... Workqueue: scsi_wq_0 fc_scsi_scan_rport [scsi_transport_fc] Krnl PSW : 0704e00180000000 000003ff801fcdae (scsi_queue_rq+0x436/0x740 [scsi_mod]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0fffffffffffffff 0000000000000000 0000000187150120 0000000000000000 000003ff80223d20 000000000000018e 000000018adc6400 0000000187711000 000003e0062337e8 00000001ae719000 0000000187711000 0000000187150000 00000001ab808100 0000000187150120 000003ff801fcd74 000003e0062336a0 Krnl Code: 000003ff801fcd9e: e310a35c0012 lt %r1,860(%r10) 000003ff801fcda4: a7840010 brc 8,000003ff801fcdc4 #000003ff801fcda8: e310b2900004 lg %r1,656(%r11) >000003ff801fcdae: d71710001000 xc 0(24,%r1),0(%r1) 000003ff801fcdb4: e310b2900004 lg %r1,656(%r11) 000003ff801fcdba: 41201018 la %r2,24(%r1) 000003ff801fcdbe: e32010000024 stg %r2,0(%r1) 000003ff801fcdc4: b904002b lgr %r2,%r11 Call Trace: [<000003ff801fcdae>] scsi_queue_rq+0x436/0x740 [scsi_mod] ([<000003ff801fcd74>] scsi_queue_rq+0x3fc/0x740 [scsi_mod]) [<00000000349c9970>] blk_mq_dispatch_rq_list+0x390/0x680 [<00000000349d1596>] blk_mq_sched_dispatch_requests+0x196/0x1a8 [<00000000349c7a04>] __blk_mq_run_hw_queue+0x144/0x160 [<00000000349c7ab6>] __blk_mq_delay_run_hw_queue+0x96/0x228 [<00000000349c7d5a>] blk_mq_run_hw_queue+0xd2/0xe0 [<00000000349d194a>] blk_mq_sched_insert_request+0x192/0x1d8 [<00000000349c17b8>] blk_execute_rq_nowait+0x80/0x90 [<00000000349c1856>] blk_execute_rq+0x6e/0xb0 [<000003ff801f8ac2>] __scsi_execute+0xe2/0x1f0 [scsi_mod] [<000003ff801fef98>] scsi_probe_and_add_lun+0x358/0x840 [scsi_mod] [<000003ff8020001c>] __scsi_scan_target+0xc4/0x228 [scsi_mod] [<000003ff80200254>] scsi_scan_target+0xd4/0x100 [scsi_mod] [<000003ff802d8b96>] fc_scsi_scan_rport+0x96/0xc0 [scsi_transport_fc] [<0000000034245ce8>] process_one_work+0x458/0x7d0 [<00000000342462a2>] worker_thread+0x242/0x448 [<0000000034250994>] kthread+0x15c/0x170 [<0000000034e1979c>] ret_from_fork+0x30/0x38 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000003ff801fbc36>] scsi_add_cmd_to_list+0x9e/0xa8 [scsi_mod] Kernel panic - not syncing: Fatal exception: panic_on_oops While this issue is exposed by the commit named above, this is only by accident. The real issue exists for longer already - basically since it's possible to use blk-mq via scsi-mq, and blk-mq pre-allocates all requests for a tag-set during initialization of the same. For a given Scsi_Host object this is done when adding the object to the midlayer (`scsi_add_host()` and such). In `scsi_mq_setup_tags()` the midlayer calculates how much memory is required for a single scsi_cmnd, and its additional data, which also might include space for additional protection data - depending on whether the Scsi_Host has any form of protection capabilities (`scsi_host_get_prot()`). The problem is now thus, because zfcp does this step before we actually know whether the firmware/hardware has these capabilities, we don't set any protection capabilities in the Scsi_Host object. And so, no space is allocated for additional protection data for requests in the Scsi_Host tag-set. Once we go through discover and initialize the FCP device firmware/hardware fully (this is done via the firmware commands "Exchange Config Data" and "Exchange Port Data") we find out whether it actually supports DIF and DIX, and we set the corresponding capabilities in the Scsi_Host object (in `zfcp_scsi_set_prot()`). Now the Scsi_Host potentially has protection capabilities, but the already allocated requests in the tag-set don't have any space allocated for that. When we then trigger target scanning or add scsi_devices manually, the midlayer will use requests from that tag-set, and before sending most requests, it will also call `scsi_mq_prep_fn()`. To prepare the scsi_cmnd this function will check again whether the used Scsi_Host has any protection capabilities - and now it potentially has - and if so, it will try to initialize the assumed to be preallocated structures and thus it causes the crash, like shown above. Before delaying the default elevator initialization with the commit named above, we always would also allocate an elevator for any scsi_device before ever sending any requests - in contrast to now, where we do it after device-probing. That elevator in turn would have its own tag-set, and that is initialized after we went through discovery and initialization of the underlying firmware/hardware. So requests from that tag-set can be allocated properly, and if used - unless the user changes/disabled the default elevator - this would hide the underlying issue. To fix this for any configuration - with or without an elevator - we move the allocation and registration of the Scsi_Host object for a given FCP device to after the first complete discovery and initialization of the underlying firmware/hardware. By doing that we can make all basic properties of the Scsi_Host known to the midlayer by the time we call `scsi_add_host()`, including whether we have any protection capabilities. To do that we have to delay all the accesses that we would have done in the past during discovery and initialization, and do them instead once we are finished with it. The previous patches ramp up to this by fencing and factoring out all these accesses, and make it possible to re-do them later on. In addition we make also use of the diagnostic buffers we recently added with commit 92953c6e0aa7 ("scsi: zfcp: signal incomplete or error for sync exchange config/port data") commit 7e418833e689 ("scsi: zfcp: diagnostics buffer caching and use for exchange port data") commit 088210233e6f ("scsi: zfcp: add diagnostics buffer for exchange config data") (first released in v5.5), because these already cache all the information we need for that "re-do operation" - the information cached are always updated during xconf or xport data, so it won't be stale. In addition to the move and re-do, this patch also updates the function-documentation of `zfcp_scsi_adapter_register()` and changes how it reports if a Scsi_Host object already exists. In that case future recovery-operations can skip this step completely and behave much like they would do in the past - zfcp does not release a once allocated Scsi_Host object unless the corresponding FCP device is deconstructed completely. Link: https://lore.kernel.org/r/030dd6da318bbb529f0b5268ec65cebcd20fc0a3.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
971f2abb |
|
08-May-2020 |
Benjamin Block <bblock@linux.ibm.com> |
scsi: zfcp: Fence adapter status propagation for common statuses Common status flags that all main objects - adapter, port, and unit - support are propagated to sub-objects when set or cleared. For instance, when setting the status ZFCP_STATUS_COMMON_ERP_INUSE for an adapter object, we will propagate this to all its child ports and units - same for when clearing a common status flag. Units of an adapter object are enumerated via __shost_for_each_device() over the scsi host object of the corresponding adapter. Once we move the scsi host object allocation and registration to after the first exchange config and exchange port data, this won't be possible for cases where we set or clear common statuses during the very first adapter recovery. But since we won't have any port or unit objects yet at that point of time, we can just fence the status propagation for cases where the scsi host object is not yet set in the adapter object. It won't change any effective status propagations, but will prevent us from dereferencing invalid pointers. For any later point in the work flow the scsi host object will be set and thus nothing is changed then. Link: https://lore.kernel.org/r/f51fe5f236a1e3d1ce53379c308777561bfe35e1.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
ac007adc |
|
08-May-2020 |
Benjamin Block <bblock@linux.ibm.com> |
scsi: zfcp: Move p-t-p port allocation to after xport data When doing the very first adapter recovery - initialization - for a FCP device in a point-to-point topology we also allocate the port object corresponding to the attached remote port, and trigger a port recovery for it that will run after the adapter recovery finished. Right now this happens right after we finished with the exchange config data command, and uses the fibre channel host object corresponding to the FCP device to determine whether a point-to-point topology is used. When moving the scsi host object allocation and registration - and thus also the fibre channel host object allocation - to after the first exchange config and exchange port data, this use of the fc_host object is not possible anymore at that point in the work flow. But the allocation and recovery trigger doesn't have notable side-effects on the following exchange port data processing, so we can move those to after xport data, and thus also to after the scsi host object allocation, once we move it. Then the fc_host object can be used again, like it is now. For any further adapter recoveries this doesn't change anything, because at that point the port object already exists and recovery is triggered elsewhere for existing port objects. Link: https://lore.kernel.org/r/73e5d4ac21e2b37bf0c3ca8e530bc5a5c6e74f8f.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cec9cbac |
|
31-Mar-2020 |
Joe Perches <joe@perches.com> |
scsi: zfcp: use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/ Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> [bblock@linux.ibm.com: resolved merge conflict with recently upstream-sent patch "zfcp: expose fabric name as common fc_host sysfs attribute"] Link: https://lore.kernel.org/r/d14669a67a17392490d3184117941123765db1a4.1585663010.git.bblock@linux.ibm.com Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
819732be |
|
12-Mar-2020 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point v2.6.27 commit cc8c282963bd ("[SCSI] zfcp: Automatically attach remote ports") introduced zfcp automatic port scan. Before that, the user had to use the sysfs attribute "port_add" of an FCP device (adapter) to add and open remote (target) ports, even for the remote peer port in point-to-point topology. That code path did a proper port open recovery trigger taking the erp_lock. Since above commit, a new helper function zfcp_erp_open_ptp_port() performed an UNlocked port open recovery trigger. This can race with other parallel recovery triggers. In zfcp_erp_action_enqueue() this could corrupt e.g. adapter->erp_total_count or adapter->erp_ready_head. As already found for fabric topology in v4.17 commit fa89adba1941 ("scsi: zfcp: fix infinite iteration on ERP ready list"), there was an endless loop during tracing of rport (un)block. A subsequent v4.18 commit 9e156c54ace3 ("scsi: zfcp: assert that the ERP lock is held when tracing a recovery trigger") introduced a lockdep assertion for that case. As a side effect, that lockdep assertion now uncovered the unlocked code path for PtP. It is from within an adapter ERP action: zfcp_erp_strategy[1479] intentionally DROPs erp lock around zfcp_erp_strategy_do_action() zfcp_erp_strategy_do_action[1441] NO erp lock zfcp_erp_adapter_strategy[876] NO erp lock zfcp_erp_adapter_strategy_open[855] NO erp lock zfcp_erp_adapter_strategy_open_fsf[806]NO erp lock zfcp_erp_adapter_strat_fsf_xconf[772] erp lock only around zfcp_erp_action_to_running(), BUT *_not_* around zfcp_erp_enqueue_ptp_port() zfcp_erp_enqueue_ptp_port[728] BUG: *_not_* taking erp lock _zfcp_erp_port_reopen[432] assumes to be called with erp lock zfcp_erp_action_enqueue[314] assumes to be called with erp lock zfcp_dbf_rec_trig[288] _checks_ to be called with erp lock: lockdep_assert_held(&adapter->erp_lock); It causes the following lockdep warning: WARNING: CPU: 2 PID: 775 at drivers/s390/scsi/zfcp_dbf.c:288 zfcp_dbf_rec_trig+0x16a/0x188 no locks held by zfcperp0.0.17c0/775. Fix this by using the proper locked recovery trigger helper function. Link: https://lore.kernel.org/r/20200312174505.51294-2-maier@linux.ibm.com Fixes: cc8c282963bd ("[SCSI] zfcp: Automatically attach remote ports") Cc: <stable@vger.kernel.org> #v2.6.27+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
e76acc51 |
|
25-Oct-2019 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: proper indentation to reduce confusion in zfcp_erp_required_act No functional change. The unary not operator only applies to the sub expression before the logical or. So we return early if (not running) or failed. Link: https://lore.kernel.org/r/df4f897f6e83eaa528465d0858d5a22daac47a2f.1572018132.git.bblock@linux.ibm.com Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
48464708 |
|
02-Jul-2019 |
Benjamin Block <bblock@linux.ibm.com> |
scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized GCC v9 emits this warning: CC drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue': drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized] 217 | struct zfcp_erp_action *erp_action; | ^~~~~~~~~~ This is a possible false positive case, as also documented in the GCC documentations: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uninitialized The actual code-sequence is like this: Various callers can invoke the function below with the argument "want" being one of: ZFCP_ERP_ACTION_REOPEN_ADAPTER, ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, ZFCP_ERP_ACTION_REOPEN_PORT, or ZFCP_ERP_ACTION_REOPEN_LUN. zfcp_erp_action_enqueue(want, ...) ... need = zfcp_erp_required_act(want, ...) need = want ... maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER ... return need ... zfcp_erp_setup_act(need, ...) struct zfcp_erp_action *erp_action; // <== line 217 ... switch(need) { case ZFCP_ERP_ACTION_REOPEN_LUN: ... erp_action = &zfcp_sdev->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_PORT: case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: ... erp_action = &port->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_ADAPTER: ... erp_action = &adapter->erp_action; WARN_ON_ONCE(erp_action->port != NULL); // <== access ... break; } ... WARN_ON_ONCE(erp_action->adapter != adapter); // <== access When zfcp_erp_setup_act() is called, 'need' will never be anything else than one of the 4 possible enumeration-names that are used in the switch-case, and 'erp_action' is initialized for every one of them, before it is used. Thus the warning is a false positive, as documented. We introduce the extra if{} in the beginning to create an extra code-flow, so the compiler can be convinced that the switch-case will never see any other value. BUG_ON()/BUG() is intentionally not used to not crash anything, should this ever happen anyway - right now it's impossible, as argued above; and it doesn't introduce a 'default:' switch-case to retain warnings should 'enum zfcp_erp_act_type' ever be extended and no explicit case be introduced. See also v5.0 commit 399b6c8bc9f7 ("scsi: zfcp: drop old default switch case which might paper over missing case"). Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
242ec145 |
|
26-Mar-2019 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices Suppose more than one non-NPIV FCP device is active on the same channel. Send I/O to storage and have some of the pending I/O run into a SCSI command timeout, e.g. due to bit errors on the fibre. Now the error situation stops. However, we saw FCP requests continue to timeout in the channel. The abort will be successful, but the subsequent TUR fails. Scsi_eh starts. The LUN reset fails. The target reset fails. The host reset only did an FCP device recovery. However, for non-NPIV FCP devices, this does not close and reopen ports on the SAN-side if other non-NPIV FCP device(s) share the same open ports. In order to resolve the continuing FCP request timeouts, we need to explicitly close and reopen ports on the SAN-side. This was missing since the beginning of zfcp in v2.6.0 history commit ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter."). Note: The FSF requests for forced port reopen could run into FSF request timeouts due to other reasons. This would trigger an internal FCP device recovery. Pending forced port reopen recoveries would get dismissed. So some ports might not get fully reopened during this host reset handler. However, subsequent I/O would trigger the above described escalation and eventually all ports would be forced reopen to resolve any continuing FCP request timeouts due to earlier bit errors. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: <stable@vger.kernel.org> #3.0+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
fe67888f |
|
26-Mar-2019 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host An already deleted SCSI device can exist on the Scsi_Host and remain there because something still holds a reference. A new SCSI device with the same H:C:T:L and FCP device, target port WWPN, and FCP LUN can be created. When we try to unblock an rport, we still find the deleted SCSI device and return early because the zfcp_scsi_dev of that SCSI device is not ZFCP_STATUS_COMMON_UNBLOCKED. Hence we miss to unblock the rport, even if the new proper SCSI device would be in good state. Therefore, skip deleted SCSI devices when iterating the sdevs of the shost. [cf. __scsi_device_lookup{_by_target}() or scsi_device_get()] The following abbreviated trace sequence can indicate such problem: Area : REC Tag : ersfs_3 LUN : 0x4045400300000000 WWPN : 0x50050763031bd327 LUN status : 0x40000000 not ZFCP_STATUS_COMMON_UNBLOCKED Ready count : n not incremented yet Running count : 0x00000000 ERP want : 0x01 ERP need : 0xc1 ZFCP_ERP_ACTION_NONE Area : REC Tag : ersfs_3 LUN : 0x4045400300000000 WWPN : 0x50050763031bd327 LUN status : 0x41000000 Ready count : n+1 Running count : 0x00000000 ERP want : 0x01 ERP need : 0x01 ... Area : REC Level : 4 only with increased trace level Tag : ertru_l LUN : 0x4045400300000000 WWPN : 0x50050763031bd327 LUN status : 0x40000000 Request ID : 0x0000000000000000 ERP status : 0x01800000 ERP step : 0x1000 ERP action : 0x01 ERP count : 0x00 NOT followed by a trace record with tag "scpaddy" for WWPN 0x50050763031bd327. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery") Cc: <stable@vger.kernel.org> #2.6.32+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
399b6c8b |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: drop old default switch case which might paper over missing case This was introduced with v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c") but would now suppress helpful -Wswitch compiler warnings when building with W=1 such as the following forced example: drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_setup_act': drivers/s390/scsi/zfcp_erp.c:220:2: warning: enumeration value 'ZFCP_ERP_ACTION_REOPEN_PORT' not handled in switch [-Wswitch] switch (need) { ^~~~~~ But then again, only with W=1 we would notice unhandled enum cases. Without the default cases and a missed unhandled enum case, the code might perform unforeseen things we might not want... As of today, we never run through the removed default case, so removing it is no functional change. In the future, we never should run through a default case but introduce the necessary specific case(s) to handle new functionality. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0c902936 |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: drop default switch case which might paper over missing case This was introduced with v4.18 commit 8c3d20aada70 ("scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED") but would now suppress helpful -Wswitch compiler warnings when building with W=1 such as the following forced example: drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_handle_failed': drivers/s390/scsi/zfcp_erp.c:126:2: warning: enumeration value 'ZFCP_ERP_ACTION_REOPEN_PORT_FORCED' not handled in switch [-Wswitch] switch (want) { ^~~~~~ But then again, only with W=1 we would notice unhandled enum cases. Without the default cases and a missed unhandled enum case, the code might perform unforeseen things we might not want... As of today, we never run through the removed default case, so removing it is no functional change. In the future, we never should run through a default case but introduce the necessary specific case(s) to handle new functionality. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
3505144e |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: silence -Wimplicit-fallthrough in zfcp_erp_lun_strategy() For some reason the already existing substring "fall through" in the comment is not sufficient for GCC to silence -Wimplicit-fallthrough. CC [M] drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_lun_strategy': drivers/s390/scsi/zfcp_erp.c:1065:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (atomic_read(&zfcp_sdev->status) & ZFCP_STATUS_COMMON_OPEN) ^ drivers/s390/scsi/zfcp_erp.c:1068:2: note: here case ZFCP_ERP_STEP_LUN_CLOSING: ^~~~ Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8684d614 |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: silence all W=1 build warnings for existing kdoc While at it also improve some copy & paste kdoc mistakes. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d5fcdced |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: use enum zfcp_erp_act_result for argument/return of affected functions With that instead of just "int" it becomes clear which functions return this type and which ones also accept it as argument they just pass through in some cases or modify in other cases. v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c") introduced the enum which was cpp defines previously. Silence some false -Wswitch compiler warning cases with individual NOP cases. When adding more enum values and building with W=1 we would get compiler warnings about missed new cases. Consistently use the variable name "result", so change "retval" in zfcp_erp_strategy() to "result". This avoids confusion with other compile unit variables "retval" having different semantics and type. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
0023beec |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: use enum zfcp_erp_steps for struct zfcp_erp_action.step Use the already defined enum for this purpose to get at least some build checking (even though an enum is type equivalent to an int in C). v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c") introduced the enum which was cpp defines previously. Since struct zfcp_erp_action type is embedded into other structures living in zfcp_def.h, we have to move enum zfcp_erp_act_type from its private definition in zfcp_erp.c to the zfcp-global zfcp_def.h Silence some false -Wswitch compiler warning cases with individual NOP cases. When adding more enum values and building with W=1 we would get compiler warnings about missed new cases. Add missing break statements in some of the above switch cases. No functional change, but making it future-proof. I think all of these should have had a break statement ever since, even if these switch cases happened to be the last ones in the switch statement body. "Fall through" in the context of switch case usually means not to have a break and fall through to the subsequent switch case. However, I think this old comment meant that here we do not have an _early return_ in the switch case but the code path continues after the switch case body. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
df91eefd |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: the action field of zfcp_erp_action is actually the type &zfcp_erp_action.action ==> &zfcp_erp_action.type While at it, make use of the already defined enum for this purpose to get at least some build checking (even though an enum is type equivalent to an int in C). v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c") introduced the enum which was cpp defines previously. To prevent compiler warnings with the switch(act->type), we have to separate the recently added eyecatchers from enum zfcp_erp_act_type. Since struct zfcp_erp_action type is embedded into other structures living in zfcp_def.h, we have to move enum zfcp_erp_act_type from its private definition in zfcp_erp.c to the zfcp-global zfcp_def.h. Silence one false -Wswitch compiler warning case: LUNs as the leaves in our object tree do not have any follow-up success recovery. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
208d0961 |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: clarify function argument name for trace tag string v2.6.30 commit 5ffd51a5e495 ("[SCSI] zfcp: replace current ERP logging with a more convenient version") changed trace record distinguishing from a numerical ID to a 7 character string called "trace tag". While starting to use function arguments with different type and semantics, it did not change the argument name accordingly. v2.6.38 commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") renamed variable names "id" into "tag" but only within zfcp_dbf.*, not within zfcp_erp.c. This was a bit confusing since the remainder of zfcp does use the term "trace tag". Also "id" is quite generic and it's not obvious for what. Just unify it consistently and use the "dbf" prefix to relate the arguments to the code in zfcp_dbf.*. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
64eba384 |
|
08-Nov-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: ERP thread setup kdoc update zfcp_erp_thread_setup() update complements v2.6.32 commit 347c6a965dc1 ("[SCSI] zfcp: Use kthread API for zfcp erp thread"). Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
35e9111a |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: support SCSI_ADAPTER_RESET via scsi_host sysfs attribute host_reset Make use of feature introduced with v3.2 commit 294436914454 ("[SCSI] scsi: Added support for adapter and firmware reset"). The common code interface was introduced for commit 95d31262b3c1 ("[SCSI] qla4xxx: Added support for adapter and firmware reset"). $ echo adapter > /sys/class/scsi_host/host<N>/host_reset Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : scshr_y SCSI sysfs host_reset yes LUN : 0xffffffffffffffff none (invalid) WWPN : 0x0000000000000000 none (invalid) D_ID : 0x00000000 none (invalid) Adapter status : 0x4500050b Port status : 0x00000000 none (invalid) LUN status : 0x00000000 none (invalid) Ready count : 0x00000001 Running count : 0x00000000 ERP want : 0x04 ZFCP_ERP_ACTION_REOPEN_ADAPTER ERP need : 0x04 ZFCP_ERP_ACTION_REOPEN_ADAPTER This is the common code equivalent to the zfcp-specific &dev_attr_adapter_failed.attr in zfcp_sysfs_adapter_attrs.attrs[]: $ echo 0 > /sys/bus/ccw/drivers/zfcp/<devbusid>/failed The unsupported case returns EOPNOTSUPP: $ echo firmware > /sys/class/scsi_host/host<N>/host_reset -bash: echo: write error: Operation not supported Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : scshr_n SCSI sysfs host_reset no Request ID : 0x0000000000000000 none (invalid) SCSI ID : 0xffffffff none (invalid) SCSI LUN : 0xffffffff none (invalid) SCSI LUN high : 0xffffffff none (invalid) SCSI result : 0xffffffa1 -EOPNOTSUPP==-95 SCSI retries : 0xff none (invalid) SCSI allowed : 0xff none (invalid) SCSI scribble : 0xffffffffffffffff none (invalid) SCSI opcode : ffffffff ffffffff ffffffff ffffffff none (invalid) FCP rsp inf cod: 0xff none (invalid) FCP rsp IU : 00000000 00000000 00000000 00000000 none (invalid) 00000000 00000000 For any other invalid value, common code returns EINVAL without invoking our callback: $ echo foo > /sys/class/scsi_host/host<N>/host_reset -bash: echo: write error: Invalid argument Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
2fdd45fd |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: remove unused return values of ERP trigger functions Since v2.6.27 commit 553448f6c483 ("[SCSI] zfcp: Message cleanup"), none of the callers has been interested any more. Values were not returned consistently in all ERP trigger functions. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
013af857 |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: zfcp_erp_action_exists() does only check for running Simplify its signature to return boolean and rename it to zfcp_erp_action_is_running() to indicate its actual unmodified semantics. It has always been used like this since v2.6.0 history commit ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter."). Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
cd4a186a |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: remove unused ERP enum values All constant defines were introduced with v2.6.0 history commit ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.") and refactored into enums with commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c"). ZFCP_STATUS_ERP_DISMISSING and ZFCP_ERP_STEP_FSF_XCONFIG were never used. v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c") removed the use of ZFCP_ERP_ACTION_READY on refactoring zfcp_erp_action_exists() to now only check adapter->erp_running_head but no longer adapter->erp_ready_head. The same commit could have changed the function return type from int to "enum zfcp_erp_act_state". ZFCP_ERP_ACTION_READY was never used outside of zfcp_erp_action_exists(). Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d39eda54 |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: consistently use function name space prefix I've been mixing up zfcp_task_mgmt_function() [SCSI] and zfcp_fsf_fcp_task_mgmt() [FSF] so often lately that I wanted to fix this. SCSI changes complement v2.6.27 commit f76af7d7e363 ("[SCSI] zfcp: Cleanup of code in zfcp_scsi.c"). While at it, also fixup the other inconsistencies elsewhere. ERP changes complement v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c") which introduced status_change_set(). FC changes complement v2.6.32 commit 6f53a2d2ecae ("[SCSI] zfcp: Apply common naming conventions to zfcp_fc"). by renaming a leftover introduced with v2.6.27 commit cc8c282963bd ("[SCSI] zfcp: Automatically attach remote ports"). FSF changes fixup v2.6.32 commit a4623c467ff7 ("[SCSI] zfcp: Improve request allocation through mempools"). which replaced zfcp_fsf_alloc_qtcb() introduced with v2.6.27 commit c41f8cbddd4e ("[SCSI] zfcp: zfcp_fsf cleanup."). SCSI fc_host statistics were introduced with v2.6.16 commit f6cd94b126aa ("[SCSI] zfcp: transport class adaptations"). SCSI fc_host port_state was introduced with v2.6.27 commit 85a82392fe6f ("[SCSI] zfcp: Add port_state attribute to sysfs"). SCSI rport setter for dev_loss_tmo was introduced with v2.6.18 commit 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable"). Signed-off-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6a765508 |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : ....... LUN : 0x... WWPN : 0x... D_ID : 0x... Adapter status : 0x... Port status : 0x... LUN status : 0x... Ready count : 0x... Running count : 0x... ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_... ERP need : 0xc0 ZFCP_ERP_ACTION_NONE Signed-off-by: Steffen Maier <maier@linux.ibm.com> Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
8c3d20aa |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED That other commit introduced an inconsistency because it would trace on ERP_FAILED for all callers of port forced reopen triggers (not just terminate_rport_io), but it would not trace on ERP_FAILED for all callers of other ERP triggers such as adapter, port regular, LUN. Therefore, generalize that other commit. zfcp_erp_action_enqueue() already had two early outs which re-used the one zfcp_dbf_rec_trig() call. All ERP trigger functions finally run through zfcp_erp_action_enqueue(). So move the special handling for ZFCP_STATUS_COMMON_ERP_FAILED into zfcp_erp_action_enqueue() and add another early out with new trace marker for pseudo ERP need in this case. This removes all early returns from all ERP trigger functions so we always end up at zfcp_dbf_rec_trig(). Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : ....... LUN : 0x... WWPN : 0x... D_ID : 0x... Adapter status : 0x... Port status : 0x... LUN status : 0x... Ready count : 0x... Running count : 0x... ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_... ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED Signed-off-by: Steffen Maier <maier@linux.ibm.com> Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
d70aab55 |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED For problem determination we always want to see when we were invoked on the terminate_rport_io callback whether we perform something or not. Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec: loose remote port t workqueue [s] zfcp_q_<dev> IRQ zfcperp<dev> === ================== =================== ============================ 0 recv RSCN q p.test_link_work block rport start fast_io_fail_tmo send ADISC ELS 4 recv ADISC fail block zfcp_port port forced reopen send open port 12 recv open port fail q p.gid_pn_work zfcp_erp_wakeup (zfcp_erp_wait would return) GID_PN fail Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail, e.g. with the typical 5 sec setting. port.status |= ERP_FAILED If fast_io_fail_tmo triggers after this point, we missed a SCSI trace. workqueue fc_dl_<host> ================== 27 fc_timeout_fail_rport_io fc_terminate_rport_io zfcp_scsi_terminate_rport_io zfcp_erp_port_forced_reopen _zfcp_erp_port_forced_reopen if (port.status & ERP_FAILED) return; Therefore, write a trace before above early return. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : sctrpi1 SCSI terminate rport I/O LUN : 0xffffffffffffffff none (invalid) WWPN : 0x<wwpn> D_ID : 0x<n_port_id> Adapter status : 0x... Port status : 0x... LUN status : 0x00000000 none (invalid) Ready count : 0x... Running count : 0x... ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED Signed-off-by: Steffen Maier <maier@linux.ibm.com> Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
96d92704 |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return get_device() and its internally used kobject_get() only return NULL if they get passed NULL as argument. zfcp_get_port_by_wwpn() loops over adapter->port_list so the iteration variable port is always non-NULL. Struct device is embedded in struct zfcp_port so &port->dev is always non-NULL. This is the argument to get_device(). However, if we get an fc_rport in terminate_rport_io() for which we cannot find a match within zfcp_get_port_by_wwpn(), the latter can return NULL. v2.6.30 commit 70932935b61e ("[SCSI] zfcp: Fix oops when port disappears") introduced an early return without adding a trace record for this case. Even if we don't need recovery in this case, for debugging we should still see that our callback was invoked originally by scsi_transport_fc. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : sctrpin SCSI terminate rport I/O, no zfcp port LUN : 0xffffffffffffffff none (invalid) WWPN : 0x<wwpn> WWPN D_ID : 0x<n_port_id> N_Port-ID Adapter status : 0x... Port status : 0xffffffff unknown (-1) LUN status : 0x00000000 none (invalid) Ready count : 0x... Running count : 0x... ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED ERP need : 0xc0 ZFCP_ERP_ACTION_NONE Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: 70932935b61e ("[SCSI] zfcp: Fix oops when port disappears") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
512857a7 |
|
17-May-2018 |
Steffen Maier <maier@linux.ibm.com> |
scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed If a SCSI device is deleted during scsi_eh host reset, we cannot get a reference to the SCSI device anymore since scsi_device_get returns !=0 by design. Assuming the recovery of adapter and port(s) was successful, zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for the half-gone SCSI device. Unfortunately, it causes the following confusing trace record which states that zfcp will do a LUN recovery as "ERP need" is ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want". Old example trace record formatted with zfcpdbf from s390-tools: Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded LUN : 0x<FCP_LUN> WWPN : 0x<WWPN> D_ID : 0x<N_Port-ID> Adapter status : 0x5400050b Port status : 0x54000001 LUN status : 0x40000000 ZFCP_STATUS_COMMON_RUNNING but not ZFCP_STATUS_COMMON_UNBLOCKED as it was closed on close part of adapter reopen ERP want : 0x01 ERP need : 0x01 misleading However, zfcp_erp_setup_act() returns NULL as it cannot get the reference. Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery actually happens. We always do want the recovery trigger trace record even if no erp_action could be enqueued as in this case. For other cases where we did not enqueue an erp_action, 'need' has always been zero to indicate this. In order to indicate above goto out, introduce an eyecatcher "flag" to mark the "ERP need" as 'not needed' but still keep the information which erp_action type, that zfcp_erp_required_act() had decided upon, is needed. 0xc_ is chosen to be visibly different from 0x0_ in "ERP want". New example trace record formatted with zfcpdbf from s390-tools: Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded LUN : 0x<FCP_LUN> WWPN : 0x<WWPN> D_ID : 0x<N_Port-ID> Adapter status : 0x5400050b Port status : 0x54000001 LUN status : 0x40000000 ERP want : 0x01 ERP need : 0xc1 would need LUN ERP, but no action set up ^ Before v2.6.38 commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") we could detect this case because the "erp_action" field in the trace was NULL. The rework removed erp_action as argument and field from the trace. This patch here is for tracing. A fix to allow LUN recovery in the case at hand is a topic for a separate patch. See also commit fdbd1c5e27da ("[SCSI] zfcp: Allow running unit/LUN shutdown without acquiring reference") for a similar case and background info. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
5c13db9b |
|
17-Oct-2017 |
Steffen Maier <maier@linux.vnet.ibm.com> |
zfcp: purely mechanical update using timer API, plus blank lines erp_memwait only occurs in seldom memory pressure situations. The typical case never uses the associated timer and thus also does not need to initialize the timer. Also, we don't want to re-initialize the timer each time we re-use an erp_action in zfcp_erp_setup_act() [see also v4.14-rc7 commit ab31fd0ce65e ("scsi: zfcp: fix erp_action use-before-initialize in REC action trace") for erp_action life cycle]. Hence, retain the lazy inintialization of zfcp_erp_action.timer in zfcp_erp_strategy_memwait(). Add an empty line after declarations in zfcp_erp_timeout_handler() and zfcp_fsf_request_timeout_handler() even though it was also missing before the timer conversion. Fix checkpatch warning: WARNING: function definition argument 'struct timer_list *' should also have an identifier name +extern void zfcp_erp_timeout_handler(struct timer_list *); Depends-on: v4.14-rc3 commit 686fef928bba ("timer: Prepare to change timer callback argument type") Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
75492a51 |
|
16-Oct-2017 |
Kees Cook <keescook@chromium.org> |
s390/scsi: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Steffen Maier <maier@linux.vnet.ibm.com> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
ab31fd0c |
|
13-Oct-2017 |
Steffen Maier <maier@linux.vnet.ibm.com> |
scsi: zfcp: fix erp_action use-before-initialize in REC action trace v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery") extended accessing parent pointer fields of struct zfcp_erp_action for tracing. If an erp_action has never been enqueued before, these parent pointer fields are uninitialized and NULL. Examples are zfcp objects freshly added to the parent object's children list, before enqueueing their first recovery subsequently. In zfcp_erp_try_rport_unblock(), we iterate such list. Accessing erp_action fields can cause a NULL pointer dereference. Since the kernel can read from lowcore on s390, it does not immediately cause a kernel page fault. Instead it can cause hangs on trying to acquire the wrong erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl() ^bogus^ while holding already other locks with IRQs disabled. Real life example from attaching lots of LUNs in parallel on many CPUs: crash> bt 17723 PID: 17723 TASK: ... CPU: 25 COMMAND: "zfcperp0.0.1800" LOWCORE INFO: -psw : 0x0404300180000000 0x000000000038e424 -function : _raw_spin_lock_wait_flags at 38e424 ... #0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp] #1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp] #2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp] #3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp] #4 [fdde8fe60] kthread at 173550 #5 [fdde8feb8] kernel_thread_starter at 10add2 zfcp_adapter zfcp_port zfcp_unit <address>, 0x404040d600000000 scsi_device NULL, returning early! zfcp_scsi_dev.status = 0x40000000 0x40000000 ZFCP_STATUS_COMMON_RUNNING crash> zfcp_unit <address> struct zfcp_unit { erp_action = { adapter = 0x0, port = 0x0, unit = 0x0, }, } zfcp_erp_action is always fully embedded into its container object. Such container object is never moved in its object tree (only add or delete). Hence, erp_action parent pointers can never change. To fix the issue, initialize the erp_action parent pointers before adding the erp_action container to any list and thus before it becomes accessible from outside of its initializing function. In order to also close the time window between zfcp_erp_setup_act() memsetting the entire erp_action to zero and setting the parent pointers again, drop the memset and instead explicitly initialize individually all erp_action fields except for parent pointers. To be extra careful not to introduce any other unintended side effect, even keep zeroing the erp_action fields for list and timer. Also double-check with WARN_ON_ONCE that erp_action parent pointers never change, so we get to know when we would deviate from previous behavior. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery") Cc: <stable@vger.kernel.org> #2.6.32+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bc46427e |
|
27-Jul-2017 |
Lukáš Korenčik <xkorenc1@fi.muni.cz> |
scsi: zfcp: use setup_timer instead of init_timer Use initialization with setup_timer function instead of using init_timer function and data fields. It improves readability. Signed-off-by: Lukáš Korenčik <xkorenc1@fi.muni.cz> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
6f2ce1c6 |
|
09-Dec-2016 |
Steffen Maier <maier@linux.vnet.ibm.com> |
scsi: zfcp: fix rport unblock race with LUN recovery It is unavoidable that zfcp_scsi_queuecommand() has to finish requests with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time window when zfcp detected an unavailable rport but fc_remote_port_delete(), which is asynchronous via zfcp_scsi_schedule_rport_block(), has not yet blocked the rport. However, for the case when the rport becomes available again, we should prevent unblocking the rport too early. In contrast to other FCP LLDDs, zfcp has to open each LUN with the FCP channel hardware before it can send I/O to a LUN. So if a port already has LUNs attached and we unblock the rport just after port recovery, recoveries of LUNs behind this port can still be pending which in turn force zfcp_scsi_queuecommand() to unnecessarily finish requests with DID_IMM_RETRY. This also opens a time window with unblocked rport (until the followup LUN reopen recovery has finished). If a scsi_cmnd timeout occurs during this time window fc_timed_out() cannot work as desired and such command would indeed time out and trigger scsi_eh. This prevents a clean and timely path failover. This should not happen if the path issue can be recovered on FC transport layer such as path issues involving RSCNs. Fix this by only calling zfcp_scsi_schedule_rport_register(), to asynchronously trigger fc_remote_port_add(), after all LUN recoveries as children of the rport have finished and no new recoveries of equal or higher order were triggered meanwhile. Finished intentionally includes any recovery result no matter if successful or failed (still unblock rport so other successful LUNs work). For simplicity, we check after each finished LUN recovery if there is another LUN recovery pending on the same port and then do nothing. We handle the special case of a successful recovery of a port without LUN children the same way without changing this case's semantics. For debugging we introduce 2 new trace records written if the rport unblock attempt was aborted due to still unfinished or freshly triggered recovery. The records are only written above the default trace level. Benjamin noticed the important special case of new recovery that can be triggered between having given up the erp_lock and before calling zfcp_erp_action_cleanup() within zfcp_erp_strategy(). We must avoid the following sequence: ERP thread rport_work other context ------------------------- -------------- -------------------------------- port is unblocked, rport still blocked, due to pending/running ERP action, so ((port->status & ...UNBLOCK) != 0) and (port->rport == NULL) unlock ERP zfcp_erp_action_cleanup() case ZFCP_ERP_ACTION_REOPEN_LUN: zfcp_erp_try_rport_unblock() ((status & ...UNBLOCK) != 0) [OLD!] zfcp_erp_port_reopen() lock ERP zfcp_erp_port_block() port->status clear ...UNBLOCK unlock ERP zfcp_scsi_schedule_rport_block() port->rport_task = RPORT_DEL queue_work(rport_work) zfcp_scsi_rport_work() (port->rport_task != RPORT_ADD) port->rport_task = RPORT_NONE zfcp_scsi_rport_block() if (!port->rport) return zfcp_scsi_schedule_rport_register() port->rport_task = RPORT_ADD queue_work(rport_work) zfcp_scsi_rport_work() (port->rport_task == RPORT_ADD) port->rport_task = RPORT_NONE zfcp_scsi_rport_register() (port->rport == NULL) rport = fc_remote_port_add() port->rport = rport; Now the rport was erroneously unblocked while the zfcp_port is blocked. This is another situation we want to avoid due to scsi_eh potential. This state would at least remain until the new recovery from the other context finished successfully, or potentially forever if it failed. In order to close this race, we take the erp_lock inside zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or LUN. With that, the possible corresponding rport state sequences would be: (unblock[ERP thread],block[other context]) if the ERP thread gets erp_lock first and still sees ((port->status & ...UNBLOCK) != 0), (block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock after the other context has already cleard ...UNBLOCK from port->status. Since checking fields of struct erp_action is unsafe because they could have been overwritten (re-used for new recovery) meanwhile, we only check status of zfcp_port and LUN since these are only changed under erp_lock elsewhere. Regarding the check of the proper status flags (port or port_forced are similar to the shown adapter recovery): [zfcp_erp_adapter_shutdown()] zfcp_erp_adapter_reopen() zfcp_erp_adapter_block() * clear UNBLOCK ---------------------------------------+ zfcp_scsi_schedule_rports_block() | write_lock_irqsave(&adapter->erp_lock, flags);-------+ | zfcp_erp_action_enqueue() | | zfcp_erp_setup_act() | | * set ERP_INUSE -----------------------------------|--|--+ write_unlock_irqrestore(&adapter->erp_lock, flags);--+ | | .context-switch. | | zfcp_erp_thread() | | zfcp_erp_strategy() | | write_lock_irqsave(&adapter->erp_lock, flags);------+ | | ... | | | zfcp_erp_strategy_check_target() | | | zfcp_erp_strategy_check_adapter() | | | zfcp_erp_adapter_unblock() | | | * set UNBLOCK -----------------------------------|--+ | zfcp_erp_action_dequeue() | | * clear ERP_INUSE ---------------------------------|-----+ ... | write_unlock_irqrestore(&adapter->erp_lock, flags);-+ Hence, we should check for both UNBLOCK and ERP_INUSE because they are interleaved. Also we need to explicitly check ERP_FAILED for the link down case which currently does not clear the UNBLOCK flag in zfcp_fsf_link_down_info_eval(). Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 8830271c4819 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport") Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors") Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI") Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable") Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again") Cc: <stable@vger.kernel.org> #2.6.32+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
4eeaa4f3 |
|
10-Aug-2016 |
Steffen Maier <maier@linux.vnet.ibm.com> |
zfcp: close window with unblocked rport during rport gone On a successful end of reopen port forced, zfcp_erp_strategy_followup_success() re-uses the port erp_action and the subsequent zfcp_erp_action_cleanup() now sees ZFCP_ERP_SUCCEEDED with erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED but must not perform zfcp_scsi_schedule_rport_register(). We can detect this because the fresh port reopen erp_action is in its very first step ZFCP_ERP_STEP_UNINITIALIZED. Otherwise this opens a time window with unblocked rport (until the followup port reopen recovery would block it again). If a scsi_cmnd timeout occurs during this time window fc_timed_out() cannot work as desired and such command would indeed time out and trigger scsi_eh. This prevents a clean and timely path failover. This should not happen if the path issue can be recovered on FC transport layer such as path issues involving RSCNs. Also, unnecessary and repeated DID_IMM_RETRY for pending and undesired new requests occur because internally zfcp still has its zfcp_port blocked. As follow-on errors with scsi_eh, it can cause, in the worst case, permanently lost paths due to one of: sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk! sd <scsidev>: Device offlined - not ready after error recovery For fix validation and to aid future debugging with other recoveries we now also trace (un)blocking of rports. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 5767620c383a ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED") Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors") Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI") Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable") Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again") Cc: <stable@vger.kernel.org> #2.6.32+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
805de8f4 |
|
23-Apr-2015 |
Peter Zijlstra <peterz@infradead.org> |
atomic: Replace atomic_{set,clear}_mask() usage Replace the deprecated atomic_{set,clear}_mask() usage with the now ubiquous atomic_{or,andnot}() functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
11d83360 |
|
14-Apr-2015 |
David Rientjes <rientjes@google.com> |
mm, mempool: do not allow atomic resizing Allocating a large number of elements in atomic context could quickly deplete memory reserves, so just disallow atomic resizing entirely. Nothing currently uses mempool_resize() with anything other than GFP_KERNEL, so convert existing callers to drop the gfp_mask. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Steffen Maier <maier@linux.vnet.ibm.com> [zfcp] Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
1b33ef23 |
|
13-Nov-2014 |
Martin Peschke <mpeschke@linux.vnet.ibm.com> |
zfcp: remove access control tables interface (port leftovers) This patch removes some leftovers for commit 663e0890e31cb85f0cca5ac1faaee0d2d52880b5 "[SCSI] zfcp: remove access control tables interface". The "access denied" case for ports is gone, as well. The corresponding flag was cleared, but never set. So clean it up. Sysfs flag is kept, though, for backward-compatibility. Now it returns always 0. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
924dd584 |
|
22-Aug-2013 |
Martin Peschke <mpeschke@linux.vnet.ibm.com> |
[SCSI] zfcp: fix schedule-inside-lock in scsi_device list loops BUG: sleeping function called from invalid context at kernel/workqueue.c:2752 in_atomic(): 1, irqs_disabled(): 1, pid: 360, name: zfcperp0.0.1700 CPU: 1 Not tainted 3.9.3+ #69 Process zfcperp0.0.1700 (pid: 360, task: 0000000075b7e080, ksp: 000000007476bc30) <snip> Call Trace: ([<00000000001165de>] show_trace+0x106/0x154) [<00000000001166a0>] show_stack+0x74/0xf4 [<00000000006ff646>] dump_stack+0xc6/0xd4 [<000000000017f3a0>] __might_sleep+0x128/0x148 [<000000000015ece8>] flush_work+0x54/0x1f8 [<00000000001630de>] __cancel_work_timer+0xc6/0x128 [<00000000005067ac>] scsi_device_dev_release_usercontext+0x164/0x23c [<0000000000161816>] execute_in_process_context+0x96/0xa8 [<00000000004d33d8>] device_release+0x60/0xc0 [<000000000048af48>] kobject_release+0xa8/0x1c4 [<00000000004f4bf2>] __scsi_iterate_devices+0xfa/0x130 [<000003ff801b307a>] zfcp_erp_strategy+0x4da/0x1014 [zfcp] [<000003ff801b3caa>] zfcp_erp_thread+0xf6/0x2b0 [zfcp] [<000000000016b75a>] kthread+0xf2/0xfc [<000000000070c9de>] kernel_thread_starter+0x6/0xc [<000000000070c9d8>] kernel_thread_starter+0x0/0xc Apparently, the ref_count for some scsi_device drops down to zero, triggering device removal through execute_in_process_context(), while the lldd error recovery thread iterates through a scsi device list. Unfortunately, execute_in_process_context() decides to immediately execute that device removal function, instead of scheduling asynchronous execution, since it detects process context and thinks it is safe to do so. But almost all calls to shost_for_each_device() in our lldd are inside spin_lock_irq, even in thread context. Obviously, schedule() inside spin_lock_irq sections is a bad idea. Change the lldd to use the proper iterator function, __shost_for_each_device(), in combination with required locking. Occurences that need to be changed include all calls in zfcp_erp.c, since those might be executed in zfcp error recovery thread context with a lock held. Other occurences of shost_for_each_device() in zfcp_fsf.c do not need to be changed (no process context, no surrounding locking). The problem was introduced in Linux 2.6.37 by commit b62a8d9b45b971a67a0f8413338c230e3117dff5 "[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit". Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Cc: stable@vger.kernel.org #2.6.37+ Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
#
663e0890 |
|
26-Apr-2013 |
Martin Peschke <mpeschke@linux.vnet.ibm.com> |
[SCSI] zfcp: remove access control tables interface This patch removes an interface that was used to manage access control tables within the HBA. The patch consequently removes the handling for conditions related to those access control tables, too. That initiator-based access control feature was only needed until the introduction of NPIV and was withdrawn with z10 years ago. It's time to cleanup the corresponding device driver code. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
#
43f60cbd |
|
04-Sep-2012 |
Steffen Maier <maier@linux.vnet.ibm.com> |
[SCSI] zfcp: No automatic port_rescan on events In FC fabrics with large zones, the automatic port_rescan on incoming ELS and any adapter recovery can cause quite some traffic at the very same time, especially if lots of Linux images share an HBA, which is common on s390. This can cause trouble and failures. Fix this by making such port rescans dependent on a user configurable module parameter. The following unconditional automatic port rescans remain as is: On setting an adapter online and on manual user-triggered writes to the sysfs attribute port_rescan. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
#
a53c8fab |
|
20-Jul-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/comments: unify copyright messages and remove file names Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of sightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
038d9446 |
|
22-Feb-2011 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Add information to symbolic port name when running in NPIV mode Query the FC symbolic port name for reporting in the fc_host sysfs and enable the symbolic_name attribute in the fc_host sysfs. When running in NPIV mode, extend the symbolic port name with the devno and the hostname. This allows better identification of Linux systems for SAN and storage administrators. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
c7b279ae |
|
22-Feb-2011 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Replace kmem_cache for "status read" data zfcp requires a mempool for the status read data blocks to resubmit the "status read" requests at any time. Each status read data block has the size of a page (4096 bytes) and needs to be placed in one page. Instead of having a kmem_cache for allocating page sized chunks, use mempool_create_page_pool to create a mempool returning pages and remove the zfcp kmem_cache. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
3d63d3b4 |
|
02-Dec-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Move qdio setup from erp to zfcp_qdio.c Initialization of the qdio waitqueue should happen when the qdio data is initialized and the QDIOUP flag should be handled in the qdio code as well. Adjust the code accordingly and remove the superfluos function zfcp_erp_adapter_strategy_open_qdio. Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
ea4a3a6a |
|
02-Dec-2010 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Redesign of the debug tracing final cleanup. This patch is the final cleanup of the redesign from the zfcp tracing. Structures and elements which were used by multiple areas of the former debug tracing are now changed to the new scheme. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
ae0904f6 |
|
02-Dec-2010 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Redesign of the debug tracing for recovery actions. The tracing environment of the zfcp LLD has become very bulky and hard to maintain. Small changes involve a large modification process which is error-prone and not effective. This patch is the first of a set to redesign the zfcp tracing to a more straight-forward and easy to extend scheme. It removes all interpretation and visualization parts and focuses on bare logging of the information. This patch deals with all trace records of the zfcp error recovery. Signed-off-by: Swen schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
14718e3c |
|
17-Nov-2010 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Prevent usage w/o holding a reference The ERP got values assigned for which no reference was taken. This can lead to an unpredictable race condition. Fix this by only assigning the values which are required and for which a reference was pulled or is held implicitly. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
d3e1088d |
|
17-Nov-2010 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: No ERP escalation on gpn_ft eval If the evaluation of GPN_FT requests wants to remove an invalid port from the system the zfcp_erp_port_shutdown function is triggered. Depending on the system status a superior action (e.g. adapter reopen) is required. This can lead to an invalid mem access of the port struct which might be freed at the time since the superior action is not holding a reference of the port which triggered this ERP action. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
edaed859 |
|
08-Sep-2010 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Replace status modifier functions. Replace the zfcp_modify_<xxx>_status functions and its accompanying wrappers with dedicated status modifier functions. This eases code readability and maintenance. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
a1ca4831 |
|
08-Sep-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Move ACL/CFDC code to zfcp_cfdc.c Move the code evaluating the ACL/CFDC specific errors to the file zfcp_cfdc.c. With this change, all code related to the old access control feature is kept in one file, not split across zfcp_erp.c and zfcp_fsf.c. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
b62a8d9b |
|
08-Sep-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit This is the large change to switch from using the data in zfcp_unit to zfcp_scsi_dev. Keeping everything working requires doing the switch in one piece. To ensure that no code keeps using the data in zfcp_unit, this patch also removes the data from zfcp_unit that is now being replaced with zfcp_scsi_dev. For zfcp, the scsi_device together with zfcp_scsi_dev exist from the call of slave_alloc to the call of slave_destroy. The data in zfcp_scsi_dev is initialized in zfcp_scsi_slave_alloc and the LUN is opened; the final shutdown for the LUN is run from slave_destroy. Where the scsi_device or zfcp_scsi_dev is needed, the pointer to the scsi_device is passed as function argument and inside the function converted to the pointer to zfcp_scsi_dev; this avoids back and forth conversion betweeen scsi_device and zfcp_scsi_dev. While changing the function arguments from zfcp_unit to scsi_device, the functions names are renamed form "unit" to "lun". This is to have a seperation between zfcp_scsi_dev/LUN and the zfcp_unit; only code referring to the remaining configuration information in zfcp_unit struct uses "unit". Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
fdbd1c5e |
|
08-Sep-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Allow running unit/LUN shutdown without acquiring reference With the change for the LUN data to be part of the scsi_device, the slave_destroy callback will be the final call to the zfcp_erp_unit_shutdown function. The erp tries to acquire a reference to run the action asynchronously and fail, if it cannot get the reference. But calling scsi_device_get from slave_destroy will fail, because the scsi_device is already in the state SDEV_DEL. Introduce a new call into the zfcp erp to solve this: The function zfcp_erp_unit_shutdown_wait will close the LUN and wait for the erp to finish without acquiring an additional reference. The wait allows to omit the reference; the caller waiting for the erp to finish already has a reference that holds the struct in place. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
f7bd7c36 |
|
08-Jul-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fix retry after failed "open port" erp action Trying to enqueue a port erp action from the port erp strategy will fail in zfcp_erp_required_act. To try the same action again, return ZFCP_ERP_FAILED. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
9c785d94 |
|
08-Jul-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fail erp after timeout After a timeout notification, do not try to run the erp strategy. Return from the erp with "failed" to possibly trigger a retry. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
5a7de559 |
|
08-Jul-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Register SCSI devices after successful fc_remote_port_add When the successful return of an adisc is the final step to set the port online, the registration of SCSI devices might be omitted. SCSI devices that have been removed before (due to a short dev_loss_tmo setting) might not be attached again. The problem is that the registration of SCSI devices is done only after erp has finished. The correct place would be after the call to fc_remote_port_add to mimick the scan in the FC transport class. Change the registration of SCSI devices to be triggered after the fc_remote_port_add call. For the initial inquiry command to succeed, the unit must also be open. If the unit reopen is still pending, the inquiry command to the LUN will be deferred with DID_IMM_RETRY, so there is no harm from this approach. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
097ef3bd |
|
08-Jul-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Do not try "forced close" when port is already closed When the port is already "physically closed" try the reopen instead. There is no way to send a "physically close" to an already closed port. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
5767620c |
|
08-Jul-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED When the REOPEN_PORT_FORCED erp action succeeds, the port has been closed. A REOPEN_PORT will try to open the port after the REPORT_PORT_FORCED. The rport should only be unblocked after the successful completion of the reopen port. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
8d88cf3f |
|
21-Jun-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Update status read mempool Commit 64deb6efdc5504ce97b5c1c6f281fffbc150bd93 changed the way status read buffers are handled but forgot to adjust the mempool to the new size. Add the call to resize the mempool after the exchange config data. Also use the define instead of the hard coded number in the fsf callback for consistency. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
64deb6ef |
|
30-Apr-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Use status_read_buf_num provided by FCP channel The FCP channel provides the number of status read buffers to issue. Use the provided number instead of the hardcoded number in zfcp. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
615f59e0 |
|
17-Feb-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Rename sysfs_device attribute to dev in zfcp_unit and zfcp_port Kernel code uses dev as short name for the struct device. Rename the sysfs_device in zfcp_unit and zfcp_port to match this convention. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
b6bd2fb9 |
|
17-Feb-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Move FSF request tracking code to new file Move the code for tracking FSF requests to new file to have this code in one place. The functions for adding and removing requests on the I/O path are already inline. The alloc and free functions are only called once, so it does not hurt to inline them and add them to the same file. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
e60a6d69 |
|
17-Feb-2010 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Remove function zfcp_reqlist_find_safe Always use the FSF request id as a reference to the FSF request. With this change the function zfcp_reqlist_find_safe is no longer needed and can be removed. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
9eae07ef |
|
24-Nov-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Assign scheduled work to driver queue The port_scan work was scheduled to the work_queue provided by the kernel. This resulted on SMP systems to a likely situation that more than one scan_work were processed in parallel. This is not required and openes the possibility of race conditions between the removal of invalid ports and the enqueue of just scanned ports. This patch synchronizes the scan_work tasks by scheduling them to adapter local work_queue. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
6b183334 |
|
24-Nov-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Remove STATUS_COMMON_REMOVE flag as it is not required anymore The flag ZFCP_STATUS_COMMON_REMOVE was used to indicate that a resource is not ready to be used or about to be removed from the system. This is now better done by an improved list handling and therefore the additional indicator is not required anymore. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
f3450c7b |
|
24-Nov-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Replace local reference counting with common kref Replace the local reference counting by already available mechanisms offered by kref. Where possible existing device structures were used, including the same functionality. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
ecf0c772 |
|
24-Nov-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Replace global config_lock with local list locks The global config_lock was used to protect the configuration organized in independent lists. It is not necessary to have a lock on driver level for this purpose. This patch replaces the global config_lock with a set of local list locks. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
934aeb587 |
|
14-Oct-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Handle WWPN mismatch in PLOGI payload For ports, zfcp gets the DID from the FC nameserver and tries to open the port. If the open succeeds, zfcp compares the WWPN from the nameserver with the WWPN in the PLOGI payload. In case of a mismatch, zfcp assumes that the DID of the port just changed and we opened the wrong port. This means that zfcp has to forget the DID, lookup the DID again and retry. This error case had a problem that zfcp forgets the DID, but never looks up a new one, stalling the ERP in this case. Fix this by triggering the DID lookup and properly exit from the ERP. The DID lookup will trigger a new ERP action. Also ensure when trying to open the port again with the new DID, first close the open port, even in the NOESC case. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
143bb6bf |
|
18-Aug-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Defer resource allocation to first ccw_set_online call So far, zfcp allocated all resources required for FCP adapters/subchannels when the device was discovered in the ccw_probe callback. If there are lots of unused FCP subchannels attached to a system, this is a waste of resources. To alleviate this, defer the resource allocation to the first call to ccw_set_online. To avoid disruptions during possible following calls to ccw_set_offline and then ccw_set_online, keep the adapter resources until the device is finally being removed via ccw_remove. While doing this, also manage the zfcp erp thread together with all other adapter resources in zfcp_adapter_enqueue/dequeue. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
347c6a96 |
|
18-Aug-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Use kthread API for zfcp erp thread Switch the creation of the zfcp erp thread from the deprecated kernel_thread API to the kthread API. This allows also the removal of some flags in zfcp since the kthread API handles thread creation and shutdown internally. To allow the usage of the kthread_stop function, replace the erp ready semaphore with a waitqueue for waiting until erp actions arrive on the ready queue. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
5771710b |
|
18-Aug-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Update dbf calls Change the dbf data and functions to use the zfcp_dbf prefix throughout the code. Also change the calls to dbf to use zfcp_dbf instead of zfcp_adapter. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
799b76d0 |
|
18-Aug-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Decouple gid_pn requests from erp Don't let the erp wait for gid_pn requests to complete. Instead, queue the gid_pn work, exit erp and let the finished gid_pn work trigger a new port reopen. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
564e1c86 |
|
18-Aug-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Move qdio related data out of zfcp_adapter The zfcp_adapter structure was growing over time to a size of almost one memory page. To reduce the size of the data structure and to seperate different layers, put all qdio related data in the new zfcp_qdio data structure. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
4544683a |
|
18-Aug-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Move workqueue to adapter struct Remove the global driver work queue and replace it with a workqueue local to the adapter. The usage of this workqueue makes this the correct place for the structure. In addition multiple adapters won't block each other due to the serialization of the queued work. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
058b8647 |
|
18-Aug-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Replace fsf_req wait_queue with completion The combination wait_queue/wakeup in conjunction with the flag ZFCP_STATUS_FSFREQ_COMPLETED to signal the completion of an fsfreq was not race-safe and can be better solved by a completion. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
55c770fa |
|
18-Aug-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Implicitly close all wka ports An adapter shutdown implicitly closes all open ports. Make sure to mark all WKA ports as offline, not only the directory server. Also make sure that no pending wka port work is running when the adapter is being removed. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
#
17a093ef |
|
13-Jul-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: avoid double notify in lowmem scenario In a LOWMEM condition an ERP notification would have been sent twice causing an unpredictable behaviour of the ERP. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
85600f7f |
|
13-Jul-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fix erp escalation procedure If an action fails, retry it until the erp count exceeds the threshold. If there is something fundamentally wrong, the FSF layer will trigger a more appropriate action depending on the FSF status codes. The followup for successful actions is a different followup than retrying failed actions, so split the code two functions to make this clear. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
ddb3e0c1 |
|
13-Jul-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fix logic for physical port close After closing the port, we want it to be "not open" to consider the action to be successful. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
688a1820 |
|
13-Jul-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Use correct flags for zfcp_erp_notify zfcp_erp_notify uses the ZFCP_ERP_STATUS_* flags, so it is ZFCP_STATUS_ERP_LOWMEM instead of ZFCP_ERP_NOMEM. Signalling ZFCP_ERP_FAILED is not necessary, the missing d_id will show that the nameserver did not return the d_id. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
9d544f2b |
|
06-Apr-2009 |
Sven Schuetz <sven@linux.vnet.ibm.com> |
[SCSI] zfcp: Add FC pass-through support Provide the ability to do fibre channel requests from the userspace to our zfcp driver. Patch builds upon extension to the fibre channel tranport class by James Smart and Seokmann Ju. See here http://marc.info/?l=linux-scsi&m=123808882309133&w=2 Signed-off-by: Sven Schuetz <sven@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
ea460a81 |
|
15-May-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Changed D_ID left port disabled If the destination ID (D_ID) of a remote storage port changed, e.g. re-plugged cable on the switch in a different switch port, the port was never (re-)attached within Linux. This patch fixes the broken mapping between the WWPN and the D_ID. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
dceab655 |
|
15-May-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Add comments to switch/case fallthroughs Add comments where there is a deliberate fall through in switch/case statements. This makes some code checkers happy and makes it clear that there is no missing break statement. Reviewed-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
94ab4b38 |
|
17-Apr-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: avoid false ERP complete due to sema race The ERP thread is performing a task before it is executing the corresponding down on the semaphore. The response handler of the just started exchange config should wait for the completion by performing a down on this semaphore. Since this semaphore is still positive from the ERP enqueue the handler won't wait and therefore the exchange config will always fail leaving the adapter in error. The problem can be solved by performing the down on the semaphore before starting an ERP task. This is the logically correct order. Only walk the ERP loop if there is a task to perform. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
828bc121 |
|
17-Apr-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Set WKA-port to offline on adapter deactivation The nameserver port might be in state online when the adapter is offlined. On adapter reactivation the nameserver port is not re-opened due to the PORT_ONLINE status. This results in an unsuccessful recovery. In forcing the nameserver port status to offline on all adapter offline events this issue is prevented. Waiting for the reference count to drop to zero in zfcp_wka_port_offline is not required, so remove it. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
92d5193b |
|
17-Apr-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Dont block zfcp_wq with scan When running the scsi_scan from the zfcp workqueue and the target device does not respond, the zfcp workqueue can block until the scsi_scan hits a timeout. Move the work to the scsi host workqueue, since this one is also used for the scan from the SCSI midlayer. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
947a9aca |
|
02-Mar-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: fix queue, scheduled work processing. Ensure the refcounting is correct even if we were not able to schedule a work. In addition we have to make sure no scheduled work is pending while we're dequeing the adapter from the systems environment. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
a2fa0aed |
|
02-Mar-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Block FC transport rports early on errors Use the I/O blocking mechanism in the FC transport class to allow faster failovers for multipathing: - Call fc_remote_port_delete early to set the rport to BLOCKED. - Check the rport status in queuecommand with fc_remote_portchkready to no longer accept new I/O for this port and fail the I/O with the appropriate scsi_cmnd result. - Implement the terminate_rport_io handler to abort all pending I/O requests - Return SCSI commands with DID_TRANSPORT_DISRUPTED while erp is running. - When updating the remote port status, check for late changes and update the remote ports status accordingly. Acked-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
5ffd51a5 |
|
02-Mar-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: replace current ERP logging with a more convenient version The current number based id ERP logging is replaced by a string based tag version. The benefit is an easier location of the code in question and the removal of the lengthy array referencing the individual messages. The string (7 bytes) based version does not use more space since those bytes were "used" anyway due to the alignment of the structure. The encoding of the 7 byte string is as follows [0-1] = filename [2-5] = task/function [6] = section Due to the character of this string (fixed length) a string termination is not required here. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
cf13c082 |
|
02-Mar-2009 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: prevent adapter close on initial adapter open An adapter close was always performed whether it was required, (e.g. in an error scenario) or not (e.g. initial open). This patch is changing the process in only doing an adapter close when it is required. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
86f8a1b4 |
|
02-Mar-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Remove UNIT_REGISTERED status flag Use the device pointer in zfcp_unit for tracking if we have a registered SCSI device. With this approach, the flag ZFCP_STATUS_UNIT_REGISTERED is only redundant and can be removed. Acked-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
a5b11dda |
|
02-Mar-2009 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Remove some port flags PORT_PHYS_CLOSING is only set and cleared, but not actually used for status checking. PORT_INVALID_WWPN is set when the GID_PN request does not return a d_id for a remote port, e.g. when a remote port has been unplugged. For this case, the d_id is zero. In the erp we can check the d_id and use the normal escalation procedure that gives up after three retries and remove the special case. PORT_NO_WWPN is unused: Each port in the remote port list has a valid wwpn. The WKA ports are now tracked outside the port list. Remove the PORT_NO_WWPN flag, since this is no longer set for any port. Acked-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
b98478d7 |
|
19-Dec-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: remove DID_DID flag The port flag DID_DID indicates whether we know the current id of the port. This is always set in parallel. Since the id 0 is invalid (because the port id 0 is invalid) we can remove the DID_DID flag: d_id of 0 indicates an invalid d_id != 0 is a valid one. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Acked-by: Felix Beck <felix@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
06499fac |
|
19-Dec-2008 |
Heiko Carstens <hca@linux.ibm.com> |
[SCSI] zfcp: fix compile warning Get rid of this one: drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_thread': drivers/s390/scsi/zfcp_erp.c:1400: warning: ignoring return value of 'down_interruptible', declared with attribute warn_unused_result zfcp_erp_thread is a kernel thread which can't receive any signals. So introduce a dummy variable and get rid of the warning. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
ecf39d42 |
|
25-Dec-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[S390] convert zfcp printks to pr_xxx macros. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
bd43a42b7 |
|
25-Dec-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[S390] zfcp: Report microcode level through service level interface Register zfcp with the new /proc/service_level interface to report the FCP microcode level. When the adapter goes offline or a channel path disappears, zfcp unregisters, since the microcode version might change and zfcp does not know about it. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
fca55b6f |
|
26-Nov-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP Waiting for the ERP to be finished in a task running in the global kernel work-queue is a bad idea, especially if the ERP needs to run another job in this work-queue before it can finish. -> deadlock. This patch removes the necessity to wait for a finished ERP from the scan task and moves the job scheduling to the end of the ERP. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
26871c97 |
|
26-Nov-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: verify for correct rport state before scanning for SCSI devs Prevent a SCSI target scan for a rport which have turned invalid in the meantime. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
7ea633ff |
|
04-Nov-2008 |
Martin Petermann <martin.petermann@de.ibm.com> |
[SCSI] zfcp: fix erp timeout cleanup for port open requests If an open port fsf request times out (in erp) the corresponding erp_action member of the fsf request need to set to NULL. If the port structure will be removed later-on there will be still a reference in the fsf request to the non existing erp_action otherwise. Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
b9d3aed7 |
|
10-Oct-2008 |
Cornelia Huck <cornelia.huck@de.ibm.com> |
[S390] more bus_id -> dev_name conversions Some further bus_id -> dev_name() conversions in s390 code. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
41bfcf90 |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: fix double dbf id usage Trace ids 107 and 3 are used twice, fix this to have unique ids for the erp triggers. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
091694a5 |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev Due to the character of a scheduled work we cannot guarantee the LUN register to be finished before an initial device tries to use it. Therefor we have to wait for PENDING_SCSI_WORK flag to be cleared before proceeding. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
9fb3cd86 |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: fix erp list usage without using locks The zfcp_erp_thread was using the nolock version of the dbf function. This resulted in a list access while other tasks could modifying the list. The symptom was an erp thread running at 100% CPU and never returning from the dbf function. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
e4e9ba5d |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport In case of an adapter reopen all rports have to be deleted from the environment. This should only happen for already registered rports otherwise fc_remote_port_delete is called with a NULL pointer. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
b7f15f3c |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: fix deadlock caused by shared work queue tasks Each adapter reopen trigger automatically a scan_port task which is waiting for the ERP to be finished before further processing. Since the initial device setup enqueues adapter, port and LUN which are individual ERP actions, this process would start after everything is done. Unfortunately the port_reopen requires another scheduled work to be finished which is queued after the automatic scan_port -> deadlock ! This fix creates an own work queue for ERP based nameserver requests. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
0406289e |
|
30-Sep-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Simplify zfcp data structures Reduce the size of zfcp data structures by removing unused and redundant members. scsi_lun is only the mangled version of the fcp_lun. So, remove the redundant field and use the fcp_lun instead. Since the queue lock and the pci_batch indicator are only used in the request queue, move them from the common queue struct to the adapter struct. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
7ba58c9c |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: remove all typedefs and replace them with standards Remove typedefs from zfcp, use already existing types instead. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
5ab944f9 |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: attach and release SAN nameserver port on demand Changing the zfcp behaviour from always having the nameserver port open to an on-demand strategy. This strategy reduces the use of limited resources like port connections. The patch provides a common infrastructure which could be used for all WKA ports in future. Also reduce the number of nameserver lookups by changing the zfcp behaviour of always querying the nameserver for the corresponding destination ID of the remote port. If the destination ID has changed during the reopen process we will be informed and then trigger a nameserver query on demand. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
44cc76f2 |
|
30-Sep-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: remove unused references, declarations and flags - Remove unused references and declarations, including one instance of the FC ls_adisc struct that has been defined twice. - Also remove the flags COMMON_OPENING, COMMON_CLOSING, ADAPTER_REGISTERED and XPORT_OK that are only set and cleared, but not checked anywhere. - Remove the zfcp specific atomic_test_mask makro. Simply use atomic_read directly instead. - Remove the zfcp internal sg helper functions and switch the places where it is still used to call sg_virt directly. - With the update of the QDIO code, the QDIO data structures no longer use the volatile type qualifier. Now we can also remove the volatile qualifiers from the zfcp code. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
ff3b24fa |
|
30-Sep-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Update message with input from review Update the kernel messages in zfcp with input from the message review and remove some messages that have been identified as redundant. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
287ac01a |
|
02-Jul-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Cleanup code in zfcp_erp.c Cleanup the code in zfcp_erp.c, move erp internal definititions to this file and move FSF timeout handling to the FSF layer. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
317e6b65 |
|
02-Jul-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Cleanup of code in zfcp_aux.c Overall cleanup of zfcp_aux.c to simplify code and follow kernel coding style. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
cc8c2829 |
|
10-Jun-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Automatically attach remote ports Automatically attach the remote ports in zfcp when the adapter is set online. This is done by querying all available ports from the FC namesever. The scan for remote ports is also triggered by RSCNs and can be triggered manually with the sysfs attribute 'port_rescan'. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
553448f6 |
|
10-Jun-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Message cleanup Cleanup the messages used in the zfcp driver: Remove unnecessary debug and trace message and convert the remaining messages to standard kernel macros. Remove the zfcp message macros and while updating the whole flie also update the copyright headers. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
00bab910 |
|
10-Jun-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Cleanup qdio code Cleanup the interface code from zfcp to qdio. Also move code that belongs to the qdio interface from the erp to the qdio file. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
24073b47 |
|
10-Jun-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Move FC code to new file Move all Fibre Channel related code to new file and cleanup the code while doing so. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
aa0fec62 |
|
18-May-2008 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fix sparse warning by providing new entry in dbf drivers/s390/scsi/zfcp_dbf.c:692:2: warning: context imbalance in 'zfcp_rec_dbf_event_thread' - different lock contexts for basic block Replace the parameter indicating if the lock is held with a new entry function that only acquires the lock. This makes the lock handling more visible and removes the sparse warning. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
d26ab06e |
|
18-May-2008 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall Processing of an unsolicted status request can lead to a locking race of the request_queue's queue_lock during the recreation of the used up status read request while still in interrupt context of the response handler. Detaching the 'refill' of the long running status read requests from the handler to a scheduled work is solving this issue. In addition, each refill-run is trying to re-establish the full amount of status read requests, which might have failed in earlier runs. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
1f6f7129 |
|
17-Apr-2008 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: fix 31 bit compile warnings drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_rscn’: drivers/s390/scsi/zfcp_aux.c:1379: warning: cast from pointer to integer of different size drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_plogi’: drivers/s390/scsi/zfcp_aux.c:1432: warning: cast from pointer to integer of different size drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_logo’: drivers/s390/scsi/zfcp_aux.c:1457: warning: cast from pointer to integer of different size .. Just passing pointers rids us of these warnings and improves readability. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
507e4969 |
|
27-Mar-2008 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: Remove obsolete erp_dbf trace This patch removes the now obsolete erp_dbf trace. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
6f4f365e |
|
27-Mar-2008 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: Add trace records for recovery actions. This patch writes trace records for various phases of a recovery action: action being created, action being processed, action continueing asynchronously, action gone, action timed out, action dismissed etc. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
9467a9b3 |
|
27-Mar-2008 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: Trace all triggers of error recovery activity This patch allows any recovery event to be traced back to an exact cause, e.g. a particular request identified by an id (address). Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
698ec016 |
|
27-Mar-2008 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: Add traces for state changes. This patch writes a trace record which provides information about state changes for adapters, ports and units, e.g. target failure, targets becoming online, targets being temporarily blocked due to pending recovery, targets which have been recovered successfully etc. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
348447e8 |
|
27-Mar-2008 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: Add trace records for recovery thread and its queues This patch writes trace records which provide information about the operation of the zfcp error recovery thread and the queues it works on. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
5d67d164 |
|
26-Jan-2008 |
Joe Perches <joe@perches.com> |
[S390] drivers/s390/: Spelling fixes Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
3f0ca62a |
|
19-Dec-2007 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Hold queue lock when checking port handle for ELS command We need to hold the queue-lock when checking whether we still have a valid port handle for the ELS command, i.e whether we can issue this request for this port. If the error recovery is about to close this port, then it competes for the queue-lock. If the close request issued by the error recovery wins, then it is guaranteed that this port has been blocked for other requests. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
d1ad09db |
|
19-Dec-2007 |
Heiko Carstens <hca@linux.ibm.com> |
[SCSI] zfcp: fix use after free bug. zfcp_erp_strategy_check_fsfreq() checks if it is safe to access the fsf_req associated with the erp_action that gets passed. To test if it is safe it accesses the fsf_req in order to get its index into the hash list. This is broken since the fsf_req might be freed already and the read index has no meaning. It could lead to memory corruption. Fix this by introducing a new zfcp_reqlist_find_safe() method which just checks if addresses are equal. This is slower, but only gets called in case of error recovery. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
1de1b43b |
|
04-Nov-2007 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fix deadlock when adding invalid LUN When adding an invalid LUN, there is a deadlock between the add via scsi_scan_target and the slave_destroy handler: The handler waits for the scan to complete, but for an invalid unit, scsi_scan_target directly calls the slave_destroy handler. Fix the deadlock by removing the wait in the slave_destroy handler, it was not necessary anyway. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
18edcdbd |
|
04-Nov-2007 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Specify waiting times in ERP in seconds It is not necessary to use jiffies or milliseconds to specify waiting times that last a couple of seconds. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
86e8dfc5 |
|
15-Nov-2007 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: fix cleanup of dismissed error recovery actions Calling zfcp_erp_strategy_check_action() after zfcp_erp_action_to_running() in zfcp_erp_strategy() might cause an unbalanced up() for erp_ready_sem, which makes the zfcp recovery fail somewhere along the way: erp thread processing erp_action: | | someone waking up erp thread for erp_action | | | | someone else dismissing erp_action: | | | V V V write_lock_irqsave(&adapter->erp_lock, flags); ... if (zfcp_erp_action_exists(erp_action) == ZFCP_ERP_ACTION_RUNNING) { zfcp_erp_action_to_ready(erp_action); up(&adapter->erp_ready_sem); /* first up() for erp_action */ } write_unlock_irqrestore(&adapter->erp_lock, flags); write_lock_irqsave(&adapter->erp_lock, flags); ... zfcp_erp_action_to_running(erp_action); write_unlock_restore(&adapter->erp_lock, flags); /* processing erp_action */ write_lock_irqsave(&adapter->erp_lock, flags); ... erp_action->status |= ZFCP_STATUS_ERP_DISMISSED; if (zfcp_erp_action_exists(erp_action) == ZFCP_ERP_ACTION_RUNNING) { zfcp_erp_action_to_ready(erp_action); up(&adapter->erp_ready_sem); /* second, unbalanced up() for erp_action */ } ... write_unlock_restore(&adapter->erp_lock, flags); write_lock_irqsave(&adapter->erp_lock, flags); if (erp_action->status & ZFCP_STATUS_ERP_DISMISSED) { zfcp_erp_action_dequeue(erp_action); retval = ZFCP_ERP_DISMISSED; } ... write_unlock_restore(&adapter->erp_lock, flags); down(&adapter->erp_ready_sem); /* this down() is meant to balance the first up() */ The erp thread must not dismiss an erp_action after moving that action to erp_running_head. Instead it should just go through the down() operation, which balances the first up(), and run through zfcp_erp_strategy one more time for the second up(), which eventually cleans up erp_action. Which is similar to the normal processing of an event for erp_action doing something asynchronously (e.g. waiting for the completion of an fsf_req). This only works if we make sure that a dismissed erp_action is passed to zfcp_erp_strategy() prior to the other action, which caused actions to be dismissed. Therefore the patch implements this rule: running actions go to the head of the ready list; new actions go to the tail of the ready list; the erp thread picks actions to be processed from the ready list's head. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Acked-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
d0076f77 |
|
15-Nov-2007 |
Martin Peschke <mp3@de.ibm.com> |
[SCSI] zfcp: fix dismissal of error recovery actions zfcp_erp_action_dismiss() used to ignore any actions in the ready list. This is a bug. Any action superseded by a stronger action needs to be dismissed. This patch changes zfcp_erp_action_dismiss() so that it dismisses actions regardless of their list affiliation. The ERP thread is able to handle this. It is important to kick the erp thread only for actions in the running list, though, as an imbalance of wakeup signals would confuse the erp thread otherwise. Signed-off-by: Martin Peschke <mp3@de.ibm.com> Acked-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
#
642f14903 |
|
24-Oct-2007 |
Jens Axboe <jens.axboe@oracle.com> |
SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
#
73fc4f0d |
|
23-Oct-2007 |
Jens Axboe <jens.axboe@oracle.com> |
s390 zfcp: sg fixups Based on initial patch from Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
#
41fa2ada |
|
07-Sep-2007 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: whitespace cleanup Cleanup the whitepace from the entire zfcp driver to prevent to have those changes in future feature or function patches. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
52ef11a7 |
|
28-Aug-2007 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: cleanup, separation of ERP, non ERP-version for exchange_ functions cleanup, using ERP request mempool for all ERP versions of the exchange functions (exchange_config (ECD), exchange_port (EPD) ) providing individual versions of the ECD, EPD functions for ERP and other purposes (_sync). Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
cc16ceba |
|
28-Aug-2007 |
Heiko Carstens <hca@linux.ibm.com> |
[SCSI] zfcp: avoid if (whatever) ; constructs. Avoid "if (whatever) ;" constructs since they seem to confuse people, even if there is a comment. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
364c8558 |
|
12-Oct-2007 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] Get rid of a bunch of sparse warnings again. Also removes a bunch of ^L in drivers/s390/cio/cmf.c Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
0d661327 |
|
18-Jul-2007 |
Swen Schillig <swen@vnet.ibm.com> |
[SCSI] zfcp: Replace kmalloc/memset with kzalloc Memory allocated with kmalloc is always initialzed to 0 with memset. Replace the two calls with kzalloc, that already does both steps. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
c7f6b3a3 |
|
29-May-2007 |
Volker Sameske <sameske@de.ibm.com> |
[SCSI] zfcp: clear adapter status flags during adapter shutdown In some cases we did not reset some adapter status flags properly. This patch clears these flags during FCP adapter shutdown. Signed-off-by: Volker Sameske <sameske@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
3b02191a |
|
08-May-2007 |
Heiko Carstens <hca@linux.ibm.com> |
[SCSI] zfcp: clear adapter failed flag if an fsf request times out. Must clear adapter failed flag if an fsf request times out. This is necessary because on link down situations the failed flags gets set but the QDIO queues are still up. Since an adapter reopen will be skipped if the failed flag is set an adapter_reopen that is issued on fsf request timeout has no effect if the local link is down. Might lead to locked up system if the SCSI stack is waiting for abort completion. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
ca2d02c2 |
|
08-May-2007 |
Heiko Carstens <hca@linux.ibm.com> |
[SCSI] zfcp: rework request ID management. Simplify request ID management and make sure that frequently used functions are inlined. Also fix a memory leak in zfcp_adapter_enqueue() which only gets hit in error handling. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
5f852be9 |
|
08-May-2007 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI The SCSI stack requires low level drivers to register and unregister devices. For zfcp this leads to the situation where zfcp calls the SCSI stack, the SCSI tries to scan the new device and the scan SCSI command fails. This would require the zfcp erp, but the erp thread is already blocked in the register call. The fix is to make sure that the calls from the ERP thread to the SCSI stack do not block the ERP thread. In detail: 1) Use a workqueue to avoid blocking of the scsi_scan_target calls. 2) When removing a unit make sure that no scsi_scan_target call is pending. 3) Replace scsi_flush_work with scsi_target_unblock. This avoids blocking and has the same result. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
1d589edf |
|
08-May-2007 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: print S_ID and D_ID with 3 bytes S_ID and D_ID are defined in the FCP spec as 3 byte fields. Change the output in zfcp print statements accordingly to print them with only 3 bytes. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
9e3738f3 |
|
28-Mar-2007 |
Christof Schmitt <christof.schmitt@de.ibm.com> |
[SCSI] zfcp: fix initialization of FSF timer Correctly initialize the timer for FSF requests with jiffies + timeout. Cc: Swen Schillig <swen@vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
98051995 |
|
09-Feb-2007 |
Swen Schillig <swen.schillig@freenet.de> |
[SCSI] zfcp: removed wrong comment commit 07a105136f07f0cf1b476383e43033b8a65e13ff Author: Swen Schillig <swen@vnet.ibm.com> Date: Fri Feb 9 09:58:09 2007 +0100 removed wrong comment Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
ca880cf9 |
|
09-Feb-2007 |
Swen Schillig <swen.schillig@freenet.de> |
[SCSI] zfcp: use of uninitialized variable commit 988d955c3314336d716a9208f3d565b06f262e07 Author: Swen Schillig <swen@vnet.ibm.com> Date: Fri Feb 9 09:40:11 2007 +0100 Use of uninitialized variable. ERP action might not be finished accordingly. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
2b67fc46 |
|
05-Feb-2007 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] Get rid of a lot of sparse warnings. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
6aae8738 |
|
10-Oct-2006 |
Al Viro <viro@ftp.linux.org.uk> |
[PATCH] drivers/s390 misc sparse annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
81654286 |
|
18-Sep-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix: avoid removal of fsf reqs before qdio queues are down Fix the fix ... One of my previous fixes introduced removal of all fsf requests in zfcp's eh_host_reset_handler. But this must not happen before qdio queues are shut down. So, I revert the changes of zfcp_scsi_eh_host_reset_handler. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
2abbe866 |
|
18-Sep-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: introduce struct timer_list in struct zfcp_fsf_req This instance will be used whenever a timer is needed for a request by zfcp. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
f6c0e7a7 |
|
02-Aug-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: minor erp bug fixes Bug fixes for zfcp's erp: - trigger adapter reopen if do_QDIO fails - avoid erp deadlock if registration of scsi target or remote port hang - do not treat as error if exchange port data fails - decrease timeout for target reset and aborts - mark unit failed if slave_destroy is called Additionally some code cleanup was done: - made some functions void when retval is not of interest - shortened initialization of zfcp's host_template - corrected some comments Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
fea9d6c7 |
|
02-Aug-2006 |
Volker Sameske <sameske@de.ibm.com> |
[SCSI] zfcp: improve management of request IDs Improve request handling. Use hash table to manage request IDs. Signed-off-by: Volker Sameske <sameske@de.ibm.com> Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
9f09c548 |
|
03-Jul-2006 |
Heiko Carstens <hca@linux.ibm.com> |
[PATCH] zfcp: fix incorrect usage of erp_lock ================================= [ INFO: inconsistent lock state ] --------------------------------- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage. swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (&adapter->erp_lock){+-..}, at: [<000000000026c7f8>] zfcp_erp_async_handler+0x3c/0x70 {hardirq-on-W} state was registered at: [<000000000005f33e>] __lock_acquire+0x30a/0xed0 [<00000000000604ae>] lock_acquire+0x9a/0xc8 [<000000000035a7ae>] _write_lock+0x4e/0x68 [<000000000026d822>] zfcp_erp_adapter_strategy_generic+0x286/0xd94 [<000000000026fd72>] zfcp_erp_strategy_do_action+0x91e/0x1a94 [<0000000000271a3a>] zfcp_erp_thread+0x21a/0x1568 [<0000000000019096>] kernel_thread_starter+0x6/0xc [<0000000000019090>] kernel_thread_starter+0x0/0xc irq event stamp: 12078 hardirqs last enabled at (12077): [<0000000000019416>] cpu_idle+0x206/0x250 hardirqs last disabled at (12078): [<0000000000020458>] io_no_vtime+0xc/0x1c softirqs last enabled at (12072): [<0000000000040b62>] __do_softirq+0x13a/0x180 softirqs last disabled at (12059): [<000000000001fd58>] do_softirq+0xec/0xf0 other info that might help us debug this: no locks held by swapper/0. stack backtrace: 00000000012bb648 0000000000000002 0000000000000000 00000000012bb758 00000000012bb6c0 0000000000399122 0000000000399122 0000000000016b0a 0000000000000000 0000000000000001 0000000000000000 00000000004660e8 0000000000000000 000000000000000d 00000000012bb6b8 00000000012bb730 0000000000368b90 0000000000016b0a 00000000012bb6b8 00000000012bb708 Call Trace: ([<0000000000016a26>] show_trace+0x76/0xdc) [<0000000000016b2c>] show_stack+0xa0/0xd0 [<0000000000016b8a>] dump_stack+0x2e/0x3c [<000000000005e3da>] print_usage_bug+0x27e/0x290 [<000000000005e934>] mark_lock+0x548/0x6c0 [<000000000005fb0c>] __lock_acquire+0xad8/0xed0 [<00000000000604ae>] lock_acquire+0x9a/0xc8 [<000000000035a662>] _write_lock_irqsave+0x62/0x80 [<000000000026c7f8>] zfcp_erp_async_handler+0x3c/0x70 [<0000000000279178>] zfcp_fsf_req_dispatch+0xd8/0x1fa8 [<000000000027e538>] zfcp_fsf_req_complete+0x104/0xe4c [<0000000000274534>] zfcp_qdio_reqid_check+0xf4/0x178 [<000000000027469e>] zfcp_qdio_response_handler+0xe6/0x430 [<0000000000219dd4>] tiqdio_thinint_handler+0xd20/0x213c [<000000000020229a>] do_adapter_IO+0xb2/0xc0 [<0000000000206f32>] do_IRQ+0x136/0x16c [<0000000000020462>] io_no_vtime+0x16/0x1c [<0000000000019432>] cpu_idle+0x222/0x250 ([<0000000000019416>] cpu_idle+0x206/0x250) [<000000000001405a>] rest_init+0x5a/0x68 [<0000000000536998>] start_kernel+0x39c/0x3dc [<0000000000013046>] _stext+0x46/0x1000 Fix incorrect usage of erp_lock. Using the write_lock() variant is wrong, since this might lead to deadlocks. Acked-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
d6e05edc |
|
26-Jun-2006 |
Andreas Mohr <andi@lisas.de> |
spelling fixes acquired (aquired) contiguous (contigious) successful (succesful, succesfull) surprise (suprise) whether (weather) some other misspellings Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
#
338151e0 |
|
22-May-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable If zfcp's port erp fails we now call fc_remote_port_delete. This helps to avoid offlined scsi devices if scsi commands time out due to path failures. When an adapter erp fails we call fc_remote_port_delete for all ports on that adapter. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
75bfc283 |
|
22-May-2006 |
Ralph Wuerthner <rwuerthn@de.ibm.com> |
[SCSI] zfcp: evaluate plogi payload to set maxframe_size, supported_classes of rports Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com> Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
06506d00 |
|
22-May-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: (cleanup) removed superfluous macros, struct members, typedefs Removed some macros, struct members and typedefs which were unused or not necessary. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
ec4081c6 |
|
22-May-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: (cleanup) kmalloc/kzalloc replacement Replace kmalloc/memset by kzalloc or kcalloc. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
ca3271b4 |
|
22-May-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: (cleanup) remove useless comments Removed some useless comments. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
4a9d2d8b |
|
22-May-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: (cleanup) shortened copyright and author information Copyright update, shortened file headers, shortened author information. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
ad58f7db |
|
09-Mar-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix device registration issues The patch fixes following issues: (1) Replace scsi_add_device with scsi_scan_target. (Thus the rport instead of the scsi_host becomes parent of a scsi_target again.) (2) Avoid scsi_device allocation during registration of an remote port. (Would be done during fc_scsi_scan_rport.) (3) Fix queuecommand behaviour when an zfcp unit is blocked. (Call scsi_done with DID_NO_CONNECT instead of returning SCSI_MLQUEUE_DEVICE_BUSY otherwise we might end up waiting for completion in blk_execute_rq for ever.) Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
61c41823 |
|
10-Feb-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix: avoid race between fc_remote_port_add and scsi_add_device Flush workqueue of a scsi host after a remote port for that host is registered at the fc transport class. Otherwise immediate registration of a scsi device on that host is racy. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
2f8f3ed5 |
|
10-Feb-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix adapter erp when link is unplugged Remove endless polling for replug of the local link. Just wait for link up notification. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
e018ba1f |
|
01-Feb-2006 |
Heiko Carstens <hca@linux.ibm.com> |
[PATCH] s390: Remove CVS generated information - Remove all CVS generated information like e.g. revision IDs from drivers/s390 and include/asm-s390 (none present in arch/s390). - Add newline at end of arch/s390/lib/Makefile to avoid diff message. Acked-by: Andreas Herrmann <aherrman@de.ibm.com> Acked-by: Frank Pavlic <pavlic@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
575c9687 |
|
14-Jan-2006 |
Adrian Bunk <bunk@stusta.de> |
spelling: s/appropiate/appropriate/ Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
#
ad757cdf |
|
12-Jan-2006 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: transport class adaptations II Replaced zfcp adapter attributes with fc_host attributes: fc_topology by port_type, physical_wwpn by permanent_port_name. Make use of fc_host attribute supported_speeds. Removed zfcp adapter attribute physical_s_id. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
2448c459 |
|
30-Nov-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix adapter initialization Fixed various problems in opening sequence of adapters which was previously changed with NPIV support: o corrected handling when exchange port data function is not supported, otherwise adapters on z900 cannot be opened anymore o corrected setup of timer for exchange port data if called from error recovery o corrected check of return code of exchange config data Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
c48a29d0 |
|
30-Nov-2005 |
Heiko Carstens <hca@linux.ibm.com> |
[SCSI] zfcp: fix spinlock initialization Move initialization of locks and lists to adapter allocation function. Otherwise we might end up with some uninitialized locks, like e.g. the erp locks which only will be inititialized if an error recovery thread for an adapter will be started. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
13e1e1f0 |
|
19-Sep-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: add additional fc_host attributes this patch adds some fc host attributes and removes its equivalents from the zfcp_adapter structure and zfcp specific sysfs subtree. Furthermore it removes superfluous calls to fc_remort_port_delete when an adapter is set offline because rports will be removed by fc_remove_host anyway. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
aef4a983 |
|
13-Sep-2005 |
Maxim Shchetynin <maxim@de.ibm.com> |
[SCSI] zfcp: provide support for NPIV N_Port ID Virtualization (NPIV) allows a single FCP port to appear as multiple, distinct ports providing separate port identification. NPIV is supported by FC HBAs on System z9. zfcp was adapted to support this new feature. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
3734d24b |
|
13-Sep-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix race conditions when accessing erp_action lists o always use locking when changing erp_action lists, o avoid escalation to ERP_ACTION_REOPEN_PORT_FORCED if erp_action is still in use for ERP_ACTION_REOPEN_PORT Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
3859f6a2 |
|
27-Aug-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[PATCH] zfcp: add rports to enable scsi_add_device to work again This patch fixes a severe problem with 2.6.13-rc7. Due to recent SCSI changes it is not possible to add any LUNs to the zfcp device driver anymore. With registration of remote ports this is fixed. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Acked-by: James Bottomley <jejb@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
d736a27b |
|
13-Jun-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix handling of port boxed and lun boxed fsf states From: Maxim Shchetynin <maxim@de.ibm.com> Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
1db2c9c0 |
|
13-Jun-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix bug during adapter shutdown Fixes a race between zfcp_fsf_req_dismiss_all and zfcp_qdio_reqid_check. During adapter shutdown it occurred that a request was cleaned up twice. First during its normal completion. Second when dismiss_all was called. The fix is to serialize access to fsf request list between zfcp_fsf_req_dismiss_all and zfcp_qdio_reqid_check and delete a fsf request from the list if its completion is triggered. (Additionally a rwlock was replaced by a spinlock and fsf_req_cleanup was eliminated.) Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
64b29a13 |
|
13-Jun-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix: problem in send_els_handler when D_ID assignment changes From: Maxim Shchetynin <maxim@de.ibm.com> Fixes a bug in zfcp_send_els_handler. If D_ID assignments for ports are changing between initiation of one ELS request and its completion the wrong port might be accessed in the completion for that ELS request. Thus a pointer to the port has to be passed for ELS requests to identify the port structure if required. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
22753fa5 |
|
13-Jun-2005 |
Andreas Herrmann <aherrman@de.ibm.com> |
[SCSI] zfcp: fix: allow more time for adapter initialization From: Maxim Shchetynin <maxim@de.ibm.com> Extend the time for adapter initialization: In case of protocol status HOST_CONNECTION_INITIALIZING for the exchange config data command do a first retry in 1 second, then double the sleep time for each following retry until recovery exceeds 2 minutes. The old behaviour of allowing 6 retries with .5 seconds delay between retries was insufficient and qdio queues were shut down too erarly. Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
6f71d9bc |
|
10-Apr-2005 |
James Bottomley <jejb@titanic.il.steeleye.com> |
zfcp: add point-2-point support From: Andreas Herrmann <aherrman@de.ibm.com> This patch mainly introduces support for point-2-point topology. From: Heiko Carstens <heiko.carstens@de.ibm.com> From: Maxim Shchetynin <maxim@de.ibm.com> From: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
91bbfbda |
|
10-Apr-2005 |
James Bottomley <jejb@titanic.il.steeleye.com> |
zfcp: add point-2-point support From: Andreas Herrmann <aherrman@de.ibm.com> This patch mainly introduces support for point-2-point topology. From: Heiko Carstens <heiko.carstens@de.ibm.com> From: Maxim Shchetynin <maxim@de.ibm.com> From: Andreas Herrmann <aherrman@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
#
1da177e4 |
|
16-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|