History log of /linux-master/drivers/net/ethernet/mellanox/mlx5/core/health.c
Revision Date Author Comments
# 91a72ada 26-Dec-2023 Gal Pressman <gal@nvidia.com>

net/mlx5: Remove initial segmentation duplicate definitions

Device definitions belong in mlx5_ifc, remove the duplicates in
mlx5_core.h.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 17aa2d79 25-Jan-2024 Moshe Shemesh <moshe@nvidia.com>

net/mlx5: remove fw reporter dump option for non PF

In case function is not a Physical Function it is not allowed to get FW
core dump, so if tried it will fail the fw health reporter dump option.
Instead of failing, remove the option of fw_fatal health reporter dump
for such function.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# daa6a6eb 24-Jan-2024 Moshe Shemesh <moshe@nvidia.com>

net/mlx5: remove fw_fatal reporter dump option for non PF

In case function is not a Physical Function it is not allowed to collect
crdump, so if tried it will fail the fw_fatal health reporter dump
option. Instead of failing on permission, remove the option of fw_fatal
health reporter dump for such function.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# ac8082a3 16-Jan-2024 Aya Levin <ayal@nvidia.com>

net/mlx5: Fix fw reporter diagnose output

Restore fw reporter diagnose to print the syndrome even if it is zero.
Following the cited commit, in this case (syndrome == 0) command returns no
output at all.

This fix restores command output in case syndrome is cleared:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 0

Fixes: d17f98bf7cc9 ("net/mlx5: devlink health: use retained error fmsg API")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# d17f98bf 18-Oct-2023 Przemek Kitszel <przemyslaw.kitszel@intel.com>

net/mlx5: devlink health: use retained error fmsg API

Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e0cc92fd 03-Aug-2023 Moshe Shemesh <moshe@nvidia.com>

net/mlx5: Add a health error syndrome for pci data poisoned

Add new health error syndrome to indicate that pci data poisoned error
has been received while fetching device ICM data.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# ab943e2e 26-Jun-2023 Gal Pressman <gal@nvidia.com>

net/mlx5: Remove health syndrome enum duplication

Health syndrome enum values were duplicated in mlx5_ifc and health.c,
the correct place for them is mlx5_ifc.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 2bd3b292 05-Jun-2023 Juhee Kang <claudiajkang@gmail.com>

net/mlx5: Add header file for events

Separate the event API defined in the generic mlx5.h header into
a dedicated header. And remove the TODO comment in commit
69c1280b1f3b ("net/mlx5: Device events, Use async events chain").

Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# e71383fb 03-May-2023 Shay Drory <shayd@nvidia.com>

net/mlx5: Light probe local SFs

In case user wants to configure the SFs, for example: to use only vdpa
functionality, he needs to fully probe a SF, configure what he wants,
and afterward reload the SF.

In order to save the time of the reload, local SFs will probe without
any auxiliary sub-device, so that the SFs can be configured prior to
its full probe.

The defaults of the enable_* devlink params of these SFs are set to
false.

Usage example:
Create SF:
$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
$ devlink port function set pci/0000:08:00.0/32768 \
hw_addr 00:00:00:00:00:11 state active

Enable ETH auxiliary device:
$ devlink dev param set auxiliary/mlx5_core.sf.1 \
name enable_eth value true cmode driverinit

Now, in order to fully probe the SF, use devlink reload:
$ devlink dev reload auxiliary/mlx5_core.sf.1

At this point the user have SF devlink instance with auxiliary device
for the Ethernet functionality only.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# b0bc615d 20-Mar-2023 Maher Sanalla <msanalla@nvidia.com>

net/mlx5: Add vnic devlink health reporter to PFs/VFs

Create a vnic devlink health reporter for PFs/VFs interfaces.
The reporter's diagnose callback displays the values of vNIC/vport
transport debug counters of PFs/VFs, as follows:

$ devlink health diagnose pci/0000:08:00.0 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0

Moreover, add documentation on the reporter functionality and the
counters description.

While at it, expose the vNIC counters diagnose function to be used by
the downstream patch, which will reveal the counters for representor
interfaces.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 8ff38e73 13-Mar-2023 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Stop waiting for PCI up if teardown was triggered

If driver teardown is called while PCI is turned off, there is a race
between health recovery and teardown. If health recovery already started
it will wait 60 sec trying to see if PCI gets back and it can recover,
but actually there is no need to wait anymore once teardown was called.

Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break
waiting for PCI up.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 72ed5d56 26-Jan-2023 Jiri Pirko <jiri@nvidia.com>

net/mlx5: Suspend auxiliary devices only in case of PCI device suspend

The original behavior introduced by commit c6acd629eec7 ("net/mlx5e: Add
support for devlink-port in non-representors mode") correctly
re-instantiated uplink devlink port and related netdevice during devlink
reload. However with migration to auxiliary devices, this behaviour
changed.

Restore the original behaviour and tear down auxiliary devices
completely during devlink reload.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 67257cba 05-Jan-2023 Shay Drory <shayd@nvidia.com>

net/mlx5: Remove redundant health work lock

Commit 90e7cb78b815 ("net/mlx5: fix missing mutex_unlock in
mlx5_fw_fatal_reporter_err_work()") introduced another checking of
MLX5_DROP_HEALTH_NEW_WORK. At this point, the first check of
MLX5_DROP_HEALTH_NEW_WORK is redundant and so is the lock that
protects it.

Remove the lock and rename MLX5_DROP_HEALTH_NEW_WORK to reflect these
changes.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 90e7cb78 05-Jan-2023 Yang Yingliang <yangyingliang@huawei.com>

net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work()

Add missing mutex_unlock() before returning from
mlx5_fw_fatal_reporter_err_work().

Fixes: 9078e843efec ("net/mlx5: Avoid recovery in probe flows")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 9078e843 24-Nov-2022 Shay Drory <shayd@nvidia.com>

net/mlx5: Avoid recovery in probe flows

Currently, recovery is done without considering whether the device is
still in probe flow.
This may lead to recovery before device have finished probed
successfully. e.g.: while mlx5_init_one() is running. Recovery flow is
using functionality that is loaded only by mlx5_init_one(), and there
is no point in running recovery without mlx5_init_one() finished
successfully.

Fix it by waiting for probe flow to finish and checking whether the
device is probed before trying to perform recovery.

Fixes: 51d138c2610a ("net/mlx5: Fix health error state handling")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 1330bd98 01-Oct-2022 Maher Sanalla <msanalla@nvidia.com>

net/mlx5: Set default grace period based on function type

Currently, driver sets the same grace period for fw fatal health reporter
to any type of function.

Since the lower level functions are more vulnerable to fw fatal errors as a
result of parent function closure/reload, set a smaller grace period for
the lower level functions, as follows:

1. For ECPF: 180 seconds.
2. For PF: 60 seconds.
3. For VF/SF: 30 seconds.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 9b98d395 01-Oct-2022 Moshe Shemesh <moshe@nvidia.com>

net/mlx5: Start health poll at earlier stage of driver load

Start health poll at earlier stage, so if fw fatal issue occurred before
or during initialization commands such as init_hca or set_hca_cap the
poll health can detect and indicate that the driver is already in error
state.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# fd01b9b5 28-Sep-2022 Colin Ian King <colin.i.king@gmail.com>

net/mlx5: Fix spelling mistake "syndrom" -> "syndrome"

There is a spelling mistake in a devlink_health_report message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 66af4fe3 07-Sep-2022 Gal Pressman <gal@nvidia.com>

net/mlx5: Remove unused functions

Remove functions which are no longer used in the driver:
mlx5e_ipsec_is_tx_flow
mlx5_health_flush
get_cqe_enhanced_num_mini_cqes
get_cqe_l3_hdr_type
mlx5_health_flush
mlx5_fs_is_ipsec_flow
_mlx5_fs_is_outer_ipproto_flow
mlx5_fs_is_outer_tcp_flow
mlx5_fs_is_outer_udp_flow
mlx5_fs_is_vxlan_flow
mlx5_fs_is_outer_ipsec_flow

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# c90005b5 28-Jul-2022 Moshe Shemesh <moshe@nvidia.com>

devlink: Hold the instance lock in health callbacks

Let the core take the devlink instance lock around health callbacks and
remove the now redundant locking in the drivers.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# d3dbdc9f 28-Jul-2022 Moshe Shemesh <moshe@nvidia.com>

net/mlx5: Lock mlx5 devlink health recovery callback

Change devlink instance locks in mlx5 driver to have devlink health
recovery callback locked, while keeping all driver paths which lead to
devl_ API functions called by the driver locked.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 71ab5807 11-Jan-2022 Leon Romanovsky <leon@kernel.org>

net/mlx5: Delete useless module.h include

There is no need in include of module.h in the following files.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# f28a14c1 09-Nov-2021 Aya Levin <ayal@nvidia.com>

net/mlx5: Avoid printing health buffer when firmware is unavailable

Use firmware version field as an indication to health buffer's sanity.
When firmware version is 0xFFFFFFFF, deduce that firmware is unavailable
and avoid printing the health buffer to dmesg as it doesn't provide
debug info.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 502e82b9 07-Nov-2021 Aya Levin <ayal@nvidia.com>

net/mlx5: Fix access to a non-supported register

Validate MRTC register is supported before triggering a delayed work
which accesses it.

Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 924cc463 06-Nov-2021 Gal Pressman <gal@nvidia.com>

net/mlx5: Fix too early queueing of log timestamp work

The log timestamp work should not be queued before the command interface
is initialized, move it to a later stage in the init flow.

Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 5a1023de 13-Oct-2021 Aya Levin <ayal@nvidia.com>

net/mlx5: Add periodic update of host time to firmware

Firmware logs its asserts also to non-volatile memory. In order to
reduce drift between the NIC and the host, the driver sets the host
epoch-time to the firmware every hour.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# b87ef75c 11-Oct-2021 Aya Levin <ayal@nvidia.com>

net/mlx5: Print health buffer by log level

Add log macro which gets log level as a parameter. Use the severity
read from the health buffer and the new log macro to log the health buffer
with severity as log level. Prior to this patch, health buffer was
printed in error log level regardless of its severity. Now the user may
filter dmesg (--level) or change kernel log level to focus on different
severity levels of firmware errors.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# cb464ba5 11-Oct-2021 Aya Levin <ayal@nvidia.com>

net/mlx5: Extend health buffer dump

Enhance health buffer to include:
- assert_var5: expose the 6'th assert variable.
- time: error's time-stamp in seconds (epoch time).
- rfr: Recovery Flow Requiered. When set, indicates that the error
cannot be recovered without flow involving reset.
- severity: error's severity value, ranging from emergency to debug.
Expose them in the health buffer dump (dmesg and devlink fw reporter).

Health buffer in dmesg:
mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR:
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800
mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c
mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492
mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758
mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d
mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0
mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR)
mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9
mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error
mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b
mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 32def412 13-Oct-2021 Amir Tzin <amirtz@nvidia.com>

net/mlx5: Read timeout values from DTOR

Replace hard coded timeouts with values stored by firmware in default
timeouts register (DTOR). Timeouts are read during driver load. If DTOR
is not supported by firmware then fallback to hard coded defaults
instead.

Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 8e792700 01-Aug-2021 Leon Romanovsky <leon@kernel.org>

net/mlx5: Delete impossible dev->state checks

New mlx5_core device structure is allocated through devlink_alloc
with\ kzalloc and that ensures that all fields are equal to zero
and it includes ->state too.

That means that checks of that field in the mlx5_init_one() is
completely redundant, because that function is called only once
in the begging of mlx5_core_dev lifetime.

PCI:
.probe()
-> probe_one()
-> mlx5_init_one()

The recovery flow can't run at that time or before it, because relevant
work initialized later in mlx5_init_once().

Such initialization flow ensures that dev->state can't be
MLX5_DEVICE_STATE_UNINITIALIZED at all, so remove such impossible
checks.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 39c538d6 29-Jul-2021 Cai Huoqing <caihuoqing@baidu.com>

net/mlx5: Fix typo in comments

Fix typo:
*vectores ==> vectors
*realeased ==> released
*erros ==> errors
*namepsace ==> namespace
*trafic ==> traffic
*proccessed ==> processed
*retore ==> restore
*Currenlty ==> Currently
*crated ==> created
*chane ==> change
*cannnot ==> cannot
*usuallly ==> usually
*failes ==> fails
*importent ==> important
*reenabled ==> re-enabled
*alocation ==> allocation
*recived ==> received
*tanslation ==> translation

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 7f331bf0 16-Jun-2021 Aya Levin <ayal@nvidia.com>

net/mlx5: Unload device upon firmware fatal error

When fw_fatal reporter reports an error, the firmware in not responding.
Unload the device to ensure that the driver closes all its resources,
even if recovery is not due (user disabled auto-recovery or reporter is
in grace period). On successful recovery the device is loaded back up.

Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 31a91220a 14-Mar-2021 Colin Ian King <colin.king@canonical.com>

net/mlx5: Fix spelling mistakes in mlx5_core_info message

There are two spelling mistakes in a mlx5_core_info message. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# fe06992b 03-Nov-2020 Leon Romanovsky <leon@kernel.org>

net/mlx5: Check returned value from health recover sequence

MLX5_INTERFACE_STATE_UP is far from being reliable check for success to
recover, because it can be changed any time and health logic doesn't
have any locks to protect from it.

The locks are not needed here because health recover is good to have,
but not must to success, so rely on the returned value from the
mlx5_recover_device() as a marker for success/failure.

Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 51d138c2 22-Nov-2020 Shay Drory <shayd@nvidia.com>

net/mlx5: Fix health error state handling

Currently, when we discover a fatal error, we are queueing a work that
will wait for a lock in order to enter the device to error state.
Meanwhile, FW commands are still being processed, and gets timeouts.
This can block the driver for few minutes before the work will manage
to get the lock and enter to error state.

Setting the device to error state before queueing health work, in order
to avoid FW commands being processed while the work is waiting for the
lock.

Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 38b9f903 07-Oct-2020 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Handle sync reset request event

Once the driver gets sync_reset_request from firmware it prepares for the
coming reset and sends acknowledge.
After getting this event the driver expects device reset, either it will
trigger PCI reset on sync_reset_now event or such PCI reset will be
triggered by another PF of the same device. So it moves to reset
requested mode and if it gets PCI reset triggered by the other PF it
detect the reset and reloads.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b6e0b6be 07-May-2020 Shay Drory <shayd@mellanox.com>

net/mlx5: Fix fatal error handling during device load

Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev->state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.

Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 9c275ee4 07-Apr-2020 Leon Romanovsky <leon@kernel.org>

net/mlx5: Delete not-used cmd header

The structures defined in the cmd header are not used and can be safely
removed from the driver. This patch removes that file and deletes all
relevant includes.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# 8c702a53 30-Mar-2020 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Fix frequent ioread PCI access during recovery

High frequency of PCI ioread calls during recovery flow may cause the
following trace on powerpc:

[ 248.670288] EEH: 2100000 reads ignored for recovering device at
location=Slot1 driver=mlx5_core pci addr=0000:01:00.1
[ 248.670331] EEH: Might be infinite loop in mlx5_core driver
[ 248.670361] CPU: 2 PID: 35247 Comm: kworker/u192:11 Kdump: loaded
Tainted: G OE ------------ 4.14.0-115.14.1.el7a.ppc64le #1
[ 248.670425] Workqueue: mlx5_health0000:01:00.1 health_recover_work
[mlx5_core]
[ 248.670471] Call Trace:
[ 248.670492] [c00020391c11b960] [c000000000c217ac] dump_stack+0xb0/0xf4
(unreliable)
[ 248.670548] [c00020391c11b9a0] [c000000000045818]
eeh_check_failure+0x5c8/0x630
[ 248.670631] [c00020391c11ba50] [c00000000068fce4]
ioread32be+0x114/0x1c0
[ 248.670692] [c00020391c11bac0] [c00800000dd8b400]
mlx5_error_sw_reset+0x160/0x510 [mlx5_core]
[ 248.670752] [c00020391c11bb60] [c00800000dd75824]
mlx5_disable_device+0x34/0x1d0 [mlx5_core]
[ 248.670822] [c00020391c11bbe0] [c00800000dd8affc]
health_recover_work+0x11c/0x3c0 [mlx5_core]
[ 248.670891] [c00020391c11bc80] [c000000000164fcc]
process_one_work+0x1bc/0x5f0
[ 248.670955] [c00020391c11bd20] [c000000000167f8c]
worker_thread+0xac/0x6b0
[ 248.671015] [c00020391c11bdc0] [c000000000171618] kthread+0x168/0x1b0
[ 248.671067] [c00020391c11be30] [c00000000000b65c]
ret_from_kernel_thread+0x5c/0x80

Reduce the PCI ioread frequency during recovery by using msleep()
instead of cond_resched()

Fixes: 3e5b72ac2f29 ("net/mlx5: Issue SW reset on FW assert")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# ba7d16c7 29-Mar-2020 Eran Ben Elisha <eranbe@mellanox.com>

devlink: Implicitly set auto recover flag when registering health reporter

When health reporter is registered to devlink, devlink will implicitly set
auto recover if and only if the reporter has a recover method. No reason
to explicitly get the auto recover flag from the driver.

Remove this flag from all drivers that called
devlink_health_reporter_create.

All existing health reporters set auto recovery to true if they have a
recover method.

Yet, administrator can unset auto recover via netlink command as prior to
this patch.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d4e82cf4 12-Nov-2019 Aya Levin <ayal@mellanox.com>

net/mlx5: Dump of fw_fatal use updated devlink binary interface

Remove redundant code from fw_fatal reporter's dump callback. Use
updated devlink interface of binary fmsg pair which breaks the output
into chunks internally.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c7ed6d01 27-Sep-2019 Navid Emamdoost <navid.emamdoost@gmail.com>

net/mlx5: fix memory leak in mlx5_fw_fatal_reporter_dump

In mlx5_fw_fatal_reporter_dump if mlx5_crdump_collect fails the
allocated memory for cr_data must be released otherwise there will be
memory leak. To fix this, this commit changes the return instruction
into goto error handling.

Fixes: 9b1f29823605 ("net/mlx5: Add support for FW fatal reporter dump")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# e7a98105 10-Oct-2019 Jiri Pirko <jiri@mellanox.com>

devlink: propagate extack down to health reporter ops

During health reporter operations, driver might want to fill-up
the extack message, so propagate extack down to the health reporter ops.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a6633e11 12-Aug-2019 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Fix delay in fw fatal report handling due to fw report

When fw fatal error occurs, poll health() first detects and reports on a
fw error. Afterwards, it detects and reports on the fw fatal error
itself.

That can cause a long delay in fw fatal error handling which waits in a
queue for the fw error handling to be finished. The fw error handle will
try asking for fw core dump command while fw in fatal state may not
respond and driver will wait for command timeout.

Changing the flow to detect and handle first fw fatal errors and only if
no fatal error detected look for a fw error to handle.

Fixes: d1bf0e2cc4a6 ("net/mlx5: Report devlink health on FW issues")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 8465df40 14-Jul-2019 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Fix crdump chunks print

Crdump repeats itself every chunk of 256bytes.
That is due to bug of missing progressing offset while copying the data
from buffer to devlink_fmsg.

Fixes: 9b1f29823605 ("net/mlx5: Add support for FW fatal reporter dump")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 7369c10f 17-Jul-2019 Chuhong Yuan <hslester96@gmail.com>

net/mlx5: Replace kfree with kvfree

Variable allocated by kvmalloc should not be freed by kfree.
Because it may be allocated by vmalloc.
So replace kfree with kvfree here.

Fixes: 9b1f298236057 ("net/mlx5: Add support for FW fatal reporter dump")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b3bd076f 27-Jan-2019 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Report devlink health on FW fatal issues

Report devlink health on FW fatal issues via fw_fatal_reporter. The
driver recover flow for FW fatal error is now being handled by the
devlink health.

Having the recovery controlled by devlink health, the user has the
ability to cancel the auto-recovery for debug session and run it
manually.

Call mlx5_enter_error_state() before calling devlink_health_report() to
ensure entering device error state even if auto-recovery is off.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 9b1f2982 16-Jan-2019 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Add support for FW fatal reporter dump

Add support of dump callback for mlx5 FW fatal reporter.
The FW fatal dump uses cr-dump functionality to gather cr-space data for
debug. The cr-dump uses vsc interface which is valid even if the FW
command interface is not functional, which is the case in most FW fatal
errors.

Command example and output:
$ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
crdump_data:
00 20 00 01 00 00 00 00 03 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 ba 82 00 00
0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20
00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa 00
a4 0e 00 00 00 00 00 00 80 c7 fe ff 50 0a 00 00
...
...

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 96c82cdf 11-Dec-2018 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Add fw fatal devlink_health_reporter

Create mlx5_devlink_health_reporter for fw fatal reporter.
The fw fatal reporter is added in addition to the fw reporter and
implements the recover callback.
The point of having two reporters for FW issues, is that we
don't want to run FW recover on any issue, but only fatal ones.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# d1bf0e2c 11-Dec-2018 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Report devlink health on FW issues

Use devlink_health_report() to report any symptom of FW issue as FW
counter miss or new health syndrome.
The FW issues detected in mlx5 during poll_health which is called in
timer atomic context and so health work queue is used to schedule the
reports.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# fd1483fe 11-Dec-2018 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Add support for FW reporter dump

Add support of dump callback for mlx5 FW reporter. Once we trigger FW
dump, the FW will write the core dump to its raw data buffer. The tracer
translates the raw data to traces and save it to a cyclic array. Once
dump is done, the saved traces data is filled into the dump buffer. In
case syndrome is not zero the health buffer content will be printed as
well.

FW dump example:
$ devlink health dump show pci/0000:82:00.0 reporter fw
dump fw traces:
timestamp: 509006640427 lost: false event_id: 185 msg: dump general
info GVMI=0x0000
timestamp: 509006645474 lost: false event_id: 185 msg: GVMI
management info, gvmi_management context:
timestamp: 509006654463 lost: false event_id: 185 msg: [000]:
00000000 00000000 00000000 00000000
timestamp: 509006656127 lost: false event_id: 185 msg: [010]:
00000000 00000000 00000000 00000000
timestamp: 509006656255 lost: false event_id: 185 msg: [020]:
00000000 00000000 00000000 00000000
timestamp: 509006656511 lost: false event_id: 185 msg: [030]:
00000000 00000000 00000000 00000000
timestamp: 509006656639 lost: false event_id: 185 msg: [040]:
00000000 00000000 00000000 00000000
timestamp: 509006656895 lost: false event_id: 185 msg: [050]:
00000000 00000000 00000000 00000000
timestamp: 509006657023 lost: false event_id: 185 msg: [060]:
00000000 00000000 00000000 00000000
timestamp: 509006657180 lost: false event_id: 185 msg: [070]:
00000000 00000000 00000000 00000000
timestamp: 509006659839 lost: false event_id: 185 msg: CMDIF dbase
from IRON: active_dbase_slots = 0x00000000
timestamp: 509006667391 lost: false event_id: 185 msg: GVMI=0x0000
hw_toc context:
timestamp: 509006667647 lost: false event_id: 185 msg: [000]:
00000000 00000000 00000000 fffff000
timestamp: 509006667775 lost: false event_id: 185 msg: [010]:
00000000 00000000 00000000 80d00000
...
...

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 1e34f3ef 11-Dec-2018 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Create FW devlink_health_reporter

Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
implements devlink_health_reporter diagnose callback.

The fw reporter diagnose command can be triggered any time by the user
to check current fw status.
In healthy status, it will return clear syndrome. Otherwise it will
return the syndrome and description of the error type.

Command example and output on healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 0

Command example and output on non healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 8 Description: unrecoverable hardware error

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 3e5b72ac 12-Nov-2018 Feras Daoud <ferasda@mellanox.com>

net/mlx5: Issue SW reset on FW assert

If a FW assert is considered fatal, indicated by a new bit in the health
buffer, reset the FW. After the reset go through the normal recovery
flow. Only one PF needs to issue the reset, so an attempt is made to
prevent the 2nd function from also issuing the reset.
It's not an error if that happens, it just slows recovery.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 63cbc552 12-Nov-2018 Feras Daoud <ferasda@mellanox.com>

net/mlx5: Handle SW reset of FW in error flow

New mlx5 adapters allow the driver to reset the FW in the event of an
error, this action called "SW Reset". When an SW reset is issued on any
PF all PFs enter reset state which is a recoverable condition. The
existing recovery flow was designed to allow the recovery of a VF after
a PF driver reload. This patch adds the sw reset to the NIC states
as a preparation for sw reset handling.

When a software reset is issued the following occurs:
1. The NIC interface mode is set to 7 while the reset is in progress.
2. Once the reset completes the NIC interface mode is set to 1.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 27b942fb 29-Apr-2019 Parav Pandit <parav@mellanox.com>

net/mlx5: Get rid of storing copy of device name

Currently mlx5 core stores copy of the PCI device name in a
mlx5_priv structure and uses pr_warn, pr_err helpers.

Get rid of the copy of this name; instead store the parent device
pointer that contains name as well as dma specific parameters.
This also allows to use kernel's well defined dev_warn, dev_err, dev_dbg
device specific print routines.

This is also a preparation patch to access non PCI parent device in
future.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 98a8e6fc 29-Mar-2019 Huy Nguyen <huyn@mellanox.com>

net/mlx5: Replace dev_err/warn/info by mlx5_core_err/warn/info

Replace pci dev_err/warn/info messages with mlx5_core_err/warn/info
messages to provide a better report/debug of different mlx5 device types.

This patch does not change any functionality.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# b09989a2 29-Mar-2019 Huy Nguyen <huyn@mellanox.com>

net/mlx5: Use dev->priv.name instead of dev_name

Use mlx5_core mdev private name in message instead of using pci dev_name
to provide a better report/debug of different mlx5 device types.

This patch does not change any functionality.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 52c368dc 29-Mar-2019 Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Move health and page alloc init to mdev_init

Software structure initialization should be in mdev_init stage.

This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 4cab346b 07-Feb-2019 Huy Nguyen <huyn@mellanox.com>

net/mlx5: No command allowed when command interface is not ready

When EEH is injected and PCI bus stalls, mlx5's pci error detect
function is called to deactivate the command interface and tear down
the device. The issue is that there can be a thread that already
passed MLX5_DEVICE_STATE_INTERNAL_ERROR check, it will send the command
and stuck in the wait_func.

Solution:
Add function mlx5_cmd_flush to disable command interface and clear all
the pending commands. When device state is set to
MLX5_DEVICE_STATE_INTERNAL_ERROR, call mlx5_cmd_flush to ensure all
pending threads waiting for firmware commands completion are terminated.

Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 02039fb6 26-Nov-2018 Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Remove unused events callback and logic

The mlx5_interface->event callback is not used by mlx5e/mlx5_ib anymore.

We totally remove the delayed events logic work around, since with
the dynamic notifier registration API it is not needed anymore, mlx5_ib
can register its notifier and start receiving events exactly at the moment
it is ready to handle them.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 71edc69c 20-Nov-2018 Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: CmdIF, Use async events chain

Remove the explicit call to mlx5_cmd_comp_handler on MLX5_EVENT_TYPE_CMD
and let command interface to register its own handler when its ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 16d76083 19-Nov-2018 Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: EQ, Different EQ types

In mlx5 we have three types of usages for EQs,
1. Asynchronous EQs, used internally by mlx5 core for
a. FW command completions
b. FW page requests
c. one EQ for all other Asynchronous events

2. Completion EQs, used for CQ completion (we create one per core)

3. *Special type of EQ (page fault) used for RDMA on demand paging
(ODP).

*The 3rd type shouldn't be special at least in mlx5 core, it is yet
another async events EQ with specific use case, it will be removed in
the next two patches, and will completely move its logic to mlx5_ib,
as it is rdma specific.

In this patch we remove use case (eq type) specific fields from
struct mlx5_eq into a new eq type specific structures.

struct mlx5_eq_async;
truct mlx5_eq_comp;
struct mlx5_eq_pagefault;

Separate between their type specific flows.

In the future we will allow users to create there own generic EQs.
for now we will allow only one for ODP in next patches.

We will introduce event listeners registration API for those who
want to receive mlx5 async events.
After that mlx5 eq handling will be clean from feature/user specific
handling.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# f2f3df55 19-Nov-2018 Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: EQ, Privatize eq_table and friends

Move unnecessary EQ table structures and declaration from the
public include/linux/mlx5/driver.h into the private area of mlx5_core
and into eq.c/eq.h.

Introduce new mlx5 EQ APIs:

mlx5_comp_vectors_count(dev);
mlx5_comp_irq_get_affinity_mask(dev, vector);

And use them from mlx5_ib or mlx5e netdevice instead of direct access to
mlx5_core internal structures.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# fcd29ad1 09-Aug-2018 Feras Daoud <ferasda@mellanox.com>

net/mlx5: Add Fast teardown support

Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown

This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.

Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 76d5581c 05-Aug-2018 Jack Morgenstein <jackm@dev.mellanox.co.il>

net/mlx5: Fix use-after-free in self-healing flow

When the mlx5 health mechanism detects a problem while the driver
is in the middle of init_one or remove_one, the driver needs to prevent
the health mechanism from scheduling future work; if future work
is scheduled, there is a problem with use-after-free: the system WQ
tries to run the work item (which has been freed) at the scheduled
future time.

Prevent this by disabling work item scheduling in the health mechanism
when the driver is in the middle of init_one() or remove_one().

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# aba46213 25-Feb-2018 Daniel Jurgens <danielj@mellanox.com>

{net, IB}/mlx5: Raise fatal IB event when sys error occurs

All other mlx5_events report the port number as 1 based, which is how FW
reports it in the port event EQE. Reporting 0 for this event causes
mlx5_ib to not raise a fatal event notification to registered clients
due to a seemingly invalid port.

All switch cases in mlx5_ib_event that go through the port check are
supposed to set the port now, so just do it once at variable
declaration.

Fixes: 89d44f0a6c73("net/mlx5_core: Add pci error handlers to mlx5_core driver")
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 777ec2b2 16-Nov-2017 Eugenia Emantayev <eugenia@mellanox.com>

net/mlx5: Fix misspelling in the error message and comment

Fix misspelling in word syndrome.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 0365b047 26-Oct-2017 Kees Cook <keescook@chromium.org>

drivers/net: mellanox: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Matan Barak <matanb@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6377ed0b 19-Oct-2017 Moshe Shemesh <moshe@mellanox.com>

net/mlx5: Fix health work queue spin lock to IRQ safe

spin_lock/unlock of health->wq_lock should be IRQ safe.
It was changed to spin_lock_irqsave since adding commit 0179720d6be2
("net/mlx5: Introduce trigger_health_work function") which uses
spin_lock from asynchronous event (IRQ) context.
Thus, all spin_lock/unlock of health->wq_lock should have been moved
to IRQ safe mode.
However, one occurrence on new code using this lock missed that
change, resulting in possible deadlock:
kernel: Possible unsafe locking scenario:
kernel: CPU0
kernel: ----
kernel: lock(&(&health->wq_lock)->rlock);
kernel: <Interrupt>
kernel: lock(&(&health->wq_lock)->rlock);
kernel: #012 *** DEADLOCK ***

Fixes: 2a0165a034ac ("net/mlx5: Cancel delayed recovery work when unloading the driver")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 590deff6 21-Sep-2017 Allen Pais <allen.lkml@gmail.com>

drivers: net: mlx5: use setup_timer() helper.

Use setup_timer function instead of initializing timer with the
function and data fields.

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 78249c42 13-Jul-2017 Sagi Grimberg <sagi@grimberg.me>

mlx5: convert to generic pci_alloc_irq_vectors

Now that we have a generic code to allocate an array
of irq vectors and even correctly spread their affinity,
correctly handle cpu hotplug events and more, were much
better off using it.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 2a0165a0 30-Mar-2017 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: Cancel delayed recovery work when unloading the driver

Draining the health workqueue will ignore future health works including
the one that report hardware failure and thus we can't enter error state
Instead cancel the recovery flow and make sure only recovery flow won't
be scheduled.

Fixes: 5e44fca50470 ('net/mlx5: Only cancel recovery work when cleaning up device')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 8812c24d 09-Feb-2017 Majd Dibbiny <majd@mellanox.com>

net/mlx5: Add fast unload support in shutdown flow

Adding a support to flush all HW resources with one FW command and
skip all the heavy unload flows of the driver on kernel shutdown.
There's no need to free all the SW context since a new fresh kernel
will be loaded afterwards.

Regarding the FW resources, they should be closed, otherwise we will
have leakage in the FW. To accelerate this flow, we execute one command
in the beginning that tells the FW that the driver isn't going to close
any of the FW resources and asks the FW to clean up everything.
Once the commands complete, it's safe to close the PCI resources and
finish the routine.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 3fece5d6 09-Apr-2017 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: Continue health polling until it is explicitly stopped

The issue is that when we get an assert we will stop polling the health
and thus we cant enter error state when we have a real health issue.

Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 73dd3a48 23-Feb-2017 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: Avoid using pending command interface slots

Currently when firmware command gets stuck or it takes long time to
complete, the driver command will get timeout and the command slot is
freed and can be used for new commands, and if the firmware receive new
command on the old busy slot its behavior is unexpected and this could
be harmful.
To fix this when the driver command gets timeout we return failure,
but we don't free the command slot and we wait for the firmware to
explicitly respond to that command.
Once all the entries are busy we will stop processing new firmware
commands.

Fixes: 9cba4ebcf374 ('net/mlx5: Fix potential deadlock in command mode change')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 0179720d 07-May-2017 Ilan Tayari <ilant@mellanox.com>

net/mlx5: Introduce trigger_health_work function

Introduce new function for entering bad-health state.

This function will be called from FPGA-related logic in a later patch from
asynchronous event (IRQ) context, for that we change the spin lock to an
IRQ-safe one.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 712bfef6 16-Dec-2016 Eli Cohen <eli@mellanox.com>

net/mlx5: Fix version printout in case of health issue

Firmware representation of the firmware version on the health buffer has
changed for newer device. The representation in the initialization
segment does not and will not change. In addition, we print the health
buffer firmware version as a raw hex number.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 04c0c1ab 25-Oct-2016 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: PCI error recovery health care simulation

In case that the kernel PCI error handlers are not called, we will
trigger our own recovery flow.

The health work will give priority to the kernel pci error handlers to
recover the PCI by waiting for a small period, if the pci error handlers
are not triggered the manual recovery flow will be executed.

We don't save pci state in case of manual recovery because it will ruin the
pci configuration space and we will lose dma sync.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 05ac2c0b 25-Oct-2016 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: Fix race between PCI error handlers and health work

Currently there is a race between the health care work and the kernel
pci error handlers because both of them detect the error, the first one
to be called will do the error handling.
There is a chance that health care will disable the pci after resuming
pci slot.
Also create a separate WQ because now we will have two types of health
works, one for the error detection and one for the recovery.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2241007b 25-Oct-2016 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: Clear health sick bit when starting health poll

The health sick status should be cleared when we start the health poll.
This is crucial for driver reload (unload + load) in order to behave
right in case of health issue.

Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0a91605c 26-Jul-2016 Bhaktipriya Shridhar <bhaktipriya96@gmail.com>

net/mlx5_core/health: Remove deprecated create_singlethread_workqueue

The workqueue health->wq was used as per device private health thread.
This was done to perform delayed work.

The workqueue has a single workitem(&health->work) and
hence doesn't require ordering. It is involved in handling the health of
the device and is not being used on a memory reclaim path.
Hence, the singlethreaded workqueue has been replaced with the use of
system_wq.

Work item has been flushed in mlx5_health_cleanup() to ensure that
there are no pending tasks while disconnecting the driver.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c1d4d2e9 30-Jun-2016 Mohamad Haj Yahia <mohamad@mellanox.com>

net/mlx5: Avoid calling sleeping function by the health poll thread

In internal error state the health poll thread will eventually call
synchronize_irq() (to safely trigger command completions) which might
sleep, so we are calling sleeping function from atomic context which is
invalid.
Here we move trigger_cmd_completions(dev) to enter error state which is
the earliest stage in error state handling.
This way we won't need to wait for next health poll to trigger command
completions and will solve the scheduling while atomic issue.
mlx5_enter_error_state can be called from two contexts, protect it with
dev->intf_state_lock

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c01e0159 19-Apr-2016 Masanari Iida <standby24x7@gmail.com>

treewide: Fix typos in printk

This patch fix spelling typos in printk from various part
of the codes.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


# 89d44f0a 14-Oct-2015 Majd Dibbiny <majd@mellanox.com>

net/mlx5_core: Add pci error handlers to mlx5_core driver

This patch implement the pci_error_handlers for mlx5_core which allow the
driver to recover from PCI error.

Once an error is detected in the PCI, the mlx5_pci_err_detected is called
and it:
1) Marks the device to be in 'Internal Error' state.
2) Dispatches an event to the mlx5_ib to flush all the outstanding cqes
with error.
3) Returns all the on going commands with error.
4) Unloads the driver.

Afterwards, the FW is reset and mlx5_pci_slot_reset is called and it
enables the device and restore it's pci state.

If the later succeeds, mlx5_pci_resume is called, and it loads the SW
stack.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# fd76ee4d 14-Oct-2015 Eli Cohen <eli@mellanox.com>

net/mlx5_core: Fix internal error detection conditions

The detection of a fatal condition has been updated to take into account
the state reported by the device or by detecting an all ones read of the
firmware version which indicates that the device is not accessible.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ac6ea6e8 08-Oct-2015 Eli Cohen <eli@mellanox.com>

net/mlx5_core: Use private health thread for each device

Use a single threaded work queue for each device in the system instead of
using one thread for any device. This is required so we can concurrently
process system error handling for all the devices that need that.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0144a95e 08-Oct-2015 Eli Cohen <eli@mellanox.com>

net/mlx5_core: Use accessor functions to read from device memory

Use ioread function to read health buffer data. In addition, print the
firmware version as a string for readability and also use dev_err to have
the device string to be printed.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 171bb2c5 25-Sep-2015 Eli Cohen <eli@mellanox.com>

net/mlx5_core: Update health syndromes

Update new health monitored syndromes and their descriptions.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 78ccb258 25-Sep-2015 Eli Cohen <eli@mellanox.com>

net/mlx5_core: Fix wrong name in struct

The name refers to syndrome so uset ext_synd instread of ext_sync.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 302bdf68 02-Apr-2015 Saeed Mahameed <saeedm@mellanox.com>

net/mlx5_core: Fix Mellanox copyright note

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7d46daba 05-Aug-2013 Eli Cohen <eli@dev.mellanox.co.il>

mlx5: remove health handler plugin

Remove this code, per Dave Miller's request, since it is not being used
anywhere in the kernel.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 582c016e 08-Jul-2013 Roland Dreier <roland@purestorage.com>

mlx5_core: Fixes for sparse warnings

- use be32_to_cpu() instead of cpu_to_be32() where appropriate.
- use proper accessors for pointers marked __iomem.

Signed-off-by: Roland Dreier <roland@purestorage.com>


# e126ba97 07-Jul-2013 Eli Cohen <eli@mellanox.com>

mlx5: Add driver for Mellanox Connect-IB adapters

The driver is comprised of two kernel modules: mlx5_ib and mlx5_core.
This partitioning resembles what we have for mlx4, except that mlx5_ib
is the pci device driver and not mlx5_core.

mlx5_core is essentially a library that provides general functionality
that is intended to be used by other Mellanox devices that will be
introduced in the future. mlx5_ib has a similar role as any hardware
device under drivers/infiniband/hw.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>

[ Merge in coccinelle fixes from Fengguang Wu <fengguang.wu@intel.com>.
- Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>