History log of /linux-master/drivers/ntb/test/ntb_perf.c
Revision Date Author Comments
# 45191087 12-Jul-2023 Minjie Du <duminjie@vivo.com>

drivers: ntb: fix parameter check in perf_setup_dbgfs()

Use IS_ERR() to check the return value of debugfs_create_dir()
in perf_setup_dbgfs().

Signed-off-by: Minjie Du <duminjie@vivo.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
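
The check matters because helpers such as debugfs_create_dir() report failure through an error-encoded pointer rather than NULL. The following is a minimal user-space sketch of that convention, not the driver's code: err_ptr(), ptr_err(), is_err(), create_dir_model() and setup_dbgfs_model() are hypothetical stand-ins for ERR_PTR(), PTR_ERR(), IS_ERR(), debugfs_create_dir() and perf_setup_dbgfs(), and the size of the reserved errno range is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095 /* assumed size of the reserved errno range */

/* Encode a negative errno value in a pointer (models ERR_PTR()). */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer (models PTR_ERR()). */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the top MAX_ERRNO_MODEL addresses (models IS_ERR()). */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

/* Hypothetical stand-in for debugfs_create_dir(); in this model it never returns NULL. */
static void *create_dir_model(int fail)
{
	static int dir;

	return fail ? err_ptr(-ENOMEM) : (void *)&dir;
}

/* The fixed pattern: test the result with is_err(), not against NULL. */
static int setup_dbgfs_model(int fail)
{
	void *dir = create_dir_model(fail);

	assert(dir != NULL); /* a NULL check alone would never see the failure */
	if (is_err(dir))
		return (int)ptr_err(dir);
	return 0;
}
```

In this model setup_dbgfs_model(1) returns -ENOMEM, while a NULL test on the same pointer would have treated the failure as success.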


# 0097ae5f 07-Jun-2021 Yang Li <yang.lee@linux.alibaba.com>

NTB: perf: Fix an error code in perf_setup_inbuf()

When the function IS_ALIGNED() returns false, the value of ret is 0.
So, we set ret to -EINVAL to indicate this error.

Clean up smatch warning:
drivers/ntb/test/ntb_perf.c:602 perf_setup_inbuf() warn: missing error
code 'ret'.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
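
The pattern behind this smatch warning can be sketched in user space as follows. This is an illustration under stated assumptions, not the code of perf_setup_inbuf(): IS_ALIGNED_MODEL() and setup_inbuf_model() are hypothetical names, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <errno.h>

/* Power-of-two alignment test, modeled on the kernel's IS_ALIGNED(). */
#define IS_ALIGNED_MODEL(x, a) (((x) & ((a) - 1)) == 0)

/* Hypothetical model of a setup routine with a shared error exit. */
static int setup_inbuf_model(unsigned long long xlat, unsigned long long align)
{
	int ret = 0;

	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

	if (!IS_ALIGNED_MODEL(xlat, align)) {
		ret = -EINVAL; /* without this assignment ret would still be 0 */
		goto err_out;
	}

	return 0;

err_out:
	/* resource cleanup would go here in real code */
	return ret;
}
```

The point of the sketch is that a `goto` to the error label only reports a failure if `ret` was set first.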


# 34d8673a 09-Jan-2019 Logan Gunthorpe <logang@deltatee.com>

NTB: perf: Fix race condition when run with ntb_test

When running ntb_test, the script tries to run the ntb_perf test
immediately after probing the modules. Since multi-port support was
added, this fails because the new initialization procedure in ntb_perf
cannot complete instantly.

To fix this we add a completion which is waited on when a test is
started. In this way, run can be written any time after the module is
loaded and it will wait for the initialization to complete instead of
sending an error.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Tested-by: Alexander Fomichev <fomichev.ru@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# b54369a2 09-Jan-2019 Logan Gunthorpe <logang@deltatee.com>

NTB: perf: Fix support for hardware that doesn't have port numbers

Legacy drivers do not have port numbers (but reliably have only two ports)
and were broken by the recent commit that added multi-port support to
ntb_perf. This is especially important to support the cross link
topology which is perfectly symmetric and cannot assign unique port
numbers easily.

Hardware that returns zero for both the local port and the peer should
just always use gidx=0 for the only peer.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Tested-by: Alexander Fomichev <fomichev.ru@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# a9c4211a 09-Jan-2019 Logan Gunthorpe <logang@deltatee.com>

NTB: perf: Don't require one more memory window than number of peers

ntb_perf should not require more than one memory window per peer. This
was probably an off-by-one error.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Tested-by: Alexander Fomichev <fomichev.ru@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
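
A minimal sketch of the off-by-one described above, assuming the check compares a memory window count against a peer count. The function names and the -EINVAL return value are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Pre-fix model: demands one window more than there are peers. */
static int check_mw_count_old(unsigned int mw_count, unsigned int peer_count)
{
	return mw_count < peer_count + 1 ? -EINVAL : 0;
}

/* Post-fix model: one memory window per peer is sufficient. */
static int check_mw_count(unsigned int mw_count, unsigned int peer_count)
{
	return mw_count < peer_count ? -EINVAL : 0;
}

/* Self-check contrasting the two models for one window and one peer. */
static void mw_check_demo(void)
{
	assert(check_mw_count_old(1, 1) == -EINVAL); /* wrongly rejected */
	assert(check_mw_count(1, 1) == 0);           /* accepted */
}
```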


# 9cb8bfdf 05-May-2020 Sanjay R Mehta <sanju.mehta@amd.com>

ntb_perf: avoid false dma unmap of destination address

The DMA map and unmap of the destination address are already
done in the perf_init_test() and perf_clear_test() functions.
Avoid repeating them by making the necessary changes in the
perf_copy_chunk() function.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# d7699665 05-May-2020 Sanjay R Mehta <sanju.mehta@amd.com>

ntb_perf: increase sleep time from one milli sec to one sec

After trying to send commands for a maximum of MSG_TRIES
retries, link-up fails because the sleep time (1 ms) between
retries is too short. Increase the sleep time to one second to
provide sufficient time for perf link-up.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 98f4e140 05-May-2020 Sanjay R Mehta <sanju.mehta@amd.com>

ntb_perf: pass correct struct device to dma_alloc_coherent

Currently, ntb->dev is passed to dma_alloc_coherent
and dma_free_coherent calls. The returned dma_addr_t
is the CPU physical address. This works fine as long
as IOMMU is disabled. But when IOMMU is enabled, we
need to make sure that IOVA is returned for dma_addr_t.
So the correct way to achieve this is to change the
first parameter of dma_alloc_coherent() to ntb->pdev->dev
instead.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# a0348a4d 09-Apr-2020 Jiasen Lin <linjiasen@hygon.cn>

NTB: Fix static check warning in perf_clear_test

As pthr->dma_chan can't be NULL in this context, there is
no need to check pthr->dma_chan.

Fixes: 99a06056124d ("NTB: ntb_perf: Fix address err in perf_copy_chunk")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jiasen Lin <linjiasen@hygon.cn>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 99a06056 20-Nov-2019 Jiasen Lin <linjiasen@hygon.cn>

NTB: ntb_perf: Fix address err in perf_copy_chunk

peer->outbuf is a virtual address obtained from ioremap, so it cannot
be converted to a physical address by virt_to_page and page_to_phys.
This conversion results in a DMA error, because the destination address
produced by page_to_phys is invalid.

This patch saves the MMIO address of NTB BARx in perf_setup_peer_mw,
and maps the BAR space to a DMA address after we assign the DMA channel.
It then fills the destination address of the DMA descriptor with this DMA
address to guarantee that the addresses of memory write requests fall into
the memory window of NTB BARx with IOMMU enabled and disabled.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Jiasen Lin <linjiasen@hygon.cn>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# ae89339b 29-Mar-2019 Sanjay R Mehta <sanju.mehta@amd.com>

ntb: point to right memory window index

The second parameter of ntb_peer_mw_get_addr() points to the wrong memory
window index because "peer gidx" is passed instead of "local gidx".

For example, suppose the "local gidx" value is '0' and the "peer gidx" value
is '1'.

On the peer side, the ntb_mw_set_trans() API is used as below with gidx
pointing to the local side gidx, which is '0', so memory window '0' is chosen
and XLAT '0' is programmed by the peer side.

ntb_mw_set_trans(perf->ntb, peer->pidx, peer->gidx, peer->inbuf_xlat,
peer->inbuf_size);

Now, on the local side, ntb_peer_mw_get_addr() is used as below with gidx
pointing to "peer gidx", which is '1', so it points to memory window '1'
instead of memory window '0'.

ntb_peer_mw_get_addr(perf->ntb, peer->gidx, &phys_addr,
&peer->outbuf_size);

So this patch passes "local gidx" as the parameter to ntb_peer_mw_get_addr().

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 12c023d7 15-Feb-2019 Sanjay R Mehta <sanju.mehta@amd.com>

NTB: ntb_perf: Clear stale values in doorbell and command SPAD register

When ntb_perf is unloaded, the command scratchpad register still
retains the last initialized value of PERF_CMD_INVAL. When ntb_perf
is reloaded and reads the peer command scratchpad register, it
misinterprets the peer state as initialized.

To avoid this, clear the local side command scratchpad register
in perf_disable_service.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# b1ee5998 15-Feb-2019 Sanjay R Mehta <sanju.mehta@amd.com>

NTB: ntb_perf: Disable NTB link after clearing peer XLAT registers

If the NTB link is disabled before clearing the peer's XLAT register, the
clearing won't have any effect since the link is already down. So modify the
sequence so that the link is brought down only towards the end of the
function, after clearing the XLAT register.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 8b2f0336 15-Feb-2019 Sanjay R Mehta <sanju.mehta@amd.com>

NTB: ntb_perf: Increased the number of message retries to 1000

While waiting for the peer ntb_perf to initialize scratchpad
registers, the local side ntb_perf might have already exhausted the
maximum number of retries, which is currently set to 500. To avoid
this and to give a little more time to the peer ntb_perf for scratchpad
initialization, increase the number of retries to 1000.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# fb24ea52 22-Feb-2019 Will Deacon <will@kernel.org>

drivers: Remove explicit invocations of mmiowb()

mmiowb() is now implied by spin_unlock() on architectures that require
it, so there is no reason to call it from driver code. This patch was
generated using coccinelle:

@mmiowb@
@@
- mmiowb();

and invoked as:

$ for d in drivers include/linux/qed sound; do \
spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done

NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation. If you've ended up bisecting to this commit, you can
reintroduce the mmiowb() calls using wmb() instead, which should restore
the old behaviour on all architectures other than some esoteric ia64
systems.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 3b28c987 24-Jan-2018 Serge Semin <fancer.lancer@gmail.com>

NTB: ntb_perf: fix cast to restricted __le32

Sparse is whining about the u32 and __le32 mixed usage in the driver

drivers/ntb/test/ntb_perf.c:288:21: warning: cast to restricted __le32
drivers/ntb/test/ntb_perf.c:295:37: warning: incorrect type in argument 4 (different base types)
drivers/ntb/test/ntb_perf.c:295:37: expected unsigned int [unsigned] [usertype] val
drivers/ntb/test/ntb_perf.c:295:37: got restricted __le32 [usertype] <noident>
...

NTB hardware drivers shall accept CPU-endian data and translate it to
the portable format by internal means, so the explicit conversions
are not necessary before Scratchpad/Messages API usage anymore.

Fixes: b83003b3fdc1 ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# cd20dc3c 23-Jan-2018 Dan Carpenter <dan.carpenter@oracle.com>

ntb_perf: Fix an error code in perf_copy_chunk()

We accidentally return success if dmaengine_submit() fails. The fix is
to preserve the error code from dma_submit_error().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 1536dc06 19-Jan-2018 Arnd Bergmann <arnd@arndb.de>

NTB: ntb_perf: fix printing of resource_size_t

On 32-bit architectures, resource_size_t is usually 'unsigned int' or
'unsigned long' but not 'unsigned long long', so we get a warning
about printing the wrong data:

drivers/ntb/test/ntb_perf.c: In function 'perf_setup_peer_mw':
drivers/ntb/test/ntb_perf.c:1390:35: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t {aka unsigned int}' [-Werror=format=]

This changes the format string to the special %pa that is already
used elsewhere in the same file.

Fixes: b83003b3fdc1 ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 5648e56d 06-Dec-2017 Serge Semin <fancer.lancer@gmail.com>

NTB: ntb_perf: Add full multi-port NTB API support

The former NTB Performance driver could only work with NTB devices that
had Scratchpads available and just two ports. Since there are
devices that don't have Scratchpads and have more than two peer
ports, the performance measuring tool needs to be rewritten. This
patch adds the ability to test any available NTB peer.
Additionally, it allows setting NTB memory windows up using any
available data exchange interface: Scratchpad or Message registers.
Some cleanups are also added here.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 0ed08f82 17-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ntb: remove unneeded DRIVER_LICENSE #defines

There is no need to #define the license of the driver, just put it in
the MODULE_LICENSE() line directly as a text string.

This allows tools that check that the module license matches the source
code license to work properly, as there is no need to unwind the
unneeded dereference, especially when the string is defined just a few
lines above the usage of it.

Reported-and-reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <Allen.Hubbe@emc.com>
Cc: Gary R Hook <gary.hook@amd.com>
Cc: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 980c41c8 03-Aug-2017 Logan Gunthorpe <logang@deltatee.com>

NTB: Ensure ntb_mw_get_align() is only called when the link is up

With Switchtec hardware it's impossible to get the alignment parameters
for a peer's memory window until the peer's driver has configured its
windows. Strictly speaking, the link doesn't have to be up for this,
but the link being up is the only way the client can tell that
the other side has been configured.

This patch converts ntb_transport and ntb_perf to use this function after
the link goes up. This simplifies these clients slightly because they
no longer have to store the alignment parameters. It also tweaks
ntb_tool so that peer_mw_trans will print zero if it is run before
the link goes up.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 32e0f5bf 15-May-2017 Gary R Hook <gary.hook@amd.com>

ntb: Add error path/handling to Debug FS entry creation

If a failure occurs when creating Debug FS entries, unroll all of
the work that's been done.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 8407dd6c 09-May-2017 Gary R Hook <gary.hook@amd.com>

ntb: Add more debugfs support for ntb_perf testing options

The ntb_perf tool uses module parameters to control the
characteristics of its test. Enable the changing of these
options through debugfs, eliminating the need to unload
and reload the module to make changes and run additional tests.

Add a new module parameter that forces the DMA channel
selection onto the same node as the NTB device (default: true).

- seg_order: Size of the NTB memory window; power of 2.
- run_order: Size of the data buffer; power of 2.
- use_dma: Use DMA or memcpy? Default: 0.
- on_node: Only use DMA channel(s) on the NTB node. Default: true.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
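
The seg_order and run_order options above are described as powers of 2. A minimal sketch of how such an order value maps to a size in bytes, assuming (the commit message does not state it explicitly) that the value is the base-2 exponent of the size. order_to_bytes() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an order (base-2 logarithm of a size) to a byte count. */
static uint64_t order_to_bytes(unsigned int order)
{
	assert(order < 64); /* shifting a 64-bit value by 64 or more is undefined */
	return UINT64_C(1) << order;
}
```

Under that assumption an order of 20 corresponds to 1 MiB and an order of 32 to 4 GiB.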


# 0b93a6db 09-May-2017 Gary R Hook <gary.hook@amd.com>

ntb: Remove debug-fs variables from the context structure

The Debug FS entries manage themselves; we don't need to hang onto
them in the context structure.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# e9410ff8 09-May-2017 Gary R Hook <gary.hook@amd.com>

ntb: Add a module option to control affinity of DMA channels

The DMA channel(s)/memory used to transfer data to an NTB device
may not be required to be on the same node as the device. Add a
module parameter that allows any candidate channel (aside from
node association) and allocated memory to be used.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# d67288a3 10-Jan-2017 Serge Semin <fancer.lancer@gmail.com>

NTB: Alter Scratchpads API to support multi-ports devices

Even though there is no real NTB hardware that has both more
than two ports and Scratchpad registers, it is logically correct to have
the Scratchpad API accept a peer port index as well. Intel/AMD drivers utilize
Primary and Secondary topology to split Scratchpads between connected root
devices. Since the port-index API was introduced, Intel/AMD NTB hardware
drivers can use the device port to determine which Scratchpad registers
actually belong to local and peer devices. The same approach can be used if
some potential future hardware is multi-port and has some set of Scratchpads.
Here is a brief summary of the changes in the API:
ntb_spad_count() - return number of Scratchpads per each port
ntb_peer_spad_addr(pidx, sidx) - address of Scratchpad register of the
peer device with pidx-index
ntb_peer_spad_read(pidx, sidx) - read specified Scratchpad register of the
peer with pidx-index
ntb_peer_spad_write(pidx, sidx) - write data to Scratchpad register of the
peer with pidx-index

Since there is hardware which doesn't support Scratchpad registers, the
corresponding API methods are now made optional.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 443b9a14 10-Jan-2017 Serge Semin <fancer.lancer@gmail.com>

NTB: Alter MW API to support multi-ports devices

Multi-port NTB devices permit sharing memory between all accessible peers.
The Memory Windows API is altered to correspondingly initialize and map memory
windows for such devices:
ntb_mw_count(pidx); - number of inbound memory windows, which can be allocated
for shared buffer with specified peer device.
ntb_mw_get_align(pidx, widx); - get alignment and size restriction parameters
to properly allocate inbound memory region.
ntb_peer_mw_count(); - get number of outbound memory windows.
ntb_peer_mw_get_addr(widx); - get mapping address of an outbound memory window

If hardware supports inbound translation configured on the local ntb port:
ntb_mw_set_trans(pidx, widx); - set translation address of allocated inbound
memory window so a peer device could access it.
ntb_mw_clear_trans(pidx, widx); - clear the translation address of an inbound
memory window.

If hardware supports outbound translation configured on the peer ntb port:
ntb_peer_mw_set_trans(pidx, widx); - set translation address of a memory
window retrieved from a peer device
ntb_peer_mw_clear_trans(pidx, widx); - clear the translation address of an
outbound memory window

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 1e530119 13-Dec-2016 Serge Semin <fancer.lancer@gmail.com>

NTB: Add indexed ports NTB API

There is some NTB hardware, which can combine more than just two domains
over NTB. For instance, some IDT PCIe-switches can have NTB-functions
activated on more than two-ports. The different domains are distinguished
by ports they are connected to. So the new port-related methods are added to
the NTB API:
ntb_port_number() - return local port
ntb_peer_port_count() - return number of peers local port can connect to
ntb_peer_port_number(pidx) - return port number by its index
ntb_peer_port_idx(port) - return port index by its number

Current test-drivers aren't changed much. They still support two-ports devices
for the time being while multi-ports hardware drivers aren't added.

By default port-related API is declared for two-ports hardware.
So corresponding hardware drivers won't need to implement it.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 94fc7954 04-May-2017 Gary R Hook <gary.hook@amd.com>

ntb: Correct modinfo usage statement for ntb_perf

The order parameters are powers of 2; adjust the usage information
to use correct mathematical representations.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Fixes: 8a7b6a778a85 ("ntb: ntb perf tool")
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 9644347c 30-Jan-2017 Dave Jiang <dave.jiang@intel.com>

ntb: ntb_perf missing dmaengine_unmap_put

In the normal I/O execution path, ntb_perf is missing a call to
dmaengine_unmap_put() after submission. That causes us to leak
unmap objects.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 8a7b6a77 ("ntb: ntb perf tool")
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 819baf88 14-Oct-2016 Dan Carpenter <dan.carpenter@oracle.com>

ntb_perf: potential info leak in debugfs

This is a static checker warning, not something I'm desperately
concerned about. But snprintf() returns the number of bytes that
would have been copied if there were space. We really care about the
number of bytes that actually were copied so we should use scnprintf()
instead.

It probably won't overrun, and in that case we may as well just use
sprintf() but these sorts of things make static checkers and code
reviewers happier.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
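
The difference between the two return conventions can be reproduced in user space. The sketch below is an assumption-laden model, not kernel code: wanted_len() simply exposes the standard vsnprintf() return value, and stored_len() is a hypothetical stand-in that mimics the scnprintf() behaviour described above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* snprintf()-style return: the length the complete output would need. */
static int wanted_len(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);
	return n;
}

/* scnprintf()-style return: the number of characters actually stored. */
static int stored_len(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;
	assert(buf != NULL);

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;
	/* On truncation only size - 1 characters plus the NUL were stored. */
	return (size_t)n >= size ? (int)(size - 1) : n;
}
```

With an 8-byte buffer and a 10-character string, the first convention reports 10 while the second reports 7, which is why using the first value as a copy length can read past the data that was actually written.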


# cdc08982 22-Aug-2016 Nicholas Mc Guire <hofrat@osadl.org>

ntb: make DMA_OUT_RESOURCE_TO HZ independent

schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
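
The HZ dependence can be illustrated with a small user-space model. ms_to_ticks() and ticks_to_ms() are hypothetical helpers (the first mimics the rounding-up behaviour of msecs_to_jiffies()), and the value 50 in the usage note is only an example.

```c
#include <assert.h>

/* Model of msecs_to_jiffies(): milliseconds to ticks at hz ticks per second,
 * rounding up so the wait is never shorter than requested. */
static unsigned long ms_to_ticks(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);
	return (unsigned long)(((unsigned long long)ms * hz + 999) / 1000);
}

/* How long a bare tick count lasts, in milliseconds, at a given hz. */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned int hz)
{
	assert(hz > 0);
	return (unsigned long)((unsigned long long)ticks * 1000 / hz);
}
```

A bare constant of 50 ticks lasts 500 ms at 100 ticks per second but only 50 ms at 1000 ticks per second, whereas ms_to_ticks(50, hz) yields 5 and 50 ticks respectively, which is 50 ms in both cases.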


# 35539b54 20-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_perf: clear link_is_up flag when the link goes down.

When the link goes down, the link_is_up flag did not return to
false. This could have caused some subtle corner case bugs
when the link goes up and down quickly.

Once that was fixed, there was found to be a race if the link was
brought down then immediately up. The link_cleanup work would
occasionally be scheduled after the next link up event. This would
cancel the link_work that was supposed to occur and leave ntb_perf
in an unusable state.

To fix this we get rid of the link_cleanup work and put the actions
directly in the link_down event.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 26dc638a 20-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_perf: Wait for link before running test

Instead of returning immediately with an error when the link is
down, wait for the link to come up (or the user sends a SIGINT).

This is to make scripting ntb_perf easier.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 58fd0f3b 20-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_perf: Return results by reading the run file

Instead of having to watch logs, allow the results to be retrieved
by reading back the run file. This file will return "running" when
the test is running and nothing if no tests have been run yet.
It returns 1 line per thread, and will display an error message if the
corresponding thread returns an error.

With the above change, the pr_info calls that returned the results are
then changed to pr_debug calls.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# da573eaa 20-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_perf: Improve thread handling to increase robustness

This commit accomplishes a few things:

1) Properly prevent multiple sets of threads from running at once using
a mutex. Lots of race issues existed with the thread_cleanup.

2) The mutex allows us to ensure that threads are finished before
tearing down the device or module.

3) Don't use kthread_stop when the threads can exit by themselves, as
this is counter-indicated by the kthread_create documentation. Threads
now wait for kthread_stop to occur.

4) Writing to the run file now blocks until the threads are complete.
The test can then be safely interrupted by a SIGINT.

Also, while I was at it:

5) debugfs_run_write shouldn't return 0 in the early check cases as this
could cause debugfs_run_write to loop undesirably.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# fd2ecd88 20-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_perf: Schedule based on time not on performance

When debugging performance problems, if some issue causes the ntb
hardware to be significantly slower than expected, ntb_perf will
hang requiring a reboot because it only schedules once every 4GB.

Instead, schedule based on jiffies so it will not hang the CPU if
the transfer is slow.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
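
The idea can be sketched with a simulated clock. This is a user-space model under stated assumptions rather than the driver loop: both functions and all parameters are hypothetical, and a "yield" stands for the point where real code would call the scheduler.

```c
#include <assert.h>

/* Count yields when the decision is based on elapsed (simulated) ticks. */
static unsigned int yields_by_time(unsigned int chunks,
				   unsigned long ticks_per_chunk,
				   unsigned long interval)
{
	unsigned long now = 0, last_yield = 0;
	unsigned int i, yields = 0;

	assert(interval > 0);
	for (i = 0; i < chunks; i++) {
		now += ticks_per_chunk; /* pretend the copy took this long */
		if (now - last_yield >= interval) {
			last_yield = now;
			yields++; /* real code would yield the CPU here */
		}
	}
	return yields;
}

/* Count yields when the decision is based only on the amount copied. */
static unsigned int yields_by_volume(unsigned int chunks,
				     unsigned int chunks_per_yield)
{
	assert(chunks_per_yield > 0);
	return chunks / chunks_per_yield;
}
```

With a volume threshold that is never reached, the second model never yields no matter how slow each chunk is, while the time-based model yields more often as chunks get slower.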


# 19645a07 07-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_transport: Check the number of spads the hardware supports

I'm working on hardware that currently has a limited number of
scratchpad registers and ntb_ndev fails with no clue as to why. I
feel it is better to fail early and provide a reasonable error message
than to fail later on.

The same is done to ntb_perf, but it doesn't currently require enough
spads to actually fail. I've also removed the unused SPAD_MSG and
SPAD_ACK enums so that MAX_SPAD accurately reflects the number of
spads used.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 4aae9777 03-Jun-2016 Logan Gunthorpe <logang@deltatee.com>

ntb_perf: Allow limiting the size of the memory windows

On my system, dma_alloc_coherent won't produce memory anywhere
near the size of the BAR. So I needed a way to limit this.

It's pretty much copied straight from ntb_transport.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 838850ee 18-Mar-2016 Dave Jiang <dave.jiang@intel.com>

NTB: Fix incorrect clean up routine in ntb_perf

The clean up routine when we failed to allocate kthread is not cleaning
up all the threads, only the same one over and over again.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
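
A minimal sketch of an unwind loop that visits every already-started worker instead of the same one repeatedly. struct worker_model, start_workers(), the fail_at injection parameter and the -ENOMEM return value are all hypothetical and stand in for the kthread handling in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct worker_model {
	int running;
};

/* Start n workers; fail_at < 0 means no failure is injected. */
static int start_workers(struct worker_model *w, int n, int fail_at)
{
	int i;

	assert(w != NULL && n >= 0);
	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto err;
		w[i].running = 1;
	}
	return 0;

err:
	/* Walk back over every worker started so far, indexing each one. */
	while (--i >= 0)
		w[i].running = 0;
	return -ENOMEM;
}
```

The bug class described above corresponds to an unwind loop whose body does not use the loop index, so only one element is ever cleaned up.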


# ddc8f6fe 18-Mar-2016 Dave Jiang <dave.jiang@intel.com>

NTB: Fix incorrect return check in ntb_perf

kthread_create_no_node() returns error pointers, never NULL. Fix check so
it handles error correctly.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 2572c7fb 10-Mar-2016 Sudip Mukherjee <sudipm.mukherjee@gmail.com>

ntb: fix possible NULL dereference

kmalloc can fail and we should check for NULL before using the pointer
returned by kmalloc.

Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# ee5f750f 07-Mar-2016 Dave Jiang <dave.jiang@intel.com>

ntb: add missing setup of translation window

The perf tool is missing the setup of the translation window. Add a call to
set up the translation window for backed memory.

Signed-off-by: John Kading <john.kading@gd-ms.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 1985a881 26-Jan-2016 Arnd Bergmann <arnd@arndb.de>

ntb: perf test: fix address space confusion

The ntb driver assigns between pointers and __iomem tokens, and
also casts them to 64-bit integers, which results in compiler
warnings on 32-bit systems:

drivers/ntb/test/ntb_perf.c: In function 'perf_copy':
drivers/ntb/test/ntb_perf.c:213:10: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
vbase = (u64)(u64 *)mw->vbase;
^
drivers/ntb/test/ntb_perf.c:214:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
dst_vaddr = (u64)(u64 *)dst;
^

This adds __iomem annotations where needed and changes the temporary
variables to iomem pointers to avoid casting them to u64. I did not
see the problem in linux-next earlier, but it showed up in
4.5-rc1.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 8a7b6a778a85 ("ntb: ntb perf tool")
Signed-off-by: Jon Mason <jdmason@kudzu.us>


# 8a7b6a77 13-Jan-2016 Dave Jiang <dave.jiang@intel.com>

ntb: ntb perf tool

Provide raw performance data via a tool that directly accesses data from
NTB w/o any software overhead. This allows measurement of the hardware
performance limit. In revision one we are only doing single direction
CPU and DMA writes. Eventually we will provide bi-directional writes.

The measurement using DMA engine for NTB performance measure does
not measure the raw performance of DMA engine over NTB due to software
overhead. But it should provide the peak performance through the Linux DMA
driver.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>