#
345040 |
|
11-Mar-2019 |
jhb |
MFC 318429,318967,319721,319723,323600,323724,328353-328361,330042,343056: Add a driver for the Chelsio T6 crypto accelerator engine.
Note that with the set of commits in this batch, no additional tunables are needed to use the driver once it is loaded.
318429: Add a driver for the Chelsio T6 crypto accelerator engine.
The ccr(4) driver supports use of the crypto accelerator engine on Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.
Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC, and SHA2-512-HMAC authentication algorithms. The driver also supports chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication algorithm for encrypt-then-authenticate operations.
Note that this driver is still under active development and testing and may not yet be ready for production use. It does pass the tests in tests/sys/opencrypto with the exception that the AES-GCM implementation in the driver does not yet support requests with a zero byte payload.
To use this driver currently, the "uwire" configuration must be used along with explicitly enabling support for lookaside crypto capabilities in the cxgbe(4) driver. These can be done by setting the following tunables before loading the cxgbe(4) driver:
hw.cxgbe.config_file=uwire
hw.cxgbe.cryptocaps_allowed=-1
318967: Fail large requests with EFBIG.
The adapter firmware in general does not accept PDUs larger than 64k - 1 bytes in size. Sending crypto requests larger than this size results in hangs or incorrect output, so reject them with EFBIG. For requests chaining an AES cipher with an HMAC, the firmware appears to require slightly smaller requests (around 512 bytes).
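A minimal sketch of the size check described in 318967, not the driver's actual code. The names sketch_check_request_size and SKETCH_MAX_REQUEST_SIZE are hypothetical, and the limit is passed as a parameter because the exact reduced limit for chained cipher + HMAC requests is not stated precisely above.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: largest PDU the firmware accepts, per the text (64k - 1). */
#define SKETCH_MAX_REQUEST_SIZE 65535

/*
 * Hypothetical helper: return 0 if a request of payload_len bytes fits
 * under the given limit, or EFBIG if it is too large to submit.
 */
static int
sketch_check_request_size(size_t payload_len, size_t limit)
{
	if (payload_len > limit)
		return (EFBIG);
	return (0);
}
```

A caller handling a chained request would pass a smaller limit than SKETCH_MAX_REQUEST_SIZE; the precise margin is an assumption left to the caller here.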
319721: Add explicit handling for requests with an empty payload.
- For HMAC requests, construct a special input buffer to request an empty hash result.
- For plain cipher requests and requests that chain an AES cipher with an HMAC, fail with EINVAL if there is no cipher payload. If needed in the future, chained requests that only contain AAD could be serviced as HMAC-only requests.
- For GCM requests, the hardware does not support generating the tag for an AAD-only request. Instead, complete these requests synchronously in software on the assumption that such requests are rare.
319723: Fix the software fallback for GCM to validate the existing tag for decrypts.
323600: Fix some incorrect sysctl pointers for some error stats.
The bad_session, sglist_error, and process_error sysctl nodes were returning the value of the pad_error node instead of the appropriate error counters.
323724: Enable support for lookaside crypto operations by default.
This permits ccr(4) to be used with the default firmware configuration file.
328353: Always store the IV in the immediate portion of a work request.
Combined authentication-encryption and GCM requests already stored the IV in the immediate explicitly. This extends this behavior to block cipher requests to work around a firmware bug. While here, simplify the AEAD and GCM handlers to not include always-true conditions.
328354: Always set the IV location to IV_NOP.
The firmware ignores this field in the FW_CRYPTO_LOOKASIDE_WR work request.
328355: Reject requests with AAD and IV larger than 511 bytes.
The T6 crypto engine's control messages only support a total AAD length (including the prefixed IV) of 511 bytes. Reject requests with large AAD rather than returning incorrect results.
328356: Don't discard AAD and IV output data for AEAD requests.
The T6 can hang when processing certain AEAD requests if the request sets a flag asking the crypto engine to discard the input IV and AAD rather than copying them into the output buffer. The existing driver always discards the IV and AAD as we do not need them. As a workaround, allocate a single "dummy" buffer when the ccr driver attaches and change all AEAD requests to write the IV and AAD to this scratch buffer. The contents of the scratch buffer are never used (similar to "bogus_page"), and it is ok for multiple in-flight requests to share this dummy buffer.
328357: Fail crypto requests when the resulting work request is too large.
Most crypto requests will not trigger this condition, but a request with a highly-fragmented data buffer (and a resulting "large" S/G list) could trigger it.
328358: Clamp DSGL entries to a length of 2KB.
This works around an issue in the T6 that can result in DMA engine stalls if an error occurs while processing a DSGL entry with a length larger than 2KB.
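As an illustration of what clamping implies for the destination S/G list, the sketch below counts how many DSGL entries a set of segments needs once each segment is split into pieces of at most 2KB. The function and macro names are hypothetical and the code is not taken from the driver.

```c
#include <stddef.h>

/* Assumption: maximum length of a single DSGL entry, per the text (2KB). */
#define SKETCH_DSGL_MAX_SEG_LEN 2048

/*
 * Hypothetical helper: count the DSGL entries needed to describe nsegs
 * segments when no entry may exceed SKETCH_DSGL_MAX_SEG_LEN bytes.
 * Each segment contributes ceil(len / max) entries.
 */
static size_t
sketch_count_dsgl_entries(const size_t *seg_lens, size_t nsegs)
{
	size_t count, i;

	count = 0;
	for (i = 0; i < nsegs; i++)
		count += (seg_lens[i] + SKETCH_DSGL_MAX_SEG_LEN - 1) /
		    SKETCH_DSGL_MAX_SEG_LEN;
	return (count);
}
```

Because clamping can increase the entry count, a fragmented or large buffer may produce a larger work request, which is the case 328357 guards against.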
328359: Expand the software fallback for GCM to cover more cases.
- Extend ccr_gcm_soft() to handle requests with a non-empty payload. While here, switch to allocating the GMAC context instead of placing it on the stack since it is over 1KB in size.
- Allow ccr_gcm() to return a special error value (EMSGSIZE) which triggers a fallback to ccr_gcm_soft(). Move the existing empty payload check into ccr_gcm() and change a few other cases (e.g. large AAD) to fall back to software via EMSGSIZE as well.
- Add a new 'sw_fallback' stat to count the number of requests processed via the software fallback.
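The control flow in 328359 can be sketched roughly as follows. This is a simplified model under stated assumptions, not the driver's code: the sketch_* functions are hypothetical stand-ins for ccr_gcm() and ccr_gcm_soft(), the hardware path only performs the checks mentioned in this log, and the 511-byte limit is the AAD + IV limit from 328355.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: AAD + IV limit of the T6 control messages, per 328355. */
#define SKETCH_MAX_AAD_LEN 511

/* Hypothetical stats structure holding only the fallback counter. */
struct sketch_stats {
	unsigned long sw_fallback;
};

/* Stand-in for the hardware path: EMSGSIZE means "handle in software". */
static int
sketch_gcm_hw(size_t payload_len, size_t iv_len, size_t aad_len)
{
	if (payload_len == 0)
		return (EMSGSIZE);	/* AAD-only requests are unsupported */
	if (iv_len + aad_len > SKETCH_MAX_AAD_LEN)
		return (EMSGSIZE);	/* AAD too large for the engine */
	return (0);
}

/* Stand-in for the software path; a real one would compute GCM here. */
static int
sketch_gcm_soft(size_t payload_len, size_t iv_len, size_t aad_len)
{
	(void)payload_len;
	(void)iv_len;
	(void)aad_len;
	return (0);
}

/* Try hardware first; on EMSGSIZE, count the fallback and use software. */
static int
sketch_gcm_dispatch(struct sketch_stats *st, size_t payload_len,
    size_t iv_len, size_t aad_len)
{
	int error;

	error = sketch_gcm_hw(payload_len, iv_len, aad_len);
	if (error == EMSGSIZE) {
		st->sw_fallback++;
		error = sketch_gcm_soft(payload_len, iv_len, aad_len);
	}
	return (error);
}
```

Reserving one error value as the fallback signal keeps all of the "cannot do this in hardware" decisions inside the hardware path while the dispatcher stays small.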
328360: Don't read or generate an IV until all error checking is complete.
In particular, this avoids edge cases where a generated IV might be written into the output buffer even though the request is failed with an error.
328361: Store IV in output buffer in GCM software fallback when requested.
Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM software fallback case for encryption requests.
330042: Don't overflow the ipad[] array when clearing the remainder.
After the auth key is copied into the ipad[] array, any remaining bytes are cleared to zero (in case the key is shorter than one block size). The full block size was used as the length to zero rather than the size of the remainder of ipad[]. In practice this overflow was harmless as it could only clear bytes in the following opad[] array which is initialized with a copy of ipad[] in the next statement.
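The corrected clearing step can be illustrated with the sketch below. The structure, names, and the 64-byte block size are assumptions made for illustration, and the later XOR of the pads with the HMAC constants is omitted.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: a 64-byte hash block size, chosen only for illustration. */
#define SKETCH_BLOCK_SIZE 64

/* Hypothetical layout with opad[] directly following ipad[]. */
struct sketch_hmac_pads {
	unsigned char ipad[SKETCH_BLOCK_SIZE];
	unsigned char opad[SKETCH_BLOCK_SIZE];
};

/*
 * Copy the key into ipad[], zero only the bytes left in ipad[], then
 * seed opad[] from ipad[]. Returns -1 if the key exceeds one block
 * (such keys would need to be hashed first, which is not shown).
 */
static int
sketch_init_pads(struct sketch_hmac_pads *p, const unsigned char *key,
    size_t klen)
{
	if (klen > SKETCH_BLOCK_SIZE)
		return (-1);
	memcpy(p->ipad, key, klen);
	/* The bug used SKETCH_BLOCK_SIZE here, running past ipad[]. */
	memset(p->ipad + klen, 0, SKETCH_BLOCK_SIZE - klen);
	memcpy(p->opad, p->ipad, SKETCH_BLOCK_SIZE);
	return (0);
}
```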
343056: Reject new sessions if the necessary queues aren't initialized.
ccr reuses the control queue and first rx queue from the first port on each adapter. The driver cannot send requests until those queues are initialized. Refuse to create sessions for now if the queues aren't ready. This is a workaround until cxgbe allocates one or more dedicated queues for ccr.
Relnotes: yes
Sponsored by: Chelsio Communications
|