* [PATCH v8 0/5] Add Aspeed crypto driver for hardware acceleration
@ 2022-07-26 11:34 ` Neal Liu
  0 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW

Aspeed Hash and Crypto Engine (HACE) is designed to accelerate the
throughput of hash data digest, encryption and decryption.

These patches add Aspeed hash & crypto driver support.
The hash & crypto drivers also pass the run-time self-tests that
take place at algorithm registration.
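
For illustration only (not part of this series), below is a minimal
sketch of how a kernel user would exercise one of the registered
algorithms through the generic ahash API; requesting "sha256" picks the
highest-priority implementation, e.g. "aspeed-sha256" once this driver
is loaded. The function name and buffers are hypothetical:

#include <crypto/hash.h>
#include <linux/crypto.h>
#include <linux/err.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

/* 'data' must be DMA-able (e.g. kmalloc'ed); 'out' holds 32 bytes. */
static int example_sha256_digest(const u8 *data, unsigned int len, u8 *out)
{
	struct crypto_ahash *tfm;
	struct ahash_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int rc;

	tfm = crypto_alloc_ahash("sha256", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	req = ahash_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		crypto_free_ahash(tfm);
		return -ENOMEM;
	}

	sg_init_one(&sg, data, len);
	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);
	ahash_request_set_crypt(req, &sg, out, len);

	/* Wait for the (possibly asynchronous) hardware operation. */
	rc = crypto_wait_req(crypto_ahash_digest(req), &wait);

	ahash_request_free(req);
	crypto_free_ahash(tfm);
	return rc;
}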

The patch series is tested on both AST2500 & AST2600 evaluation boards.

Tested with the following configs:
- CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
- CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y
- CONFIG_DMA_API_DEBUG=y
- CONFIG_DMA_API_DEBUG_SG=y
- CONFIG_CPU_BIG_ENDIAN=y

Changes since v7:
- Define debug Kconfigs.
- Simplify iv/ivsize assignment.
- Simplify cra_init() for HMAC-related init.

Changes since v6:
- Refine debug prints.
- Change aspeed_sg_list struct members' type to __le32.

Changes since v5:
- Re-define the HACE clock definition to avoid breaking the ABI.

Changes since v4:
- Add AST2500 clock definition & dts node.
- Add software fallback for handling corner cases.
- Fix copying the wrong key length.

Changes since v3:
- Use dmam_alloc_coherent() instead of dma_alloc_coherent().
- Add more error handling for dma_prepare() & crypto_engine_start().

Changes since v2:
- Fix endianness issue. Tested on both little endian & big endian
  systems.
- Use the common crypto hardware engine to enqueue & dequeue requests.
- Use pre-defined IVs for the SHA family.
- Revise error handling flow.
- Fix various coding style problems.

Changes since v1:
- Add more error handlers, including DMA memory allocate/free, DMA
  map/unmap, clock enable/disable, etc.
- Fix dma_map() error checking when DMA_API_DEBUG is enabled.
- Fix dt-binding doc & dts node naming.

Neal Liu (5):
  crypto: aspeed: Add HACE hash driver
  dt-bindings: clock: Add AST2500/AST2600 HACE reset definition
  ARM: dts: aspeed: Add HACE device controller node
  dt-bindings: crypto: add documentation for aspeed hace
  crypto: aspeed: add HACE crypto driver

 .../bindings/crypto/aspeed,ast2500-hace.yaml  |   53 +
 MAINTAINERS                                   |    7 +
 arch/arm/boot/dts/aspeed-g5.dtsi              |    8 +
 arch/arm/boot/dts/aspeed-g6.dtsi              |    8 +
 drivers/crypto/Kconfig                        |    1 +
 drivers/crypto/Makefile                       |    1 +
 drivers/crypto/aspeed/Kconfig                 |   58 +
 drivers/crypto/aspeed/Makefile                |    9 +
 drivers/crypto/aspeed/aspeed-hace-crypto.c    | 1121 +++++++++++++
 drivers/crypto/aspeed/aspeed-hace-hash.c      | 1389 +++++++++++++++++
 drivers/crypto/aspeed/aspeed-hace.c           |  302 ++++
 drivers/crypto/aspeed/aspeed-hace.h           |  298 ++++
 include/dt-bindings/clock/aspeed-clock.h      |    1 +
 include/dt-bindings/clock/ast2600-clock.h     |    1 +
 14 files changed, 3257 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
 create mode 100644 drivers/crypto/aspeed/Kconfig
 create mode 100644 drivers/crypto/aspeed/Makefile
 create mode 100644 drivers/crypto/aspeed/aspeed-hace-crypto.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace.h

-- 
2.25.1



* [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-07-26 11:34 ` Neal Liu
@ 2022-07-26 11:34   ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW

Hash and Crypto Engine (HACE) is designed to accelerate the
throughput of hash data digest, encryption, and decryption.

Basically, HACE can be divided into two independent engines
- Hash Engine and Crypto Engine. This patch adds the HACE
hash engine driver for the hash accelerator.

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
---
 MAINTAINERS                              |    7 +
 drivers/crypto/Kconfig                   |    1 +
 drivers/crypto/Makefile                  |    1 +
 drivers/crypto/aspeed/Kconfig            |   32 +
 drivers/crypto/aspeed/Makefile           |    6 +
 drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
 drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
 drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
 8 files changed, 1835 insertions(+)
 create mode 100644 drivers/crypto/aspeed/Kconfig
 create mode 100644 drivers/crypto/aspeed/Makefile
 create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace.h

diff --git a/MAINTAINERS b/MAINTAINERS
index f55aea311af5..23a0215b7e42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3140,6 +3140,13 @@ S:	Maintained
 F:	Documentation/devicetree/bindings/media/aspeed-video.txt
 F:	drivers/media/platform/aspeed/
 
+ASPEED CRYPTO DRIVER
+M:	Neal Liu <neal_liu@aspeedtech.com>
+L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
+S:	Maintained
+F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
+F:	drivers/crypto/aspeed/
+
 ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
 M:	Corentin Chary <corentin.chary@gmail.com>
 L:	acpi4asus-user@lists.sourceforge.net
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index ee99c02c84e8..b9f5ee126881 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
 	  acceleration for cryptographic algorithms on these devices.
 
 source "drivers/crypto/keembay/Kconfig"
+source "drivers/crypto/aspeed/Kconfig"
 
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index f81703a86b98..116de173a66c 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
+obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
new file mode 100644
index 000000000000..059e627efef8
--- /dev/null
+++ b/drivers/crypto/aspeed/Kconfig
@@ -0,0 +1,32 @@
+config CRYPTO_DEV_ASPEED
+	tristate "Support for Aspeed cryptographic engine driver"
+	depends on ARCH_ASPEED
+	help
+	  Hash and Crypto Engine (HACE) is designed to accelerate the
+	  throughput of hash data digest, encryption and decryption.
+
+	  Select y here to have support for the cryptographic driver
+	  available on Aspeed SoC.
+
+config CRYPTO_DEV_ASPEED_HACE_HASH
+	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
+	depends on CRYPTO_DEV_ASPEED
+	select CRYPTO_ENGINE
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	select CRYPTO_SHA512
+	select CRYPTO_HMAC
+	help
+	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
+	  hash driver.
+	  Supports multiple message digest standards, including
+	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
+
+config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
+	bool "Enable HACE hash debug messages"
+	depends on CRYPTO_DEV_ASPEED_HACE_HASH
+	help
+	  Print HACE hash debugging messages if you use this option
+	  to ask for those messages.
+	  Avoid enabling this option for production builds to
+	  minimize the driver's timing overhead.
diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
new file mode 100644
index 000000000000..8bc8d4fed5a9
--- /dev/null
+++ b/drivers/crypto/aspeed/Makefile
@@ -0,0 +1,6 @@
+hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
+
+obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
+
+aspeed_crypto-objs := aspeed-hace.o \
+		      $(hace-hash-y)
diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
new file mode 100644
index 000000000000..63a8ad694996
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
@@ -0,0 +1,1389 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 Aspeed Technology Inc.
+ */
+
+#include "aspeed-hace.h"
+
+#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
+#define AHASH_DBG(h, fmt, ...)	\
+	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#else
+#define AHASH_DBG(h, fmt, ...)	\
+	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#endif
+
+/* Initialization Vectors for SHA-family */
+static const __be32 sha1_iv[8] = {
+	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
+	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
+	cpu_to_be32(SHA1_H4), 0, 0, 0
+};
+
+static const __be32 sha224_iv[8] = {
+	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
+	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
+	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
+	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
+};
+
+static const __be32 sha256_iv[8] = {
+	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
+	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
+	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
+	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
+};
+
+static const __be64 sha384_iv[8] = {
+	cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
+	cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
+	cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
+	cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
+};
+
+static const __be64 sha512_iv[8] = {
+	cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
+	cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
+	cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
+	cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
+};
+
+static const __be32 sha512_224_iv[16] = {
+	cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
+	cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
+	cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
+	cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
+	cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
+	cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
+	cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
+	cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
+};
+
+static const __be32 sha512_256_iv[16] = {
+	cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
+	cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
+	cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
+	cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
+	cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
+	cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
+	cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
+	cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
+};
+
+/* The purpose of this padding is to ensure that the padded message is a
+ * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
+ * The bit "1" is appended at the end of the message followed by
+ * "padlen-1" zero bits. Then a 64-bit block (SHA1/SHA224/SHA256) or
+ * 128-bit block (SHA384/SHA512) containing the message length in bits
+ * is appended.
+ *
+ * For SHA1/SHA224/SHA256, padlen is calculated as follows:
+ *  - if message length < 56 bytes then padlen = 56 - message length
+ *  - else padlen = 64 + 56 - message length
+ *
+ * For SHA384/SHA512, padlen is calculated as follows:
+ *  - if message length < 112 bytes then padlen = 112 - message length
+ *  - else padlen = 128 + 112 - message length
+ */
+static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
+				      struct aspeed_sham_reqctx *rctx)
+{
+	unsigned int index, padlen;
+	__be64 bits[2];
+
+	AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
+
+	switch (rctx->flags & SHA_FLAGS_MASK) {
+	case SHA_FLAGS_SHA1:
+	case SHA_FLAGS_SHA224:
+	case SHA_FLAGS_SHA256:
+		bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
+		index = rctx->bufcnt & 0x3f;
+		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
+		*(rctx->buffer + rctx->bufcnt) = 0x80;
+		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
+		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
+		rctx->bufcnt += padlen + 8;
+		break;
+	default:
+		bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
+		bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
+				      rctx->digcnt[0] >> 61);
+		index = rctx->bufcnt & 0x7f;
+		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
+		*(rctx->buffer + rctx->bufcnt) = 0x80;
+		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
+		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
+		rctx->bufcnt += padlen + 16;
+		break;
+	}
+}
+
+/*
+ * Prepare DMA buffer before hardware engine
+ * processing.
+ */
+static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	int length, remain;
+
+	length = rctx->total + rctx->bufcnt;
+	remain = length % rctx->block_size;
+
+	AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
+
+	if (rctx->bufcnt)
+		memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
+
+	if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
+		scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
+					 rctx->bufcnt, rctx->src_sg,
+					 rctx->offset, rctx->total - remain, 0);
+		rctx->offset += rctx->total - remain;
+
+	} else {
+		dev_warn(hace_dev->dev, "Hash data length is too large\n");
+		return -EINVAL;
+	}
+
+	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
+				 rctx->offset, remain, 0);
+
+	rctx->bufcnt = remain;
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		return -ENOMEM;
+	}
+
+	hash_engine->src_length = length - remain;
+	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	return 0;
+}
+
+/*
+ * Prepare DMA buffer as SG list buffer before
+ * hardware engine processing.
+ */
+static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct aspeed_sg_list *src_list;
+	struct scatterlist *s;
+	int length, remain, sg_len, i;
+	int rc = 0;
+
+	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
+	length = rctx->total + rctx->bufcnt - remain;
+
+	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
+		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
+		  "length", length, "remain", remain);
+
+	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
+			    DMA_TO_DEVICE);
+	if (!sg_len) {
+		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
+		rc = -ENOMEM;
+		goto end;
+	}
+
+	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		rc = -ENOMEM;
+		goto free_src_sg;
+	}
+
+	if (rctx->bufcnt != 0) {
+		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
+						       rctx->buffer,
+						       rctx->block_size * 2,
+						       DMA_TO_DEVICE);
+		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
+			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
+			rc = -ENOMEM;
+			goto free_rctx_digest;
+		}
+
+		src_list[0].phy_addr = rctx->buffer_dma_addr;
+		src_list[0].len = rctx->bufcnt;
+		length -= src_list[0].len;
+
+		/* Last sg list */
+		if (length == 0)
+			src_list[0].len |= HASH_SG_LAST_LIST;
+
+		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
+		src_list[0].len = cpu_to_le32(src_list[0].len);
+		src_list++;
+	}
+
+	if (length != 0) {
+		for_each_sg(rctx->src_sg, s, sg_len, i) {
+			src_list[i].phy_addr = sg_dma_address(s);
+
+			if (length > sg_dma_len(s)) {
+				src_list[i].len = sg_dma_len(s);
+				length -= sg_dma_len(s);
+
+			} else {
+				/* Last sg list */
+				src_list[i].len = length;
+				src_list[i].len |= HASH_SG_LAST_LIST;
+				length = 0;
+			}
+
+			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
+			src_list[i].len = cpu_to_le32(src_list[i].len);
+		}
+	}
+
+	if (length != 0) {
+		rc = -EINVAL;
+		goto free_rctx_buffer;
+	}
+
+	rctx->offset = rctx->total - remain;
+	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
+	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	goto end;
+
+free_rctx_buffer:
+	if (rctx->bufcnt != 0)
+		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+				 rctx->block_size * 2, DMA_TO_DEVICE);
+free_rctx_digest:
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+free_src_sg:
+	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
+		     DMA_TO_DEVICE);
+end:
+	return rc;
+}
+
+static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
+
+	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
+
+	return 0;
+}
+
+/*
+ * Copy digest to the corresponding request result.
+ * This function will be called at final() stage.
+ */
+static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+			 rctx->block_size * 2, DMA_TO_DEVICE);
+
+	memcpy(req->result, rctx->digest, rctx->digsize);
+
+	return aspeed_ahash_complete(hace_dev);
+}
+
+/*
+ * Trigger hardware engines to do the math.
+ */
+static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
+				     aspeed_hace_fn_t resume)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
+		  hash_engine->src_dma, hash_engine->digest_dma,
+		  hash_engine->src_length);
+
+	rctx->cmd |= HASH_CMD_INT_ENABLE;
+	hash_engine->resume = resume;
+
+	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
+	ast_hace_write(hace_dev, hash_engine->digest_dma,
+		       ASPEED_HACE_HASH_DIGEST_BUFF);
+	ast_hace_write(hace_dev, hash_engine->digest_dma,
+		       ASPEED_HACE_HASH_KEY_BUFF);
+	ast_hace_write(hace_dev, hash_engine->src_length,
+		       ASPEED_HACE_HASH_DATA_LEN);
+
+	/* Memory barrier to ensure all data setup before engine starts */
+	mb();
+
+	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
+
+	return -EINPROGRESS;
+}
+
+/*
+ * HMAC resume performs the second pass, which produces
+ * the final HMAC code derived from the inner hash
+ * result and the outer key.
+ */
+static int aspeed_ahash_hmac_resume(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+	int rc = 0;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+			 rctx->block_size * 2, DMA_TO_DEVICE);
+
+	/* o key pad + hash sum 1 */
+	memcpy(rctx->buffer, bctx->opad, rctx->block_size);
+	memcpy(rctx->buffer + rctx->block_size, rctx->digest, rctx->digsize);
+
+	rctx->bufcnt = rctx->block_size + rctx->digsize;
+	rctx->digcnt[0] = rctx->block_size + rctx->digsize;
+
+	aspeed_ahash_fill_padding(hace_dev, rctx);
+	memcpy(rctx->digest, rctx->sha_iv, rctx->ivsize);
+
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		rc = -ENOMEM;
+		goto end;
+	}
+
+	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev, rctx->buffer,
+					       rctx->block_size * 2,
+					       DMA_TO_DEVICE);
+	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
+		rc = -ENOMEM;
+		goto free_rctx_digest;
+	}
+
+	hash_engine->src_dma = rctx->buffer_dma_addr;
+	hash_engine->src_length = rctx->bufcnt;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
+
+free_rctx_digest:
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+end:
+	return rc;
+}
+
+static int aspeed_ahash_req_final(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	int rc = 0;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	aspeed_ahash_fill_padding(hace_dev, rctx);
+
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
+					       rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		rc = -ENOMEM;
+		goto end;
+	}
+
+	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
+					       rctx->buffer,
+					       rctx->block_size * 2,
+					       DMA_TO_DEVICE);
+	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
+		rc = -ENOMEM;
+		goto free_rctx_digest;
+	}
+
+	hash_engine->src_dma = rctx->buffer_dma_addr;
+	hash_engine->src_length = rctx->bufcnt;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	if (rctx->flags & SHA_FLAGS_HMAC)
+		return aspeed_hace_ahash_trigger(hace_dev,
+						 aspeed_ahash_hmac_resume);
+
+	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
+
+free_rctx_digest:
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+end:
+	return rc;
+}
+
+static int aspeed_ahash_update_resume_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
+		     DMA_TO_DEVICE);
+
+	if (rctx->bufcnt != 0)
+		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+				 rctx->block_size * 2,
+				 DMA_TO_DEVICE);
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg, rctx->offset,
+				 rctx->total - rctx->offset, 0);
+
+	rctx->bufcnt = rctx->total - rctx->offset;
+	rctx->cmd &= ~HASH_CMD_HASH_SRC_SG_CTRL;
+
+	if (rctx->flags & SHA_FLAGS_FINUP)
+		return aspeed_ahash_req_final(hace_dev);
+
+	return aspeed_ahash_complete(hace_dev);
+}
+
+static int aspeed_ahash_update_resume(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	if (rctx->flags & SHA_FLAGS_FINUP)
+		return aspeed_ahash_req_final(hace_dev);
+
+	return aspeed_ahash_complete(hace_dev);
+}
+
+static int aspeed_ahash_req_update(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	aspeed_hace_fn_t resume;
+	int ret;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	if (hace_dev->version == AST2600_VERSION) {
+		rctx->cmd |= HASH_CMD_HASH_SRC_SG_CTRL;
+		resume = aspeed_ahash_update_resume_sg;
+
+	} else {
+		resume = aspeed_ahash_update_resume;
+	}
+
+	ret = hash_engine->dma_prepare(hace_dev);
+	if (ret)
+		return ret;
+
+	return aspeed_hace_ahash_trigger(hace_dev, resume);
+}
+
+static int aspeed_hace_hash_handle_queue(struct aspeed_hace_dev *hace_dev,
+				  struct ahash_request *req)
+{
+	return crypto_transfer_hash_request_to_engine(
+			hace_dev->crypt_engine_hash, req);
+}
+
+static int aspeed_ahash_do_request(struct crypto_engine *engine, void *areq)
+{
+	struct ahash_request *req = ahash_request_cast(areq);
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_engine_hash *hash_engine;
+	int ret = 0;
+
+	hash_engine = &hace_dev->hash_engine;
+	hash_engine->flags |= CRYPTO_FLAGS_BUSY;
+
+	if (rctx->op == SHA_OP_UPDATE)
+		ret = aspeed_ahash_req_update(hace_dev);
+	else if (rctx->op == SHA_OP_FINAL)
+		ret = aspeed_ahash_req_final(hace_dev);
+
+	if (ret != -EINPROGRESS)
+		return ret;
+
+	return 0;
+}
+
+static int aspeed_ahash_prepare_request(struct crypto_engine *engine,
+					void *areq)
+{
+	struct ahash_request *req = ahash_request_cast(areq);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_engine_hash *hash_engine;
+
+	hash_engine = &hace_dev->hash_engine;
+	hash_engine->req = req;
+
+	if (hace_dev->version == AST2600_VERSION)
+		hash_engine->dma_prepare = aspeed_ahash_dma_prepare_sg;
+	else
+		hash_engine->dma_prepare = aspeed_ahash_dma_prepare;
+
+	return 0;
+}
+
+static int aspeed_sham_update(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+
+	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
+
+	rctx->total = req->nbytes;
+	rctx->src_sg = req->src;
+	rctx->offset = 0;
+	rctx->src_nents = sg_nents(req->src);
+	rctx->op = SHA_OP_UPDATE;
+
+	rctx->digcnt[0] += rctx->total;
+	if (rctx->digcnt[0] < rctx->total)
+		rctx->digcnt[1]++;
+
+	if (rctx->bufcnt + rctx->total < rctx->block_size) {
+		scatterwalk_map_and_copy(rctx->buffer + rctx->bufcnt,
+					 rctx->src_sg, rctx->offset,
+					 rctx->total, 0);
+		rctx->bufcnt += rctx->total;
+
+		return 0;
+	}
+
+	return aspeed_hace_hash_handle_queue(hace_dev, req);
+}
+
+static int aspeed_sham_shash_digest(struct crypto_shash *tfm, u32 flags,
+				    const u8 *data, unsigned int len, u8 *out)
+{
+	SHASH_DESC_ON_STACK(shash, tfm);
+
+	shash->tfm = tfm;
+
+	return crypto_shash_digest(shash, data, len, out);
+}
+
+static int aspeed_sham_final(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+
+	AHASH_DBG(hace_dev, "req->nbytes:%d, rctx->total:%d\n",
+		  req->nbytes, rctx->total);
+	rctx->op = SHA_OP_FINAL;
+
+	return aspeed_hace_hash_handle_queue(hace_dev, req);
+}
+
+static int aspeed_sham_finup(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	int rc1, rc2;
+
+	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
+
+	rctx->flags |= SHA_FLAGS_FINUP;
+
+	rc1 = aspeed_sham_update(req);
+	if (rc1 == -EINPROGRESS || rc1 == -EBUSY)
+		return rc1;
+
+	/*
+	 * final() always has to be called to clean up resources,
+	 * even if update() failed, except for -EINPROGRESS
+	 */
+	rc2 = aspeed_sham_final(req);
+
+	return rc1 ? : rc2;
+}
+
+static int aspeed_sham_init(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+	AHASH_DBG(hace_dev, "%s: digest size:%d\n",
+		  crypto_tfm_alg_name(&tfm->base),
+		  crypto_ahash_digestsize(tfm));
+
+	rctx->cmd = HASH_CMD_ACC_MODE;
+	rctx->flags = 0;
+
+	switch (crypto_ahash_digestsize(tfm)) {
+	case SHA1_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA1 | HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA1;
+		rctx->digsize = SHA1_DIGEST_SIZE;
+		rctx->block_size = SHA1_BLOCK_SIZE;
+		rctx->sha_iv = sha1_iv;
+		rctx->ivsize = 32;
+		memcpy(rctx->digest, sha1_iv, rctx->ivsize);
+		break;
+	case SHA224_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA224 | HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA224;
+		rctx->digsize = SHA224_DIGEST_SIZE;
+		rctx->block_size = SHA224_BLOCK_SIZE;
+		rctx->sha_iv = sha224_iv;
+		rctx->ivsize = 32;
+		memcpy(rctx->digest, sha224_iv, rctx->ivsize);
+		break;
+	case SHA256_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA256 | HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA256;
+		rctx->digsize = SHA256_DIGEST_SIZE;
+		rctx->block_size = SHA256_BLOCK_SIZE;
+		rctx->sha_iv = sha256_iv;
+		rctx->ivsize = 32;
+		memcpy(rctx->digest, sha256_iv, rctx->ivsize);
+		break;
+	case SHA384_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA384 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA384;
+		rctx->digsize = SHA384_DIGEST_SIZE;
+		rctx->block_size = SHA384_BLOCK_SIZE;
+		rctx->sha_iv = (const __be32 *)sha384_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha384_iv, rctx->ivsize);
+		break;
+	case SHA512_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA512;
+		rctx->digsize = SHA512_DIGEST_SIZE;
+		rctx->block_size = SHA512_BLOCK_SIZE;
+		rctx->sha_iv = (const __be32 *)sha512_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha512_iv, rctx->ivsize);
+		break;
+	default:
+		dev_warn(tctx->hace_dev->dev, "digest size %d not supported\n",
+			 crypto_ahash_digestsize(tfm));
+		return -EINVAL;
+	}
+
+	rctx->bufcnt = 0;
+	rctx->total = 0;
+	rctx->digcnt[0] = 0;
+	rctx->digcnt[1] = 0;
+
+	/* HMAC init */
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		rctx->digcnt[0] = rctx->block_size;
+		rctx->bufcnt = rctx->block_size;
+		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
+		rctx->flags |= SHA_FLAGS_HMAC;
+	}
+
+	return 0;
+}
+
+static int aspeed_sha512s_init(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+	AHASH_DBG(hace_dev, "digest size: %d\n", crypto_ahash_digestsize(tfm));
+
+	rctx->cmd = HASH_CMD_ACC_MODE;
+	rctx->flags = 0;
+
+	switch (crypto_ahash_digestsize(tfm)) {
+	case SHA224_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_224 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA512_224;
+		rctx->digsize = SHA224_DIGEST_SIZE;
+		rctx->block_size = SHA512_BLOCK_SIZE;
+		rctx->sha_iv = sha512_224_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha512_224_iv, rctx->ivsize);
+		break;
+	case SHA256_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_256 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA512_256;
+		rctx->digsize = SHA256_DIGEST_SIZE;
+		rctx->block_size = SHA512_BLOCK_SIZE;
+		rctx->sha_iv = sha512_256_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha512_256_iv, rctx->ivsize);
+		break;
+	default:
+		dev_warn(tctx->hace_dev->dev, "digest size %d not supported\n",
+			 crypto_ahash_digestsize(tfm));
+		return -EINVAL;
+	}
+
+	rctx->bufcnt = 0;
+	rctx->total = 0;
+	rctx->digcnt[0] = 0;
+	rctx->digcnt[1] = 0;
+
+	/* HMAC init */
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		rctx->digcnt[0] = rctx->block_size;
+		rctx->bufcnt = rctx->block_size;
+		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
+		rctx->flags |= SHA_FLAGS_HMAC;
+	}
+
+	return 0;
+}
+
+static int aspeed_sham_digest(struct ahash_request *req)
+{
+	return aspeed_sham_init(req) ? : aspeed_sham_finup(req);
+}
+
+static int aspeed_sham_setkey(struct crypto_ahash *tfm, const u8 *key,
+			      unsigned int keylen)
+{
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+	int ds = crypto_shash_digestsize(bctx->shash);
+	int bs = crypto_shash_blocksize(bctx->shash);
+	int err = 0;
+	int i;
+
+	AHASH_DBG(hace_dev, "%s: keylen:%d\n", crypto_tfm_alg_name(&tfm->base),
+		  keylen);
+
+	if (keylen > bs) {
+		err = aspeed_sham_shash_digest(bctx->shash,
+					       crypto_shash_get_flags(bctx->shash),
+					       key, keylen, bctx->ipad);
+		if (err)
+			return err;
+		keylen = ds;
+
+	} else {
+		memcpy(bctx->ipad, key, keylen);
+	}
+
+	memset(bctx->ipad + keylen, 0, bs - keylen);
+	memcpy(bctx->opad, bctx->ipad, bs);
+
+	for (i = 0; i < bs; i++) {
+		bctx->ipad[i] ^= HMAC_IPAD_VALUE;
+		bctx->opad[i] ^= HMAC_OPAD_VALUE;
+	}
+
+	return err;
+}
+
+static int aspeed_sham_cra_init(struct crypto_tfm *tfm)
+{
+	struct ahash_alg *alg = __crypto_ahash_alg(tfm->__crt_alg);
+	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
+	struct aspeed_hace_alg *ast_alg;
+
+	ast_alg = container_of(alg, struct aspeed_hace_alg, alg.ahash);
+	tctx->hace_dev = ast_alg->hace_dev;
+	tctx->flags = 0;
+
+	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
+				 sizeof(struct aspeed_sham_reqctx));
+
+	if (ast_alg->alg_base) {
+		/* hmac related */
+		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+		tctx->flags |= SHA_FLAGS_HMAC;
+		bctx->shash = crypto_alloc_shash(ast_alg->alg_base, 0,
+						 CRYPTO_ALG_NEED_FALLBACK);
+		if (IS_ERR(bctx->shash)) {
+			dev_warn(ast_alg->hace_dev->dev,
+				 "base driver '%s' could not be loaded.\n",
+				 ast_alg->alg_base);
+			return PTR_ERR(bctx->shash);
+		}
+	}
+
+	tctx->enginectx.op.do_one_request = aspeed_ahash_do_request;
+	tctx->enginectx.op.prepare_request = aspeed_ahash_prepare_request;
+	tctx->enginectx.op.unprepare_request = NULL;
+
+	return 0;
+}
+
+static void aspeed_sham_cra_exit(struct crypto_tfm *tfm)
+{
+	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+
+	AHASH_DBG(hace_dev, "%s\n", crypto_tfm_alg_name(tfm));
+
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+		crypto_free_shash(bctx->shash);
+	}
+}
+
+static int aspeed_sham_export(struct ahash_request *req, void *out)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	memcpy(out, rctx, sizeof(*rctx));
+
+	return 0;
+}
+
+static int aspeed_sham_import(struct ahash_request *req, const void *in)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	memcpy(rctx, in, sizeof(*rctx));
+
+	return 0;
+}
+
+struct aspeed_hace_alg aspeed_ahash_algs[] = {
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA1_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha1",
+					.cra_driver_name	= "aspeed-sha1",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA1_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha256",
+					.cra_driver_name	= "aspeed-sha256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA256_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha224",
+					.cra_driver_name	= "aspeed-sha224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA224_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha1",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA1_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha1)",
+					.cra_driver_name	= "aspeed-hmac-sha1",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA1_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha224",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha224)",
+					.cra_driver_name	= "aspeed-hmac-sha224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA224_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha256",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha256)",
+					.cra_driver_name	= "aspeed-hmac-sha256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA256_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+};
+
+struct aspeed_hace_alg aspeed_ahash_algs_g6[] = {
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA384_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha384",
+					.cra_driver_name	= "aspeed-sha384",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA384_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA512_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha512",
+					.cra_driver_name	= "aspeed-sha512",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha512_224",
+					.cra_driver_name	= "aspeed-sha512_224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha512_256",
+					.cra_driver_name	= "aspeed-sha512_256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha384",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA384_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha384)",
+					.cra_driver_name	= "aspeed-hmac-sha384",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA384_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha512",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA512_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha512)",
+					.cra_driver_name	= "aspeed-hmac-sha512",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha512_224",
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha512_224)",
+					.cra_driver_name	= "aspeed-hmac-sha512_224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha512_256",
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha512_256)",
+					.cra_driver_name	= "aspeed-hmac-sha512_256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+};
+
+void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++)
+		crypto_unregister_ahash(&aspeed_ahash_algs[i].alg.ahash);
+
+	if (hace_dev->version != AST2600_VERSION)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++)
+		crypto_unregister_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
+}
+
+void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	int rc, i;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++) {
+		aspeed_ahash_algs[i].hace_dev = hace_dev;
+		rc = crypto_register_ahash(&aspeed_ahash_algs[i].alg.ahash);
+		if (rc) {
+			AHASH_DBG(hace_dev, "Failed to register %s\n",
+				  aspeed_ahash_algs[i].alg.ahash.halg.base.cra_name);
+		}
+	}
+
+	if (hace_dev->version != AST2600_VERSION)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++) {
+		aspeed_ahash_algs_g6[i].hace_dev = hace_dev;
+		rc = crypto_register_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
+		if (rc) {
+			AHASH_DBG(hace_dev, "Failed to register %s\n",
+				  aspeed_ahash_algs_g6[i].alg.ahash.halg.base.cra_name);
+		}
+	}
+}
diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
new file mode 100644
index 000000000000..89b1585d72e2
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace.c
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 Aspeed Technology Inc.
+ */
+
+#include <linux/clk.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_irq.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+
+#include "aspeed-hace.h"
+
+#ifdef ASPEED_HACE_DEBUG
+#define HACE_DBG(d, fmt, ...)	\
+	dev_info((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#else
+#define HACE_DBG(d, fmt, ...)	\
+	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#endif
+
+/* Weak function for HACE hash */
+void __weak aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
+void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
+/* HACE interrupt service routine */
+static irqreturn_t aspeed_hace_irq(int irq, void *dev)
+{
+	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	u32 sts;
+
+	sts = ast_hace_read(hace_dev, ASPEED_HACE_STS);
+	ast_hace_write(hace_dev, sts, ASPEED_HACE_STS);
+
+	HACE_DBG(hace_dev, "irq status: 0x%x\n", sts);
+
+	if (sts & HACE_HASH_ISR) {
+		if (hash_engine->flags & CRYPTO_FLAGS_BUSY)
+			tasklet_schedule(&hash_engine->done_task);
+		else
+			dev_warn(hace_dev->dev, "HASH no active requests.\n");
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void aspeed_hace_hash_done_task(unsigned long data)
+{
+	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+
+	hash_engine->resume(hace_dev);
+}
+
+static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
+{
+	aspeed_register_hace_hash_algs(hace_dev);
+}
+
+static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
+{
+	aspeed_unregister_hace_hash_algs(hace_dev);
+}
+
+static const struct of_device_id aspeed_hace_of_matches[] = {
+	{ .compatible = "aspeed,ast2500-hace", .data = (void *)5, },
+	{ .compatible = "aspeed,ast2600-hace", .data = (void *)6, },
+	{},
+};
+
+static int aspeed_hace_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *hace_dev_id;
+	struct aspeed_engine_hash *hash_engine;
+	struct aspeed_hace_dev *hace_dev;
+	struct resource *res;
+	int rc;
+
+	hace_dev = devm_kzalloc(&pdev->dev, sizeof(struct aspeed_hace_dev),
+				GFP_KERNEL);
+	if (!hace_dev)
+		return -ENOMEM;
+
+	hace_dev_id = of_match_device(aspeed_hace_of_matches, &pdev->dev);
+	if (!hace_dev_id) {
+		dev_err(&pdev->dev, "Failed to match hace dev id\n");
+		return -EINVAL;
+	}
+
+	hace_dev->dev = &pdev->dev;
+	hace_dev->version = (unsigned long)hace_dev_id->data;
+	hash_engine = &hace_dev->hash_engine;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+
+	platform_set_drvdata(pdev, hace_dev);
+
+	hace_dev->regs = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(hace_dev->regs)) {
+		dev_err(&pdev->dev, "Failed to map resources\n");
+		return PTR_ERR(hace_dev->regs);
+	}
+
+	/* Get irq number and register it */
+	hace_dev->irq = platform_get_irq(pdev, 0);
+	if (hace_dev->irq < 0) {
+		dev_err(&pdev->dev, "Failed to get interrupt\n");
+		return hace_dev->irq;
+	}
+
+	rc = devm_request_irq(&pdev->dev, hace_dev->irq, aspeed_hace_irq, 0,
+			      dev_name(&pdev->dev), hace_dev);
+	if (rc) {
+		dev_err(&pdev->dev, "Failed to request interrupt\n");
+		return rc;
+	}
+
+	/* Get clk and enable it */
+	hace_dev->clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(hace_dev->clk)) {
+		dev_err(&pdev->dev, "Failed to get clk\n");
+		return -ENODEV;
+	}
+
+	rc = clk_prepare_enable(hace_dev->clk);
+	if (rc) {
+		dev_err(&pdev->dev, "Failed to enable clock 0x%x\n", rc);
+		return rc;
+	}
+
+	/* Initialize crypto hardware engine structure for hash */
+	hace_dev->crypt_engine_hash = crypto_engine_alloc_init(hace_dev->dev,
+							       true);
+	if (!hace_dev->crypt_engine_hash) {
+		rc = -ENOMEM;
+		goto clk_exit;
+	}
+
+	rc = crypto_engine_start(hace_dev->crypt_engine_hash);
+	if (rc)
+		goto err_engine_hash_start;
+
+	tasklet_init(&hash_engine->done_task, aspeed_hace_hash_done_task,
+		     (unsigned long)hace_dev);
+
+	/* Allocate DMA buffer for hash engine input used */
+	hash_engine->ahash_src_addr =
+		dmam_alloc_coherent(&pdev->dev,
+				    ASPEED_HASH_SRC_DMA_BUF_LEN,
+				    &hash_engine->ahash_src_dma_addr,
+				    GFP_KERNEL);
+	if (!hash_engine->ahash_src_addr) {
+		dev_err(&pdev->dev, "Failed to allocate dma buffer\n");
+		rc = -ENOMEM;
+		goto err_engine_hash_start;
+	}
+
+	aspeed_hace_register(hace_dev);
+
+	dev_info(&pdev->dev, "Aspeed Crypto Accelerator successfully registered\n");
+
+	return 0;
+
+err_engine_hash_start:
+	crypto_engine_exit(hace_dev->crypt_engine_hash);
+clk_exit:
+	clk_disable_unprepare(hace_dev->clk);
+
+	return rc;
+}
+
+static int aspeed_hace_remove(struct platform_device *pdev)
+{
+	struct aspeed_hace_dev *hace_dev = platform_get_drvdata(pdev);
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+
+	aspeed_hace_unregister(hace_dev);
+
+	crypto_engine_exit(hace_dev->crypt_engine_hash);
+
+	tasklet_kill(&hash_engine->done_task);
+
+	clk_disable_unprepare(hace_dev->clk);
+
+	return 0;
+}
+
+MODULE_DEVICE_TABLE(of, aspeed_hace_of_matches);
+
+static struct platform_driver aspeed_hace_driver = {
+	.probe		= aspeed_hace_probe,
+	.remove		= aspeed_hace_remove,
+	.driver         = {
+		.name   = KBUILD_MODNAME,
+		.of_match_table = aspeed_hace_of_matches,
+	},
+};
+
+module_platform_driver(aspeed_hace_driver);
+
+MODULE_AUTHOR("Neal Liu <neal_liu@aspeedtech.com>");
+MODULE_DESCRIPTION("Aspeed HACE driver Crypto Accelerator");
+MODULE_LICENSE("GPL");
diff --git a/drivers/crypto/aspeed/aspeed-hace.h b/drivers/crypto/aspeed/aspeed-hace.h
new file mode 100644
index 000000000000..3494ff22f69d
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace.h
@@ -0,0 +1,186 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+#ifndef __ASPEED_HACE_H__
+#define __ASPEED_HACE_H__
+
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/fips.h>
+#include <linux/dma-mapping.h>
+#include <crypto/scatterwalk.h>
+#include <crypto/internal/aead.h>
+#include <crypto/internal/akcipher.h>
+#include <crypto/internal/hash.h>
+#include <crypto/internal/kpp.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/algapi.h>
+#include <crypto/engine.h>
+#include <crypto/hmac.h>
+#include <crypto/sha1.h>
+#include <crypto/sha2.h>
+
+/*****************************
+ *                           *
+ * HACE register definitions *
+ *                           *
+ *****************************/
+
+#define ASPEED_HACE_STS			0x1C	/* HACE Status Register */
+#define ASPEED_HACE_HASH_SRC		0x20	/* Hash Data Source Base Address Register */
+#define ASPEED_HACE_HASH_DIGEST_BUFF	0x24	/* Hash Digest Write Buffer Base Address Register */
+#define ASPEED_HACE_HASH_KEY_BUFF	0x28	/* Hash HMAC Key Buffer Base Address Register */
+#define ASPEED_HACE_HASH_DATA_LEN	0x2C	/* Hash Data Length Register */
+#define ASPEED_HACE_HASH_CMD		0x30	/* Hash Engine Command Register */
+
+/* interrupt status reg */
+#define  HACE_HASH_ISR			BIT(9)
+#define  HACE_HASH_BUSY			BIT(0)
+
+/* hash cmd reg */
+#define  HASH_CMD_MBUS_REQ_SYNC_EN	BIT(20)
+#define  HASH_CMD_HASH_SRC_SG_CTRL	BIT(18)
+#define  HASH_CMD_SHA512_224		(0x3 << 10)
+#define  HASH_CMD_SHA512_256		(0x2 << 10)
+#define  HASH_CMD_SHA384		(0x1 << 10)
+#define  HASH_CMD_SHA512		(0)
+#define  HASH_CMD_INT_ENABLE		BIT(9)
+#define  HASH_CMD_HMAC			(0x1 << 7)
+#define  HASH_CMD_ACC_MODE		(0x2 << 7)
+#define  HASH_CMD_HMAC_KEY		(0x3 << 7)
+#define  HASH_CMD_SHA1			(0x2 << 4)
+#define  HASH_CMD_SHA224		(0x4 << 4)
+#define  HASH_CMD_SHA256		(0x5 << 4)
+#define  HASH_CMD_SHA512_SER		(0x6 << 4)
+#define  HASH_CMD_SHA_SWAP		(0x2 << 2)
+
+#define HASH_SG_LAST_LIST		BIT(31)
+
+#define CRYPTO_FLAGS_BUSY		BIT(1)
+
+#define SHA_OP_UPDATE			1
+#define SHA_OP_FINAL			2
+
+#define SHA_FLAGS_SHA1			BIT(0)
+#define SHA_FLAGS_SHA224		BIT(1)
+#define SHA_FLAGS_SHA256		BIT(2)
+#define SHA_FLAGS_SHA384		BIT(3)
+#define SHA_FLAGS_SHA512		BIT(4)
+#define SHA_FLAGS_SHA512_224		BIT(5)
+#define SHA_FLAGS_SHA512_256		BIT(6)
+#define SHA_FLAGS_HMAC			BIT(8)
+#define SHA_FLAGS_FINUP			BIT(9)
+#define SHA_FLAGS_MASK			(0xff)
+
+#define ASPEED_CRYPTO_SRC_DMA_BUF_LEN	0xa000
+#define ASPEED_CRYPTO_DST_DMA_BUF_LEN	0xa000
+#define ASPEED_CRYPTO_GCM_TAG_OFFSET	0x9ff0
+#define ASPEED_HASH_SRC_DMA_BUF_LEN	0xa000
+#define ASPEED_HASH_QUEUE_LENGTH	50
+
+struct aspeed_hace_dev;
+
+typedef int (*aspeed_hace_fn_t)(struct aspeed_hace_dev *);
+
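+/*
+ * Hardware scatter-gather descriptor for the AST2600 hash engine:
+ * 'phy_addr' is the DMA address of a data segment and 'len' its length,
+ * with bit 31 (HASH_SG_LAST_LIST) marking the final descriptor. Both
+ * fields are little endian, as consumed by the engine.
+ */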
+struct aspeed_sg_list {
+	__le32 len;
+	__le32 phy_addr;
+};
+
+struct aspeed_engine_hash {
+	struct tasklet_struct		done_task;
+	unsigned long			flags;
+	struct ahash_request		*req;
+
+	/* input buffer */
+	void				*ahash_src_addr;
+	dma_addr_t			ahash_src_dma_addr;
+
+	dma_addr_t			src_dma;
+	dma_addr_t			digest_dma;
+
+	size_t				src_length;
+
+	/* callback func */
+	aspeed_hace_fn_t		resume;
+	aspeed_hace_fn_t		dma_prepare;
+};
+
+struct aspeed_sha_hmac_ctx {
+	struct crypto_shash *shash;
+	u8 ipad[SHA512_BLOCK_SIZE];
+	u8 opad[SHA512_BLOCK_SIZE];
+};
+
+struct aspeed_sham_ctx {
+	struct crypto_engine_ctx	enginectx;
+
+	struct aspeed_hace_dev		*hace_dev;
+	unsigned long			flags;	/* hmac flag */
+
+	struct aspeed_sha_hmac_ctx	base[0];
+};
+
+struct aspeed_sham_reqctx {
+	unsigned long		flags;		/* SHA_FLAGS_* for this request */
+	unsigned long		op;		/* final or update */
+	u32			cmd;		/* trigger cmd */
+
+	/* walk state */
+	struct scatterlist	*src_sg;
+	int			src_nents;
+	unsigned int		offset;		/* offset in current sg */
+	unsigned int		total;		/* per update length */
+
+	size_t			digsize;
+	size_t			block_size;
+	size_t			ivsize;
+	const __be32		*sha_iv;
+
+	/* remain data buffer */
+	u8			buffer[SHA512_BLOCK_SIZE * 2];
+	dma_addr_t		buffer_dma_addr;
+	size_t			bufcnt;		/* buffer counter */
+
+	/* output buffer */
+	u8			digest[SHA512_DIGEST_SIZE] __aligned(64);
+	dma_addr_t		digest_dma_addr;
+	u64			digcnt[2];
+};
+
+struct aspeed_hace_dev {
+	void __iomem			*regs;
+	struct device			*dev;
+	int				irq;
+	struct clk			*clk;
+	unsigned long			version;
+
+	struct crypto_engine		*crypt_engine_hash;
+
+	struct aspeed_engine_hash	hash_engine;
+};
+
+struct aspeed_hace_alg {
+	struct aspeed_hace_dev		*hace_dev;
+
+	const char			*alg_base;
+
+	union {
+		struct skcipher_alg	skcipher;
+		struct ahash_alg	ahash;
+	} alg;
+};
+
+enum aspeed_version {
+	AST2500_VERSION = 5,
+	AST2600_VERSION
+};
+
+#define ast_hace_write(hace, val, offset)	\
+	writel((val), (hace)->regs + (offset))
+#define ast_hace_read(hace, offset)		\
+	readl((hace)->regs + (offset))
+
+void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
+void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
+
+#endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
@ 2022-07-26 11:34   ` Neal Liu
  0 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW

Hash and Crypto Engine (HACE) is designed to accelerate the
throughput of hash data digest, encryption, and decryption.

Basically, HACE can be divided into two independent engines:
the Hash Engine and the Crypto Engine. This patch adds the HACE
hash engine driver for the hash accelerator.

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
---
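For reviewers' context only (illustrative, not part of this patch): once the
driver has registered its algorithms, an in-kernel user reaches the
accelerator through the generic ahash API. A minimal one-shot SHA-256 digest
would look roughly like the sketch below; the helper name example_sha256()
is made up for illustration.

#include <crypto/hash.h>
#include <linux/scatterlist.h>

static int example_sha256(const u8 *data, unsigned int len, u8 *out)
{
	struct crypto_ahash *tfm;
	struct ahash_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int ret;

	/* Selects the highest-priority "sha256" implementation in the system */
	tfm = crypto_alloc_ahash("sha256", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	req = ahash_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		crypto_free_ahash(tfm);
		return -ENOMEM;
	}

	sg_init_one(&sg, data, len);
	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);
	ahash_request_set_crypt(req, &sg, out, len);

	/* Wait for the asynchronous hardware request to complete */
	ret = crypto_wait_req(crypto_ahash_digest(req), &wait);

	ahash_request_free(req);
	crypto_free_ahash(tfm);
	return ret;
}

The same transform can also be driven incrementally with
crypto_ahash_init()/update()/final(), which is one of the paths the run-time
self-tests exercise.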
 MAINTAINERS                              |    7 +
 drivers/crypto/Kconfig                   |    1 +
 drivers/crypto/Makefile                  |    1 +
 drivers/crypto/aspeed/Kconfig            |   32 +
 drivers/crypto/aspeed/Makefile           |    6 +
 drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
 drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
 drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
 8 files changed, 1835 insertions(+)
 create mode 100644 drivers/crypto/aspeed/Kconfig
 create mode 100644 drivers/crypto/aspeed/Makefile
 create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
 create mode 100644 drivers/crypto/aspeed/aspeed-hace.h

diff --git a/MAINTAINERS b/MAINTAINERS
index f55aea311af5..23a0215b7e42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3140,6 +3140,13 @@ S:	Maintained
 F:	Documentation/devicetree/bindings/media/aspeed-video.txt
 F:	drivers/media/platform/aspeed/
 
+ASPEED CRYPTO DRIVER
+M:	Neal Liu <neal_liu@aspeedtech.com>
+L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
+S:	Maintained
+F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
+F:	drivers/crypto/aspeed/
+
 ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
 M:	Corentin Chary <corentin.chary@gmail.com>
 L:	acpi4asus-user@lists.sourceforge.net
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index ee99c02c84e8..b9f5ee126881 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
 	  acceleration for cryptographic algorithms on these devices.
 
 source "drivers/crypto/keembay/Kconfig"
+source "drivers/crypto/aspeed/Kconfig"
 
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index f81703a86b98..116de173a66c 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
+obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
new file mode 100644
index 000000000000..059e627efef8
--- /dev/null
+++ b/drivers/crypto/aspeed/Kconfig
@@ -0,0 +1,32 @@
+config CRYPTO_DEV_ASPEED
+	tristate "Support for Aspeed cryptographic engine driver"
+	depends on ARCH_ASPEED
+	help
+	  Hash and Crypto Engine (HACE) is designed to accelerate the
+	  throughput of hash data digest, encryption and decryption.
+
+	  Select y here to have support for the cryptographic driver
+	  available on Aspeed SoC.
+
+config CRYPTO_DEV_ASPEED_HACE_HASH
+	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
+	depends on CRYPTO_DEV_ASPEED
+	select CRYPTO_ENGINE
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	select CRYPTO_SHA512
+	select CRYPTO_HMAC
+	help
+	  Select here to enable the Aspeed Hash & Crypto Engine (HACE)
+	  hash driver.
+	  It supports multiple message digest standards: SHA-1, SHA-224,
+	  SHA-256, SHA-384 and SHA-512, plus SHA-512/224 and SHA-512/256
+	  on the AST2600.
+
+config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
+	bool "Enable HACE hash debug messages"
+	depends on CRYPTO_DEV_ASPEED_HACE_HASH
+	help
+	  Enable this option to print HACE hash debugging messages.
+	  Avoid enabling it in production builds, as the extra
+	  messages add overhead to the driver.
diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
new file mode 100644
index 000000000000..8bc8d4fed5a9
--- /dev/null
+++ b/drivers/crypto/aspeed/Makefile
@@ -0,0 +1,6 @@
+obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
+aspeed_crypto-objs := aspeed-hace.o \
+		      $(hace-hash-y)
+
+obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
+hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
new file mode 100644
index 000000000000..63a8ad694996
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
@@ -0,0 +1,1389 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 Aspeed Technology Inc.
+ */
+
+#include "aspeed-hace.h"
+
+#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
+#define AHASH_DBG(h, fmt, ...)	\
+	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#else
+#define AHASH_DBG(h, fmt, ...)	\
+	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#endif
+
+/* Initialization Vectors for SHA-family */
+static const __be32 sha1_iv[8] = {
+	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
+	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
+	cpu_to_be32(SHA1_H4), 0, 0, 0
+};
+
+static const __be32 sha224_iv[8] = {
+	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
+	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
+	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
+	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
+};
+
+static const __be32 sha256_iv[8] = {
+	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
+	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
+	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
+	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
+};
+
+static const __be64 sha384_iv[8] = {
+	cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
+	cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
+	cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
+	cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
+};
+
+static const __be64 sha512_iv[8] = {
+	cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
+	cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
+	cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
+	cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
+};
+
+static const __be32 sha512_224_iv[16] = {
+	cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
+	cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
+	cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
+	cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
+	cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
+	cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
+	cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
+	cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
+};
+
+static const __be32 sha512_256_iv[16] = {
+	cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
+	cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
+	cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
+	cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
+	cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
+	cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
+	cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
+	cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
+};
+
+/* The purpose of this padding is to ensure that the padded message is a
+ * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
+ * The bit "1" is appended at the end of the message followed by
+ * "padlen-1" zero bits. Then a 64 bits block (SHA1/SHA224/SHA256) or
+ * 128 bits block (SHA384/SHA512) equals to the message length in bits
+ * is appended.
+ *
+ * For SHA1/SHA224/SHA256, padlen is calculated as follows:
+ *  - if message length < 56 bytes then padlen = 56 - message length
+ *  - else padlen = 64 + 56 - message length
+ *
+ * For SHA384/SHA512, padlen is calculated as follows:
+ *  - if message length < 112 bytes then padlen = 112 - message length
+ *  - else padlen = 128 + 112 - message length
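+ *
+ * For example, a 20-byte SHA-256 message gives index = 20, so
+ * padlen = 56 - 20 = 36 and the padded buffer holds
+ * 20 + 36 + 8 = 64 bytes, i.e. exactly one SHA-256 block.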
+ */
+static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
+				      struct aspeed_sham_reqctx *rctx)
+{
+	unsigned int index, padlen;
+	__be64 bits[2];
+
+	AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
+
+	switch (rctx->flags & SHA_FLAGS_MASK) {
+	case SHA_FLAGS_SHA1:
+	case SHA_FLAGS_SHA224:
+	case SHA_FLAGS_SHA256:
+		bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
+		index = rctx->bufcnt & 0x3f;
+		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
+		*(rctx->buffer + rctx->bufcnt) = 0x80;
+		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
+		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
+		rctx->bufcnt += padlen + 8;
+		break;
+	default:
+		bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
+		bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
+				      rctx->digcnt[0] >> 61);
+		index = rctx->bufcnt & 0x7f;
+		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
+		*(rctx->buffer + rctx->bufcnt) = 0x80;
+		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
+		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
+		rctx->bufcnt += padlen + 16;
+		break;
+	}
+}
+
+/*
+ * Prepare DMA buffer before hardware engine
+ * processing.
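+ *
+ * Bytes buffered from the previous update (rctx->buffer) are copied into
+ * the coherent bounce buffer first, followed by the block-aligned part
+ * of the new data; the unaligned tail is stashed back into rctx->buffer
+ * for the next pass.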
+ */
+static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	int length, remain;
+
+	length = rctx->total + rctx->bufcnt;
+	remain = length % rctx->block_size;
+
+	AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
+
+	if (rctx->bufcnt)
+		memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
+
+	if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
+		scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
+					 rctx->bufcnt, rctx->src_sg,
+					 rctx->offset, rctx->total - remain, 0);
+		rctx->offset += rctx->total - remain;
+
+	} else {
+		dev_warn(hace_dev->dev, "Hash data length is too large\n");
+		return -EINVAL;
+	}
+
+	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
+				 rctx->offset, remain, 0);
+
+	rctx->bufcnt = remain;
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		return -ENOMEM;
+	}
+
+	hash_engine->src_length = length - remain;
+	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	return 0;
+}
+
+/*
+ * Prepare DMA buffer as SG list buffer before
+ * hardware engine processing.
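+ *
+ * On the AST2600 the engine walks a list of aspeed_sg_list descriptors
+ * instead of one flat buffer: descriptor 0 optionally points at the
+ * buffered remainder, the following entries map the request scatterlist
+ * directly, and the last entry carries HASH_SG_LAST_LIST.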
+ */
+static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct aspeed_sg_list *src_list;
+	struct scatterlist *s;
+	int length, remain, sg_len, i;
+	int rc = 0;
+
+	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
+	length = rctx->total + rctx->bufcnt - remain;
+
+	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
+		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
+		  "length", length, "remain", remain);
+
+	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
+			    DMA_TO_DEVICE);
+	if (!sg_len) {
+		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
+		rc = -ENOMEM;
+		goto end;
+	}
+
+	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		rc = -ENOMEM;
+		goto free_src_sg;
+	}
+
+	if (rctx->bufcnt != 0) {
+		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
+						       rctx->buffer,
+						       rctx->block_size * 2,
+						       DMA_TO_DEVICE);
+		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
+			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
+			rc = -ENOMEM;
+			goto free_rctx_digest;
+		}
+
+		src_list[0].phy_addr = rctx->buffer_dma_addr;
+		src_list[0].len = rctx->bufcnt;
+		length -= src_list[0].len;
+
+		/* Last sg list */
+		if (length == 0)
+			src_list[0].len |= HASH_SG_LAST_LIST;
+
+		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
+		src_list[0].len = cpu_to_le32(src_list[0].len);
+		src_list++;
+	}
+
+	if (length != 0) {
+		for_each_sg(rctx->src_sg, s, sg_len, i) {
+			src_list[i].phy_addr = sg_dma_address(s);
+
+			if (length > sg_dma_len(s)) {
+				src_list[i].len = sg_dma_len(s);
+				length -= sg_dma_len(s);
+
+			} else {
+				/* Last sg list */
+				src_list[i].len = length;
+				src_list[i].len |= HASH_SG_LAST_LIST;
+				length = 0;
+			}
+
+			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
+			src_list[i].len = cpu_to_le32(src_list[i].len);
+		}
+	}
+
+	if (length != 0) {
+		rc = -EINVAL;
+		goto free_rctx_buffer;
+	}
+
+	rctx->offset = rctx->total - remain;
+	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
+	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	goto end;
+
+free_rctx_buffer:
+	if (rctx->bufcnt != 0)
+		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+				 rctx->block_size * 2, DMA_TO_DEVICE);
+free_rctx_digest:
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+free_src_sg:
+	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
+		     DMA_TO_DEVICE);
+end:
+	return rc;
+}
+
+static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
+
+	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
+
+	return 0;
+}
+
+/*
+ * Copy digest to the corresponding request result.
+ * This function will be called at final() stage.
+ */
+static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+			 rctx->block_size * 2, DMA_TO_DEVICE);
+
+	memcpy(req->result, rctx->digest, rctx->digsize);
+
+	return aspeed_ahash_complete(hace_dev);
+}
+
+/*
+ * Trigger hardware engines to do the math.
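+ *
+ * Programs the source, digest and key buffer addresses and the data
+ * length, then kicks the engine by writing the command register; the
+ * interrupt handler schedules the done tasklet, which calls 'resume'
+ * once this hash round completes.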
+ */
+static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
+				     aspeed_hace_fn_t resume)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
+		  hash_engine->src_dma, hash_engine->digest_dma,
+		  hash_engine->src_length);
+
+	rctx->cmd |= HASH_CMD_INT_ENABLE;
+	hash_engine->resume = resume;
+
+	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
+	ast_hace_write(hace_dev, hash_engine->digest_dma,
+		       ASPEED_HACE_HASH_DIGEST_BUFF);
+	ast_hace_write(hace_dev, hash_engine->digest_dma,
+		       ASPEED_HACE_HASH_KEY_BUFF);
+	ast_hace_write(hace_dev, hash_engine->src_length,
+		       ASPEED_HACE_HASH_DATA_LEN);
+
+	/* Memory barrier to ensure all data setup before engine starts */
+	mb();
+
+	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
+
+	return -EINPROGRESS;
+}
+
+/*
+ * HMAC resume performs the second pass that produces
+ * the final HMAC value from the inner hash result and
+ * the outer key pad.
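+ *
+ * That is, HMAC(K, m) = H((K ^ opad) || H((K ^ ipad) || m)); the inner
+ * digest H((K ^ ipad) || m) is already in rctx->digest at this point.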
+ */
+static int aspeed_ahash_hmac_resume(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+	int rc = 0;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+			 rctx->block_size * 2, DMA_TO_DEVICE);
+
+	/* o key pad + hash sum 1 */
+	memcpy(rctx->buffer, bctx->opad, rctx->block_size);
+	memcpy(rctx->buffer + rctx->block_size, rctx->digest, rctx->digsize);
+
+	rctx->bufcnt = rctx->block_size + rctx->digsize;
+	rctx->digcnt[0] = rctx->block_size + rctx->digsize;
+
+	aspeed_ahash_fill_padding(hace_dev, rctx);
+	memcpy(rctx->digest, rctx->sha_iv, rctx->ivsize);
+
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		rc = -ENOMEM;
+		goto end;
+	}
+
+	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev, rctx->buffer,
+					       rctx->block_size * 2,
+					       DMA_TO_DEVICE);
+	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
+		rc = -ENOMEM;
+		goto free_rctx_digest;
+	}
+
+	hash_engine->src_dma = rctx->buffer_dma_addr;
+	hash_engine->src_length = rctx->bufcnt;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
+
+free_rctx_digest:
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+end:
+	return rc;
+}
+
+static int aspeed_ahash_req_final(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	int rc = 0;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	aspeed_ahash_fill_padding(hace_dev, rctx);
+
+	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
+					       rctx->digest,
+					       SHA512_DIGEST_SIZE,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
+		rc = -ENOMEM;
+		goto end;
+	}
+
+	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
+					       rctx->buffer,
+					       rctx->block_size * 2,
+					       DMA_TO_DEVICE);
+	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
+		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
+		rc = -ENOMEM;
+		goto free_rctx_digest;
+	}
+
+	hash_engine->src_dma = rctx->buffer_dma_addr;
+	hash_engine->src_length = rctx->bufcnt;
+	hash_engine->digest_dma = rctx->digest_dma_addr;
+
+	if (rctx->flags & SHA_FLAGS_HMAC)
+		return aspeed_hace_ahash_trigger(hace_dev,
+						 aspeed_ahash_hmac_resume);
+
+	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
+
+free_rctx_digest:
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+end:
+	return rc;
+}
+
+static int aspeed_ahash_update_resume_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
+		     DMA_TO_DEVICE);
+
+	if (rctx->bufcnt != 0)
+		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
+				 rctx->block_size * 2,
+				 DMA_TO_DEVICE);
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg, rctx->offset,
+				 rctx->total - rctx->offset, 0);
+
+	rctx->bufcnt = rctx->total - rctx->offset;
+	rctx->cmd &= ~HASH_CMD_HASH_SRC_SG_CTRL;
+
+	if (rctx->flags & SHA_FLAGS_FINUP)
+		return aspeed_ahash_req_final(hace_dev);
+
+	return aspeed_ahash_complete(hace_dev);
+}
+
+static int aspeed_ahash_update_resume(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	AHASH_DBG(hace_dev, "\n");
+
+	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
+			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
+
+	if (rctx->flags & SHA_FLAGS_FINUP)
+		return aspeed_ahash_req_final(hace_dev);
+
+	return aspeed_ahash_complete(hace_dev);
+}
+
+static int aspeed_ahash_req_update(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	struct ahash_request *req = hash_engine->req;
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	aspeed_hace_fn_t resume;
+	int ret;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	if (hace_dev->version == AST2600_VERSION) {
+		rctx->cmd |= HASH_CMD_HASH_SRC_SG_CTRL;
+		resume = aspeed_ahash_update_resume_sg;
+
+	} else {
+		resume = aspeed_ahash_update_resume;
+	}
+
+	ret = hash_engine->dma_prepare(hace_dev);
+	if (ret)
+		return ret;
+
+	return aspeed_hace_ahash_trigger(hace_dev, resume);
+}
+
+static int aspeed_hace_hash_handle_queue(struct aspeed_hace_dev *hace_dev,
+				  struct ahash_request *req)
+{
+	return crypto_transfer_hash_request_to_engine(
+			hace_dev->crypt_engine_hash, req);
+}
+
+static int aspeed_ahash_do_request(struct crypto_engine *engine, void *areq)
+{
+	struct ahash_request *req = ahash_request_cast(areq);
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_engine_hash *hash_engine;
+	int ret = 0;
+
+	hash_engine = &hace_dev->hash_engine;
+	hash_engine->flags |= CRYPTO_FLAGS_BUSY;
+
+	if (rctx->op == SHA_OP_UPDATE)
+		ret = aspeed_ahash_req_update(hace_dev);
+	else if (rctx->op == SHA_OP_FINAL)
+		ret = aspeed_ahash_req_final(hace_dev);
+
+	if (ret != -EINPROGRESS)
+		return ret;
+
+	return 0;
+}
+
+static int aspeed_ahash_prepare_request(struct crypto_engine *engine,
+					void *areq)
+{
+	struct ahash_request *req = ahash_request_cast(areq);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_engine_hash *hash_engine;
+
+	hash_engine = &hace_dev->hash_engine;
+	hash_engine->req = req;
+
+	if (hace_dev->version == AST2600_VERSION)
+		hash_engine->dma_prepare = aspeed_ahash_dma_prepare_sg;
+	else
+		hash_engine->dma_prepare = aspeed_ahash_dma_prepare;
+
+	return 0;
+}
+
+static int aspeed_sham_update(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+
+	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
+
+	rctx->total = req->nbytes;
+	rctx->src_sg = req->src;
+	rctx->offset = 0;
+	rctx->src_nents = sg_nents(req->src);
+	rctx->op = SHA_OP_UPDATE;
+
+	rctx->digcnt[0] += rctx->total;
+	if (rctx->digcnt[0] < rctx->total)
+		rctx->digcnt[1]++;
+
+	if (rctx->bufcnt + rctx->total < rctx->block_size) {
+		scatterwalk_map_and_copy(rctx->buffer + rctx->bufcnt,
+					 rctx->src_sg, rctx->offset,
+					 rctx->total, 0);
+		rctx->bufcnt += rctx->total;
+
+		return 0;
+	}
+
+	return aspeed_hace_hash_handle_queue(hace_dev, req);
+}
+
+static int aspeed_sham_shash_digest(struct crypto_shash *tfm, u32 flags,
+				    const u8 *data, unsigned int len, u8 *out)
+{
+	SHASH_DESC_ON_STACK(shash, tfm);
+
+	shash->tfm = tfm;
+
+	return crypto_shash_digest(shash, data, len, out);
+}
+
+static int aspeed_sham_final(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+
+	AHASH_DBG(hace_dev, "req->nbytes:%d, rctx->total:%d\n",
+		  req->nbytes, rctx->total);
+	rctx->op = SHA_OP_FINAL;
+
+	return aspeed_hace_hash_handle_queue(hace_dev, req);
+}
+
+static int aspeed_sham_finup(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	int rc1, rc2;
+
+	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
+
+	rctx->flags |= SHA_FLAGS_FINUP;
+
+	rc1 = aspeed_sham_update(req);
+	if (rc1 == -EINPROGRESS || rc1 == -EBUSY)
+		return rc1;
+
+	/*
+	 * final() always has to be called to clean up resources,
+	 * even if update() failed, except when it returned -EINPROGRESS
+	 */
+	rc2 = aspeed_sham_final(req);
+
+	return rc1 ? : rc2;
+}
+
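+/*
+ * Set up the per-request context: select the command bits, block size
+ * and initial digest value from the requested digest size, and preload
+ * the ipad block when the transform was created as an HMAC variant.
+ */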
+static int aspeed_sham_init(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+	AHASH_DBG(hace_dev, "%s: digest size:%d\n",
+		  crypto_tfm_alg_name(&tfm->base),
+		  crypto_ahash_digestsize(tfm));
+
+	rctx->cmd = HASH_CMD_ACC_MODE;
+	rctx->flags = 0;
+
+	switch (crypto_ahash_digestsize(tfm)) {
+	case SHA1_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA1 | HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA1;
+		rctx->digsize = SHA1_DIGEST_SIZE;
+		rctx->block_size = SHA1_BLOCK_SIZE;
+		rctx->sha_iv = sha1_iv;
+		rctx->ivsize = 32;
+		memcpy(rctx->digest, sha1_iv, rctx->ivsize);
+		break;
+	case SHA224_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA224 | HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA224;
+		rctx->digsize = SHA224_DIGEST_SIZE;
+		rctx->block_size = SHA224_BLOCK_SIZE;
+		rctx->sha_iv = sha224_iv;
+		rctx->ivsize = 32;
+		memcpy(rctx->digest, sha224_iv, rctx->ivsize);
+		break;
+	case SHA256_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA256 | HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA256;
+		rctx->digsize = SHA256_DIGEST_SIZE;
+		rctx->block_size = SHA256_BLOCK_SIZE;
+		rctx->sha_iv = sha256_iv;
+		rctx->ivsize = 32;
+		memcpy(rctx->digest, sha256_iv, rctx->ivsize);
+		break;
+	case SHA384_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA384 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA384;
+		rctx->digsize = SHA384_DIGEST_SIZE;
+		rctx->block_size = SHA384_BLOCK_SIZE;
+		rctx->sha_iv = (const __be32 *)sha384_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha384_iv, rctx->ivsize);
+		break;
+	case SHA512_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA512;
+		rctx->digsize = SHA512_DIGEST_SIZE;
+		rctx->block_size = SHA512_BLOCK_SIZE;
+		rctx->sha_iv = (const __be32 *)sha512_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha512_iv, rctx->ivsize);
+		break;
+	default:
+		dev_warn(tctx->hace_dev->dev, "digest size %d not support\n",
+			 crypto_ahash_digestsize(tfm));
+		return -EINVAL;
+	}
+
+	rctx->bufcnt = 0;
+	rctx->total = 0;
+	rctx->digcnt[0] = 0;
+	rctx->digcnt[1] = 0;
+
+	/* HMAC init */
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		rctx->digcnt[0] = rctx->block_size;
+		rctx->bufcnt = rctx->block_size;
+		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
+		rctx->flags |= SHA_FLAGS_HMAC;
+	}
+
+	return 0;
+}
+
+static int aspeed_sha512s_init(struct ahash_request *req)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+	AHASH_DBG(hace_dev, "digest size: %d\n", crypto_ahash_digestsize(tfm));
+
+	rctx->cmd = HASH_CMD_ACC_MODE;
+	rctx->flags = 0;
+
+	switch (crypto_ahash_digestsize(tfm)) {
+	case SHA224_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_224 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA512_224;
+		rctx->digsize = SHA224_DIGEST_SIZE;
+		rctx->block_size = SHA512_BLOCK_SIZE;
+		rctx->sha_iv = sha512_224_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha512_224_iv, rctx->ivsize);
+		break;
+	case SHA256_DIGEST_SIZE:
+		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_256 |
+			     HASH_CMD_SHA_SWAP;
+		rctx->flags |= SHA_FLAGS_SHA512_256;
+		rctx->digsize = SHA256_DIGEST_SIZE;
+		rctx->block_size = SHA512_BLOCK_SIZE;
+		rctx->sha_iv = sha512_256_iv;
+		rctx->ivsize = 64;
+		memcpy(rctx->digest, sha512_256_iv, rctx->ivsize);
+		break;
+	default:
+		dev_warn(tctx->hace_dev->dev, "digest size %d not support\n",
+			 crypto_ahash_digestsize(tfm));
+		return -EINVAL;
+	}
+
+	rctx->bufcnt = 0;
+	rctx->total = 0;
+	rctx->digcnt[0] = 0;
+	rctx->digcnt[1] = 0;
+
+	/* HMAC init */
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		rctx->digcnt[0] = rctx->block_size;
+		rctx->bufcnt = rctx->block_size;
+		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
+		rctx->flags |= SHA_FLAGS_HMAC;
+	}
+
+	return 0;
+}
+
+static int aspeed_sham_digest(struct ahash_request *req)
+{
+	return aspeed_sham_init(req) ? : aspeed_sham_finup(req);
+}
+
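+/*
+ * Standard HMAC key preprocessing: a key longer than the block size is
+ * first hashed down with the base algorithm, shorter keys are zero
+ * padded, and the result is XORed with the ipad/opad constants for use
+ * by init() and the second-pass resume handler.
+ */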
+static int aspeed_sham_setkey(struct crypto_ahash *tfm, const u8 *key,
+			      unsigned int keylen)
+{
+	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+	int ds = crypto_shash_digestsize(bctx->shash);
+	int bs = crypto_shash_blocksize(bctx->shash);
+	int err = 0;
+	int i;
+
+	AHASH_DBG(hace_dev, "%s: keylen:%d\n", crypto_tfm_alg_name(&tfm->base),
+		  keylen);
+
+	if (keylen > bs) {
+		err = aspeed_sham_shash_digest(bctx->shash,
+					       crypto_shash_get_flags(bctx->shash),
+					       key, keylen, bctx->ipad);
+		if (err)
+			return err;
+		keylen = ds;
+
+	} else {
+		memcpy(bctx->ipad, key, keylen);
+	}
+
+	memset(bctx->ipad + keylen, 0, bs - keylen);
+	memcpy(bctx->opad, bctx->ipad, bs);
+
+	for (i = 0; i < bs; i++) {
+		bctx->ipad[i] ^= HMAC_IPAD_VALUE;
+		bctx->opad[i] ^= HMAC_OPAD_VALUE;
+	}
+
+	return err;
+}
+
+static int aspeed_sham_cra_init(struct crypto_tfm *tfm)
+{
+	struct ahash_alg *alg = __crypto_ahash_alg(tfm->__crt_alg);
+	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
+	struct aspeed_hace_alg *ast_alg;
+
+	ast_alg = container_of(alg, struct aspeed_hace_alg, alg.ahash);
+	tctx->hace_dev = ast_alg->hace_dev;
+	tctx->flags = 0;
+
+	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
+				 sizeof(struct aspeed_sham_reqctx));
+
+	if (ast_alg->alg_base) {
+		/* hmac related */
+		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+		tctx->flags |= SHA_FLAGS_HMAC;
+		bctx->shash = crypto_alloc_shash(ast_alg->alg_base, 0,
+						 CRYPTO_ALG_NEED_FALLBACK);
+		if (IS_ERR(bctx->shash)) {
+			dev_warn(ast_alg->hace_dev->dev,
+				 "base driver '%s' could not be loaded.\n",
+				 ast_alg->alg_base);
+			return PTR_ERR(bctx->shash);
+		}
+	}
+
+	tctx->enginectx.op.do_one_request = aspeed_ahash_do_request;
+	tctx->enginectx.op.prepare_request = aspeed_ahash_prepare_request;
+	tctx->enginectx.op.unprepare_request = NULL;
+
+	return 0;
+}
+
+static void aspeed_sham_cra_exit(struct crypto_tfm *tfm)
+{
+	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
+
+	AHASH_DBG(hace_dev, "%s\n", crypto_tfm_alg_name(tfm));
+
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
+
+		crypto_free_shash(bctx->shash);
+	}
+}
+
+static int aspeed_sham_export(struct ahash_request *req, void *out)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	memcpy(out, rctx, sizeof(*rctx));
+
+	return 0;
+}
+
+static int aspeed_sham_import(struct ahash_request *req, const void *in)
+{
+	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	memcpy(rctx, in, sizeof(*rctx));
+
+	return 0;
+}
+
+struct aspeed_hace_alg aspeed_ahash_algs[] = {
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA1_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha1",
+					.cra_driver_name	= "aspeed-sha1",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA1_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha256",
+					.cra_driver_name	= "aspeed-sha256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA256_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha224",
+					.cra_driver_name	= "aspeed-sha224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA224_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha1",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA1_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha1)",
+					.cra_driver_name	= "aspeed-hmac-sha1",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA1_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha224",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha224)",
+					.cra_driver_name	= "aspeed-hmac-sha224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA224_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha256",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha256)",
+					.cra_driver_name	= "aspeed-hmac-sha256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA256_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+};
+
+struct aspeed_hace_alg aspeed_ahash_algs_g6[] = {
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA384_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha384",
+					.cra_driver_name	= "aspeed-sha384",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA384_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA512_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha512",
+					.cra_driver_name	= "aspeed-sha512",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha512_224",
+					.cra_driver_name	= "aspeed-sha512_224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "sha512_256",
+					.cra_driver_name	= "aspeed-sha512_256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha384",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA384_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha384)",
+					.cra_driver_name	= "aspeed-hmac-sha384",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA384_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha512",
+		.alg.ahash = {
+			.init	= aspeed_sham_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA512_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha512)",
+					.cra_driver_name	= "aspeed-hmac-sha512",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha512_224",
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA224_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha512_224)",
+					.cra_driver_name	= "aspeed-hmac-sha512_224",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+	{
+		.alg_base = "sha512_256",
+		.alg.ahash = {
+			.init	= aspeed_sha512s_init,
+			.update	= aspeed_sham_update,
+			.final	= aspeed_sham_final,
+			.finup	= aspeed_sham_finup,
+			.digest	= aspeed_sham_digest,
+			.setkey	= aspeed_sham_setkey,
+			.export	= aspeed_sham_export,
+			.import	= aspeed_sham_import,
+			.halg = {
+				.digestsize = SHA256_DIGEST_SIZE,
+				.statesize = sizeof(struct aspeed_sham_reqctx),
+				.base = {
+					.cra_name		= "hmac(sha512_256)",
+					.cra_driver_name	= "aspeed-hmac-sha512_256",
+					.cra_priority		= 300,
+					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
+								  CRYPTO_ALG_ASYNC |
+								  CRYPTO_ALG_KERN_DRIVER_ONLY,
+					.cra_blocksize		= SHA512_BLOCK_SIZE,
+					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
+								sizeof(struct aspeed_sha_hmac_ctx),
+					.cra_alignmask		= 0,
+					.cra_module		= THIS_MODULE,
+					.cra_init		= aspeed_sham_cra_init,
+					.cra_exit		= aspeed_sham_cra_exit,
+				}
+			}
+		},
+	},
+};
+
+void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++)
+		crypto_unregister_ahash(&aspeed_ahash_algs[i].alg.ahash);
+
+	if (hace_dev->version != AST2600_VERSION)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++)
+		crypto_unregister_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
+}
+
+void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	int rc, i;
+
+	AHASH_DBG(hace_dev, "\n");
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++) {
+		aspeed_ahash_algs[i].hace_dev = hace_dev;
+		rc = crypto_register_ahash(&aspeed_ahash_algs[i].alg.ahash);
+		if (rc) {
+			AHASH_DBG(hace_dev, "Failed to register %s\n",
+				  aspeed_ahash_algs[i].alg.ahash.halg.base.cra_name);
+		}
+	}
+
+	if (hace_dev->version != AST2600_VERSION)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++) {
+		aspeed_ahash_algs_g6[i].hace_dev = hace_dev;
+		rc = crypto_register_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
+		if (rc) {
+			AHASH_DBG(hace_dev, "Failed to register %s\n",
+				  aspeed_ahash_algs_g6[i].alg.ahash.halg.base.cra_name);
+		}
+	}
+}
diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
new file mode 100644
index 000000000000..89b1585d72e2
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace.c
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 Aspeed Technology Inc.
+ */
+
+#include <linux/clk.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_irq.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+
+#include "aspeed-hace.h"
+
+#ifdef ASPEED_HACE_DEBUG
+#define HACE_DBG(d, fmt, ...)	\
+	dev_info((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#else
+#define HACE_DBG(d, fmt, ...)	\
+	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#endif
+
+/* Weak function for HACE hash */
+void __weak aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
+void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
+/* HACE interrupt service routine */
+static irqreturn_t aspeed_hace_irq(int irq, void *dev)
+{
+	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+	u32 sts;
+
+	sts = ast_hace_read(hace_dev, ASPEED_HACE_STS);
+	ast_hace_write(hace_dev, sts, ASPEED_HACE_STS);
+
+	HACE_DBG(hace_dev, "irq status: 0x%x\n", sts);
+
+	if (sts & HACE_HASH_ISR) {
+		if (hash_engine->flags & CRYPTO_FLAGS_BUSY)
+			tasklet_schedule(&hash_engine->done_task);
+		else
+			dev_warn(hace_dev->dev, "HASH no active requests.\n");
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void aspeed_hace_hash_done_task(unsigned long data)
+{
+	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+
+	hash_engine->resume(hace_dev);
+}
+
+static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
+{
+	aspeed_register_hace_hash_algs(hace_dev);
+}
+
+static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
+{
+	aspeed_unregister_hace_hash_algs(hace_dev);
+}
+
+static const struct of_device_id aspeed_hace_of_matches[] = {
+	{ .compatible = "aspeed,ast2500-hace", .data = (void *)5, },
+	{ .compatible = "aspeed,ast2600-hace", .data = (void *)6, },
+	{},
+};
+
+static int aspeed_hace_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *hace_dev_id;
+	struct aspeed_engine_hash *hash_engine;
+	struct aspeed_hace_dev *hace_dev;
+	struct resource *res;
+	int rc;
+
+	hace_dev = devm_kzalloc(&pdev->dev, sizeof(struct aspeed_hace_dev),
+				GFP_KERNEL);
+	if (!hace_dev)
+		return -ENOMEM;
+
+	hace_dev_id = of_match_device(aspeed_hace_of_matches, &pdev->dev);
+	if (!hace_dev_id) {
+		dev_err(&pdev->dev, "Failed to match hace dev id\n");
+		return -EINVAL;
+	}
+
+	hace_dev->dev = &pdev->dev;
+	hace_dev->version = (unsigned long)hace_dev_id->data;
+	hash_engine = &hace_dev->hash_engine;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+
+	platform_set_drvdata(pdev, hace_dev);
+
+	hace_dev->regs = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(hace_dev->regs)) {
+		dev_err(&pdev->dev, "Failed to map resources\n");
+		return PTR_ERR(hace_dev->regs);
+	}
+
+	/* Get irq number and register it */
+	hace_dev->irq = platform_get_irq(pdev, 0);
+	if (hace_dev->irq < 0) {
+		dev_err(&pdev->dev, "Failed to get interrupt\n");
+		return hace_dev->irq;
+	}
+
+	rc = devm_request_irq(&pdev->dev, hace_dev->irq, aspeed_hace_irq, 0,
+			      dev_name(&pdev->dev), hace_dev);
+	if (rc) {
+		dev_err(&pdev->dev, "Failed to request interrupt\n");
+		return rc;
+	}
+
+	/* Get clk and enable it */
+	hace_dev->clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(hace_dev->clk)) {
+		dev_err(&pdev->dev, "Failed to get clk\n");
+		return PTR_ERR(hace_dev->clk);
+	}
+
+	rc = clk_prepare_enable(hace_dev->clk);
+	if (rc) {
+		dev_err(&pdev->dev, "Failed to enable clock 0x%x\n", rc);
+		return rc;
+	}
+
+	/* Initialize crypto hardware engine structure for hash */
+	hace_dev->crypt_engine_hash = crypto_engine_alloc_init(hace_dev->dev,
+							       true);
+	if (!hace_dev->crypt_engine_hash) {
+		rc = -ENOMEM;
+		goto clk_exit;
+	}
+
+	rc = crypto_engine_start(hace_dev->crypt_engine_hash);
+	if (rc)
+		goto err_engine_hash_start;
+
+	tasklet_init(&hash_engine->done_task, aspeed_hace_hash_done_task,
+		     (unsigned long)hace_dev);
+
+	/* Allocate DMA buffer for hash engine input used */
+	hash_engine->ahash_src_addr =
+		dmam_alloc_coherent(&pdev->dev,
+				    ASPEED_HASH_SRC_DMA_BUF_LEN,
+				    &hash_engine->ahash_src_dma_addr,
+				    GFP_KERNEL);
+	if (!hash_engine->ahash_src_addr) {
+		dev_err(&pdev->dev, "Failed to allocate dma buffer\n");
+		rc = -ENOMEM;
+		goto err_engine_hash_start;
+	}
+
+	aspeed_hace_register(hace_dev);
+
+	dev_info(&pdev->dev, "Aspeed Crypto Accelerator successfully registered\n");
+
+	return 0;
+
+err_engine_hash_start:
+	crypto_engine_exit(hace_dev->crypt_engine_hash);
+clk_exit:
+	clk_disable_unprepare(hace_dev->clk);
+
+	return rc;
+}
+
+static int aspeed_hace_remove(struct platform_device *pdev)
+{
+	struct aspeed_hace_dev *hace_dev = platform_get_drvdata(pdev);
+	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
+
+	aspeed_hace_unregister(hace_dev);
+
+	crypto_engine_exit(hace_dev->crypt_engine_hash);
+
+	tasklet_kill(&hash_engine->done_task);
+
+	clk_disable_unprepare(hace_dev->clk);
+
+	return 0;
+}
+
+MODULE_DEVICE_TABLE(of, aspeed_hace_of_matches);
+
+static struct platform_driver aspeed_hace_driver = {
+	.probe		= aspeed_hace_probe,
+	.remove		= aspeed_hace_remove,
+	.driver         = {
+		.name   = KBUILD_MODNAME,
+		.of_match_table = aspeed_hace_of_matches,
+	},
+};
+
+module_platform_driver(aspeed_hace_driver);
+
+MODULE_AUTHOR("Neal Liu <neal_liu@aspeedtech.com>");
+MODULE_DESCRIPTION("Aspeed HACE driver Crypto Accelerator");
+MODULE_LICENSE("GPL");
diff --git a/drivers/crypto/aspeed/aspeed-hace.h b/drivers/crypto/aspeed/aspeed-hace.h
new file mode 100644
index 000000000000..3494ff22f69d
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace.h
@@ -0,0 +1,186 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+#ifndef __ASPEED_HACE_H__
+#define __ASPEED_HACE_H__
+
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/fips.h>
+#include <linux/dma-mapping.h>
+#include <crypto/scatterwalk.h>
+#include <crypto/internal/aead.h>
+#include <crypto/internal/akcipher.h>
+#include <crypto/internal/hash.h>
+#include <crypto/internal/kpp.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/algapi.h>
+#include <crypto/engine.h>
+#include <crypto/hmac.h>
+#include <crypto/sha1.h>
+#include <crypto/sha2.h>
+
+/*****************************
+ *                           *
+ * HACE register definitions *
+ *                           *
+ *****************************/
+
+#define ASPEED_HACE_STS			0x1C	/* HACE Status Register */
+#define ASPEED_HACE_HASH_SRC		0x20	/* Hash Data Source Base Address Register */
+#define ASPEED_HACE_HASH_DIGEST_BUFF	0x24	/* Hash Digest Write Buffer Base Address Register */
+#define ASPEED_HACE_HASH_KEY_BUFF	0x28	/* Hash HMAC Key Buffer Base Address Register */
+#define ASPEED_HACE_HASH_DATA_LEN	0x2C	/* Hash Data Length Register */
+#define ASPEED_HACE_HASH_CMD		0x30	/* Hash Engine Command Register */
+
+/* interrupt status reg */
+#define  HACE_HASH_ISR			BIT(9)
+#define  HACE_HASH_BUSY			BIT(0)
+
+/* hash cmd reg */
+#define  HASH_CMD_MBUS_REQ_SYNC_EN	BIT(20)
+#define  HASH_CMD_HASH_SRC_SG_CTRL	BIT(18)
+#define  HASH_CMD_SHA512_224		(0x3 << 10)
+#define  HASH_CMD_SHA512_256		(0x2 << 10)
+#define  HASH_CMD_SHA384		(0x1 << 10)
+#define  HASH_CMD_SHA512		(0)
+#define  HASH_CMD_INT_ENABLE		BIT(9)
+#define  HASH_CMD_HMAC			(0x1 << 7)
+#define  HASH_CMD_ACC_MODE		(0x2 << 7)
+#define  HASH_CMD_HMAC_KEY		(0x3 << 7)
+#define  HASH_CMD_SHA1			(0x2 << 4)
+#define  HASH_CMD_SHA224		(0x4 << 4)
+#define  HASH_CMD_SHA256		(0x5 << 4)
+#define  HASH_CMD_SHA512_SER		(0x6 << 4)
+#define  HASH_CMD_SHA_SWAP		(0x2 << 2)
+
+#define HASH_SG_LAST_LIST		BIT(31)
+
+#define CRYPTO_FLAGS_BUSY		BIT(1)
+
+#define SHA_OP_UPDATE			1
+#define SHA_OP_FINAL			2
+
+#define SHA_FLAGS_SHA1			BIT(0)
+#define SHA_FLAGS_SHA224		BIT(1)
+#define SHA_FLAGS_SHA256		BIT(2)
+#define SHA_FLAGS_SHA384		BIT(3)
+#define SHA_FLAGS_SHA512		BIT(4)
+#define SHA_FLAGS_SHA512_224		BIT(5)
+#define SHA_FLAGS_SHA512_256		BIT(6)
+#define SHA_FLAGS_HMAC			BIT(8)
+#define SHA_FLAGS_FINUP			BIT(9)
+#define SHA_FLAGS_MASK			(0xff)
+
+#define ASPEED_CRYPTO_SRC_DMA_BUF_LEN	0xa000
+#define ASPEED_CRYPTO_DST_DMA_BUF_LEN	0xa000
+#define ASPEED_CRYPTO_GCM_TAG_OFFSET	0x9ff0
+#define ASPEED_HASH_SRC_DMA_BUF_LEN	0xa000
+#define ASPEED_HASH_QUEUE_LENGTH	50
+
+struct aspeed_hace_dev;
+
+typedef int (*aspeed_hace_fn_t)(struct aspeed_hace_dev *);
+
+struct aspeed_sg_list {
+	__le32 len;
+	__le32 phy_addr;
+};
+
+struct aspeed_engine_hash {
+	struct tasklet_struct		done_task;
+	unsigned long			flags;
+	struct ahash_request		*req;
+
+	/* input buffer */
+	void				*ahash_src_addr;
+	dma_addr_t			ahash_src_dma_addr;
+
+	dma_addr_t			src_dma;
+	dma_addr_t			digest_dma;
+
+	size_t				src_length;
+
+	/* callback func */
+	aspeed_hace_fn_t		resume;
+	aspeed_hace_fn_t		dma_prepare;
+};
+
+struct aspeed_sha_hmac_ctx {
+	struct crypto_shash *shash;
+	u8 ipad[SHA512_BLOCK_SIZE];
+	u8 opad[SHA512_BLOCK_SIZE];
+};
+
+struct aspeed_sham_ctx {
+	struct crypto_engine_ctx	enginectx;
+
+	struct aspeed_hace_dev		*hace_dev;
+	unsigned long			flags;	/* hmac flag */
+
+	struct aspeed_sha_hmac_ctx	base[0];
+};
+
+struct aspeed_sham_reqctx {
+	unsigned long		flags;		/* sha flags; final/update state is tracked in 'op' */
+	unsigned long		op;		/* final or update */
+	u32			cmd;		/* trigger cmd */
+
+	/* walk state */
+	struct scatterlist	*src_sg;
+	int			src_nents;
+	unsigned int		offset;		/* offset in current sg */
+	unsigned int		total;		/* per update length */
+
+	size_t			digsize;
+	size_t			block_size;
+	size_t			ivsize;
+	const __be32		*sha_iv;
+
+	/* remain data buffer */
+	u8			buffer[SHA512_BLOCK_SIZE * 2];
+	dma_addr_t		buffer_dma_addr;
+	size_t			bufcnt;		/* buffer counter */
+
+	/* output buffer */
+	u8			digest[SHA512_DIGEST_SIZE] __aligned(64);
+	dma_addr_t		digest_dma_addr;
+	u64			digcnt[2];
+};
+
+struct aspeed_hace_dev {
+	void __iomem			*regs;
+	struct device			*dev;
+	int				irq;
+	struct clk			*clk;
+	unsigned long			version;
+
+	struct crypto_engine		*crypt_engine_hash;
+
+	struct aspeed_engine_hash	hash_engine;
+};
+
+struct aspeed_hace_alg {
+	struct aspeed_hace_dev		*hace_dev;
+
+	const char			*alg_base;
+
+	union {
+		struct skcipher_alg	skcipher;
+		struct ahash_alg	ahash;
+	} alg;
+};
+
+enum aspeed_version {
+	AST2500_VERSION = 5,
+	AST2600_VERSION
+};
+
+#define ast_hace_write(hace, val, offset)	\
+	writel((val), (hace)->regs + (offset))
+#define ast_hace_read(hace, offset)		\
+	readl((hace)->regs + (offset))
+
+void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
+void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
+
+#endif
-- 
2.25.1



* [PATCH v8 2/5] dt-bindings: clock: Add AST2500/AST2600 HACE reset definition
  2022-07-26 11:34 ` Neal Liu
@ 2022-07-26 11:34   ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW, Krzysztof Kozlowski

Add HACE reset bit definition for AST2500/AST2600.
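
Not part of this patch, but for context: once the HACE node references this
line through its "resets" property (as the later dts patch does), a consumer
driver reaches it via the generic reset framework. A minimal, hypothetical
sketch (the function name is illustrative only):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/reset.h>

/* Hypothetical example: take the HACE engine out of reset. */
static int example_hace_release_reset(struct device *dev)
{
	struct reset_control *rst;

	/* Resolves "resets = <&syscon ASPEED_RESET_HACE>;" from the DT node */
	rst = devm_reset_control_get_exclusive(dev, NULL);
	if (IS_ERR(rst))
		return PTR_ERR(rst);

	return reset_control_deassert(rst);
}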

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
 include/dt-bindings/clock/aspeed-clock.h  | 1 +
 include/dt-bindings/clock/ast2600-clock.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/dt-bindings/clock/aspeed-clock.h b/include/dt-bindings/clock/aspeed-clock.h
index 9ff4f6e4558c..06d568382c77 100644
--- a/include/dt-bindings/clock/aspeed-clock.h
+++ b/include/dt-bindings/clock/aspeed-clock.h
@@ -52,5 +52,6 @@
 #define ASPEED_RESET_I2C		7
 #define ASPEED_RESET_AHB		8
 #define ASPEED_RESET_CRT1		9
+#define ASPEED_RESET_HACE		10
 
 #endif
diff --git a/include/dt-bindings/clock/ast2600-clock.h b/include/dt-bindings/clock/ast2600-clock.h
index 62b9520a00fd..d8b0db2f7a7d 100644
--- a/include/dt-bindings/clock/ast2600-clock.h
+++ b/include/dt-bindings/clock/ast2600-clock.h
@@ -111,6 +111,7 @@
 #define ASPEED_RESET_PCIE_RC_O		19
 #define ASPEED_RESET_PCIE_RC_OEN	18
 #define ASPEED_RESET_PCI_DP		5
+#define ASPEED_RESET_HACE		4
 #define ASPEED_RESET_AHB		1
 #define ASPEED_RESET_SDRAM		0
 
-- 
2.25.1



* [PATCH v8 3/5] ARM: dts: aspeed: Add HACE device controller node
  2022-07-26 11:34 ` Neal Liu
@ 2022-07-26 11:34   ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW, Dhananjay Phadke

Add hace node to device tree for AST2500/AST2600.

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
Reviewed-by: Dhananjay Phadke <dphadke@linux.microsoft.com>
---
 arch/arm/boot/dts/aspeed-g5.dtsi | 8 ++++++++
 arch/arm/boot/dts/aspeed-g6.dtsi | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index c89092c3905b..04f98d1dbb97 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -262,6 +262,14 @@ rng: hwrng@1e6e2078 {
 				quality = <100>;
 			};
 
+			hace: crypto@1e6e3000 {
+				compatible = "aspeed,ast2500-hace";
+				reg = <0x1e6e3000 0x100>;
+				interrupts = <4>;
+				clocks = <&syscon ASPEED_CLK_GATE_YCLK>;
+				resets = <&syscon ASPEED_RESET_HACE>;
+			};
+
 			gfx: display@1e6e6000 {
 				compatible = "aspeed,ast2500-gfx", "syscon";
 				reg = <0x1e6e6000 0x1000>;
diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi
index 6660564855ff..095cf8d03616 100644
--- a/arch/arm/boot/dts/aspeed-g6.dtsi
+++ b/arch/arm/boot/dts/aspeed-g6.dtsi
@@ -323,6 +323,14 @@ apb {
 			#size-cells = <1>;
 			ranges;
 
+			hace: crypto@1e6d0000 {
+				compatible = "aspeed,ast2600-hace";
+				reg = <0x1e6d0000 0x200>;
+				interrupts = <GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>;
+				clocks = <&syscon ASPEED_CLK_GATE_YCLK>;
+				resets = <&syscon ASPEED_RESET_HACE>;
+			};
+
 			syscon: syscon@1e6e2000 {
 				compatible = "aspeed,ast2600-scu", "syscon", "simple-mfd";
 				reg = <0x1e6e2000 0x1000>;
-- 
2.25.1



* [PATCH v8 4/5] dt-bindings: crypto: add documentation for aspeed hace
  2022-07-26 11:34 ` Neal Liu
@ 2022-07-26 11:34   ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW, Krzysztof Kozlowski

Add device tree binding documentation for the Aspeed Hash
and Crypto Engines (HACE) Controller.

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
 .../bindings/crypto/aspeed,ast2500-hace.yaml  | 53 +++++++++++++++++++
 1 file changed, 53 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml

diff --git a/Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml b/Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
new file mode 100644
index 000000000000..a772d232de09
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
@@ -0,0 +1,53 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/crypto/aspeed,ast2500-hace.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: ASPEED HACE hash and crypto Hardware Accelerator Engines
+
+maintainers:
+  - Neal Liu <neal_liu@aspeedtech.com>
+
+description: |
+  The Hash and Crypto Engine (HACE) is designed to accelerate the throughput
+  of hash data digest, encryption, and decryption. Basically, HACE can be
+  divided into two independent engines - Hash Engine and Crypto Engine.
+
+properties:
+  compatible:
+    enum:
+      - aspeed,ast2500-hace
+      - aspeed,ast2600-hace
+
+  reg:
+    maxItems: 1
+
+  clocks:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  resets:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - interrupts
+  - resets
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/clock/ast2600-clock.h>
+    hace: crypto@1e6d0000 {
+        compatible = "aspeed,ast2600-hace";
+        reg = <0x1e6d0000 0x200>;
+        interrupts = <4>;
+        clocks = <&syscon ASPEED_CLK_GATE_YCLK>;
+        resets = <&syscon ASPEED_RESET_HACE>;
+    };
-- 
2.25.1



* [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver
  2022-07-26 11:34 ` Neal Liu
@ 2022-07-26 11:34   ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW

Add the HACE crypto driver to support AES and DES/3DES
symmetric-key encryption and decryption in ECB, CBC, CFB,
OFB and CTR modes of operation.
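
For context, a minimal in-kernel usage sketch (not part of this patch; the
function name is hypothetical): once the driver registers e.g. "cbc(aes)"
at priority 300, a kernel user can reach the engine through the generic
skcipher API with a DMA-able (kmalloc'd) buffer:

#include <crypto/aes.h>
#include <crypto/skcipher.h>
#include <linux/crypto.h>
#include <linux/scatterlist.h>

/* Hypothetical example: CBC-AES encrypt a kmalloc'd buffer in place. */
static int example_cbc_aes_encrypt(u8 *buf, unsigned int len,
				   const u8 *key, unsigned int keylen,
				   u8 iv[AES_BLOCK_SIZE])
{
	struct crypto_skcipher *tfm;
	struct skcipher_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int ret;

	tfm = crypto_alloc_skcipher("cbc(aes)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	ret = crypto_skcipher_setkey(tfm, key, keylen);
	if (ret)
		goto out_free_tfm;

	req = skcipher_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		ret = -ENOMEM;
		goto out_free_tfm;
	}

	sg_init_one(&sg, buf, len);
	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
				      CRYPTO_TFM_REQ_MAY_SLEEP,
				      crypto_req_done, &wait);
	skcipher_request_set_crypt(req, &sg, &sg, len, iv);

	/* The request completes asynchronously; wait for it here. */
	ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);

	skcipher_request_free(req);
out_free_tfm:
	crypto_free_skcipher(tfm);
	return ret;
}

Whether such a request is actually served by HACE or by another
implementation is decided by the usual cra_priority comparison in the
crypto API.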

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
---
 drivers/crypto/aspeed/Kconfig              |   26 +
 drivers/crypto/aspeed/Makefile             |    7 +-
 drivers/crypto/aspeed/aspeed-hace-crypto.c | 1121 ++++++++++++++++++++
 drivers/crypto/aspeed/aspeed-hace.c        |   91 +-
 drivers/crypto/aspeed/aspeed-hace.h        |  112 ++
 5 files changed, 1354 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/aspeed/aspeed-hace-crypto.c

diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
index 059e627efef8..f19994915a5e 100644
--- a/drivers/crypto/aspeed/Kconfig
+++ b/drivers/crypto/aspeed/Kconfig
@@ -30,3 +30,29 @@ config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
 	  to ask for those messages.
 	  Avoid enabling this option for production build to
 	  minimize driver timing.
+
+config CRYPTO_DEV_ASPEED_HACE_CRYPTO
+	bool "Enable Aspeed Hash & Crypto Engine (HACE) crypto"
+	depends on CRYPTO_DEV_ASPEED
+	select CRYPTO_ENGINE
+	select CRYPTO_AES
+	select CRYPTO_DES
+	select CRYPTO_ECB
+	select CRYPTO_CBC
+	select CRYPTO_CFB
+	select CRYPTO_OFB
+	select CRYPTO_CTR
+	help
+	  Select here to enable the Aspeed Hash & Crypto Engine (HACE)
+	  crypto driver.
+	  Supports AES/DES symmetric-key encryption and decryption
+	  with ECB/CBC/CFB/OFB/CTR options.
+
+config CRYPTO_DEV_ASPEED_HACE_CRYPTO_DEBUG
+	bool "Enable HACE crypto debug messages"
+	depends on CRYPTO_DEV_ASPEED_HACE_CRYPTO
+	help
+	  Print HACE crypto debugging messages if you use this option
+	  to ask for those messages.
+	  Avoid enabling this option in a production build to
+	  minimize the driver's timing overhead.
diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
index 8bc8d4fed5a9..421e2ca9c53e 100644
--- a/drivers/crypto/aspeed/Makefile
+++ b/drivers/crypto/aspeed/Makefile
@@ -1,6 +1,9 @@
 obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
-aspeed_crypto-objs := aspeed-hace.o \
-		      $(hace-hash-y)
+aspeed_crypto-objs := aspeed-hace.o	\
+		      $(hace-hash-y)	\
+		      $(hace-crypto-y)
 
 obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
 hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
+obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) += aspeed-hace-crypto.o
+hace-crypto-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) := aspeed-hace-crypto.o
diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c b/drivers/crypto/aspeed/aspeed-hace-crypto.c
new file mode 100644
index 000000000000..1d019b523399
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace-crypto.c
@@ -0,0 +1,1121 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 Aspeed Technology Inc.
+ */
+
+#include "aspeed-hace.h"
+
+#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO_DEBUG
+#define CIPHER_DBG(h, fmt, ...)	\
+	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#else
+#define CIPHER_DBG(h, fmt, ...)	\
+	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#endif
+
+static int aspeed_crypto_do_fallback(struct skcipher_request *areq)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(areq);
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
+	int err;
+
+	skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback_tfm);
+	skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+				      areq->base.complete, areq->base.data);
+	skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
+				   areq->cryptlen, areq->iv);
+
+	if (rctx->enc_cmd & HACE_CMD_ENCRYPT)
+		err = crypto_skcipher_encrypt(&rctx->fallback_req);
+	else
+		err = crypto_skcipher_decrypt(&rctx->fallback_req);
+
+	return err;
+}
+
+static bool aspeed_crypto_need_fallback(struct skcipher_request *areq)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(areq);
+
+	if (areq->cryptlen == 0)
+		return true;
+
+	if ((rctx->enc_cmd & HACE_CMD_DES_SELECT) &&
+	    !IS_ALIGNED(areq->cryptlen, DES_BLOCK_SIZE))
+		return true;
+
+	if ((!(rctx->enc_cmd & HACE_CMD_DES_SELECT)) &&
+	    !IS_ALIGNED(areq->cryptlen, AES_BLOCK_SIZE))
+		return true;
+
+	return false;
+}
+
+static int aspeed_hace_crypto_handle_queue(struct aspeed_hace_dev *hace_dev,
+					   struct skcipher_request *req)
+{
+	if (hace_dev->version == AST2500_VERSION &&
+	    aspeed_crypto_need_fallback(req)) {
+		CIPHER_DBG(hace_dev, "SW fallback\n");
+		return aspeed_crypto_do_fallback(req);
+	}
+
+	return crypto_transfer_skcipher_request_to_engine(
+			hace_dev->crypt_engine_crypto, req);
+}
+
+static int aspeed_crypto_do_request(struct crypto_engine *engine, void *areq)
+{
+	struct skcipher_request *req = skcipher_request_cast(areq);
+	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	struct aspeed_engine_crypto *crypto_engine;
+	int rc;
+
+	crypto_engine = &hace_dev->crypto_engine;
+	crypto_engine->req = req;
+	crypto_engine->flags |= CRYPTO_FLAGS_BUSY;
+
+	rc = ctx->start(hace_dev);
+
+	if (rc != -EINPROGRESS)
+		return -EIO;
+
+	return 0;
+}
+
+static int aspeed_sk_complete(struct aspeed_hace_dev *hace_dev, int err)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+
+	if (rctx->enc_cmd & HACE_CMD_IV_REQUIRE) {
+		if (rctx->enc_cmd & HACE_CMD_DES_SELECT)
+			memcpy(req->iv, crypto_engine->cipher_ctx +
+			       DES_BLOCK_SIZE, DES_BLOCK_SIZE);
+		else
+			memcpy(req->iv, crypto_engine->cipher_ctx,
+			       AES_BLOCK_SIZE);
+	}
+
+	crypto_engine->flags &= ~CRYPTO_FLAGS_BUSY;
+
+	crypto_finalize_skcipher_request(hace_dev->crypt_engine_crypto, req,
+					 err);
+
+	return err;
+}
+
+static int aspeed_sk_transfer_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct device *dev = hace_dev->dev;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+
+	if (req->src == req->dst) {
+		dma_unmap_sg(dev, req->src, rctx->src_nents, DMA_BIDIRECTIONAL);
+	} else {
+		dma_unmap_sg(dev, req->src, rctx->src_nents, DMA_TO_DEVICE);
+		dma_unmap_sg(dev, req->dst, rctx->dst_nents, DMA_FROM_DEVICE);
+	}
+
+	return aspeed_sk_complete(hace_dev, 0);
+}
+
+static int aspeed_sk_transfer(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+	struct scatterlist *out_sg;
+	int nbytes = 0;
+	int rc = 0;
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+	out_sg = req->dst;
+
+	/* Copy output buffer to dst scatter-gather lists */
+	nbytes = sg_copy_from_buffer(out_sg, rctx->dst_nents,
+				     crypto_engine->cipher_addr, req->cryptlen);
+	if (!nbytes) {
+		dev_warn(hace_dev->dev, "invalid sg copy, %s:0x%x, %s:0x%x\n",
+			 "nbytes", nbytes, "cryptlen", req->cryptlen);
+		rc = -EINVAL;
+	}
+
+	CIPHER_DBG(hace_dev, "%s:%d, %s:%d, %s:%d, %s:%p\n",
+		   "nbytes", nbytes, "req->cryptlen", req->cryptlen,
+		   "nb_out_sg", rctx->dst_nents,
+		   "cipher addr", crypto_engine->cipher_addr);
+
+	return aspeed_sk_complete(hace_dev, rc);
+}
+
+static int aspeed_sk_start(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+	struct scatterlist *in_sg;
+	int nbytes;
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+	in_sg = req->src;
+
+	nbytes = sg_copy_to_buffer(in_sg, rctx->src_nents,
+				   crypto_engine->cipher_addr, req->cryptlen);
+
+	CIPHER_DBG(hace_dev, "%s:%d, %s:%d, %s:%d, %s:%p\n",
+		   "nbytes", nbytes, "req->cryptlen", req->cryptlen,
+		   "nb_in_sg", rctx->src_nents,
+		   "cipher addr", crypto_engine->cipher_addr);
+
+	if (!nbytes) {
+		dev_warn(hace_dev->dev, "invalid sg copy, %s:0x%x, %s:0x%x\n",
+			 "nbytes", nbytes, "cryptlen", req->cryptlen);
+		return -EINVAL;
+	}
+
+	crypto_engine->resume = aspeed_sk_transfer;
+
+	/* Trigger engines */
+	ast_hace_write(hace_dev, crypto_engine->cipher_dma_addr,
+		       ASPEED_HACE_SRC);
+	ast_hace_write(hace_dev, crypto_engine->cipher_dma_addr,
+		       ASPEED_HACE_DEST);
+	ast_hace_write(hace_dev, req->cryptlen, ASPEED_HACE_DATA_LEN);
+	ast_hace_write(hace_dev, rctx->enc_cmd, ASPEED_HACE_CMD);
+
+	return -EINPROGRESS;
+}
+
+static int aspeed_sk_start_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_sg_list *src_list, *dst_list;
+	dma_addr_t src_dma_addr, dst_dma_addr;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+	struct scatterlist *s;
+	int src_sg_len;
+	int dst_sg_len;
+	int total, i;
+	int rc;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+
+	rctx->enc_cmd |= HACE_CMD_DES_SG_CTRL | HACE_CMD_SRC_SG_CTRL |
+			 HACE_CMD_AES_KEY_HW_EXP | HACE_CMD_MBUS_REQ_SYNC_EN;
+
+	/* BIDIRECTIONAL */
+	if (req->dst == req->src) {
+		src_sg_len = dma_map_sg(hace_dev->dev, req->src,
+					rctx->src_nents, DMA_BIDIRECTIONAL);
+		dst_sg_len = src_sg_len;
+		if (!src_sg_len) {
+			dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
+			return -EINVAL;
+		}
+
+	} else {
+		src_sg_len = dma_map_sg(hace_dev->dev, req->src,
+					rctx->src_nents, DMA_TO_DEVICE);
+		if (!src_sg_len) {
+			dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
+			return -EINVAL;
+		}
+
+		dst_sg_len = dma_map_sg(hace_dev->dev, req->dst,
+					rctx->dst_nents, DMA_FROM_DEVICE);
+		if (!dst_sg_len) {
+			dev_warn(hace_dev->dev, "dma_map_sg() dst error\n");
+			rc = -EINVAL;
+			goto free_req_src;
+		}
+	}
+
+	src_list = (struct aspeed_sg_list *)crypto_engine->cipher_addr;
+	src_dma_addr = crypto_engine->cipher_dma_addr;
+	total = req->cryptlen;
+
+	for_each_sg(req->src, s, src_sg_len, i) {
+		src_list[i].phy_addr = sg_dma_address(s);
+
+		if (total > sg_dma_len(s)) {
+			src_list[i].len = sg_dma_len(s);
+			total -= src_list[i].len;
+
+		} else {
+			/* last sg list */
+			src_list[i].len = total;
+			src_list[i].len |= BIT(31);
+			total = 0;
+		}
+
+		src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
+		src_list[i].len = cpu_to_le32(src_list[i].len);
+	}
+
+	if (total != 0) {
+		rc = -EINVAL;
+		goto free_req;
+	}
+
+	if (req->dst == req->src) {
+		dst_list = src_list;
+		dst_dma_addr = src_dma_addr;
+
+	} else {
+		dst_list = (struct aspeed_sg_list *)crypto_engine->dst_sg_addr;
+		dst_dma_addr = crypto_engine->dst_sg_dma_addr;
+		total = req->cryptlen;
+
+		for_each_sg(req->dst, s, dst_sg_len, i) {
+			dst_list[i].phy_addr = sg_dma_address(s);
+
+			if (total > sg_dma_len(s)) {
+				dst_list[i].len = sg_dma_len(s);
+				total -= dst_list[i].len;
+
+			} else {
+				/* last sg list */
+				dst_list[i].len = total;
+				dst_list[i].len |= BIT(31);
+				total = 0;
+			}
+
+			dst_list[i].phy_addr = cpu_to_le32(dst_list[i].phy_addr);
+			dst_list[i].len = cpu_to_le32(dst_list[i].len);
+
+		}
+
+		dst_list[dst_sg_len].phy_addr = 0;
+		dst_list[dst_sg_len].len = 0;
+	}
+
+	if (total != 0) {
+		rc = -EINVAL;
+		goto free_req;
+	}
+
+	crypto_engine->resume = aspeed_sk_transfer_sg;
+
+	/* Memory barrier to ensure all data setup before engine starts */
+	mb();
+
+	/* Trigger engines */
+	ast_hace_write(hace_dev, src_dma_addr, ASPEED_HACE_SRC);
+	ast_hace_write(hace_dev, dst_dma_addr, ASPEED_HACE_DEST);
+	ast_hace_write(hace_dev, req->cryptlen, ASPEED_HACE_DATA_LEN);
+	ast_hace_write(hace_dev, rctx->enc_cmd, ASPEED_HACE_CMD);
+
+	return -EINPROGRESS;
+
+free_req:
+	if (req->dst == req->src) {
+		dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
+			     DMA_BIDIRECTIONAL);
+
+	} else {
+		dma_unmap_sg(hace_dev->dev, req->dst, rctx->dst_nents,
+			     DMA_FROM_DEVICE);
+		dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
+			     DMA_TO_DEVICE);
+	}
+
+	return rc;
+
+free_req_src:
+	dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents, DMA_TO_DEVICE);
+
+	return rc;
+}
+
+static int aspeed_hace_skcipher_trigger(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct crypto_skcipher *cipher;
+	struct aspeed_cipher_ctx *ctx;
+	struct skcipher_request *req;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+	cipher = crypto_skcipher_reqtfm(req);
+	ctx = crypto_skcipher_ctx(cipher);
+
+	/* enable interrupt */
+	rctx->enc_cmd |= HACE_CMD_ISR_EN;
+
+	rctx->dst_nents = sg_nents(req->dst);
+	rctx->src_nents = sg_nents(req->src);
+
+	ast_hace_write(hace_dev, crypto_engine->cipher_ctx_dma,
+		       ASPEED_HACE_CONTEXT);
+
+	if (rctx->enc_cmd & HACE_CMD_IV_REQUIRE) {
+		if (rctx->enc_cmd & HACE_CMD_DES_SELECT)
+			memcpy(crypto_engine->cipher_ctx + DES_BLOCK_SIZE,
+			       req->iv, DES_BLOCK_SIZE);
+		else
+			memcpy(crypto_engine->cipher_ctx, req->iv,
+			       AES_BLOCK_SIZE);
+	}
+
+	if (hace_dev->version == AST2600_VERSION) {
+		memcpy(crypto_engine->cipher_ctx + 16, ctx->key, ctx->key_len);
+
+		return aspeed_sk_start_sg(hace_dev);
+	}
+
+	memcpy(crypto_engine->cipher_ctx + 16, ctx->key, AES_MAX_KEYLENGTH);
+
+	return aspeed_sk_start(hace_dev);
+}
+
+static int aspeed_des_crypt(struct skcipher_request *req, u32 cmd)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(req);
+	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	u32 crypto_alg = cmd & HACE_CMD_OP_MODE_MASK;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	if (crypto_alg == HACE_CMD_CBC || crypto_alg == HACE_CMD_ECB) {
+		if (!IS_ALIGNED(req->cryptlen, DES_BLOCK_SIZE))
+			return -EINVAL;
+	}
+
+	rctx->enc_cmd = cmd | HACE_CMD_DES_SELECT | HACE_CMD_RI_WO_DATA_ENABLE |
+			HACE_CMD_DES | HACE_CMD_CONTEXT_LOAD_ENABLE |
+			HACE_CMD_CONTEXT_SAVE_ENABLE;
+
+	return aspeed_hace_crypto_handle_queue(hace_dev, req);
+}
+
+static int aspeed_des_setkey(struct crypto_skcipher *cipher, const u8 *key,
+			     unsigned int keylen)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct crypto_tfm *tfm = crypto_skcipher_tfm(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	int rc;
+
+	CIPHER_DBG(hace_dev, "keylen: %d bits\n", keylen);
+
+	if (keylen != DES_KEY_SIZE && keylen != DES3_EDE_KEY_SIZE) {
+		dev_warn(hace_dev->dev, "invalid keylen: %d bits\n", keylen);
+		return -EINVAL;
+	}
+
+	if (keylen == DES_KEY_SIZE) {
+		rc = crypto_des_verify_key(tfm, key);
+		if (rc)
+			return rc;
+
+	} else if (keylen == DES3_EDE_KEY_SIZE) {
+		rc = crypto_des3_ede_verify_key(tfm, key);
+		if (rc)
+			return rc;
+	}
+
+	memcpy(ctx->key, key, keylen);
+	ctx->key_len = keylen;
+
+	crypto_skcipher_clear_flags(ctx->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(ctx->fallback_tfm, cipher->base.crt_flags &
+				  CRYPTO_TFM_REQ_MASK);
+
+	return crypto_skcipher_setkey(ctx->fallback_tfm, key, keylen);
+}
+
+static int aspeed_tdes_ctr_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CTR |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ctr_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CTR |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ofb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_OFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ofb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_OFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cfb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cfb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cbc_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CBC |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cbc_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CBC |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ecb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_ECB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ecb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_ECB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_des_ctr_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CTR |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ctr_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CTR |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ofb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_OFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ofb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_OFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cfb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cfb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cbc_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CBC |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cbc_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CBC |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ecb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_ECB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ecb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_ECB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_aes_crypt(struct skcipher_request *req, u32 cmd)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(req);
+	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	u32 crypto_alg = cmd & HACE_CMD_OP_MODE_MASK;
+
+	if (crypto_alg == HACE_CMD_CBC || crypto_alg == HACE_CMD_ECB) {
+		if (!IS_ALIGNED(req->cryptlen, AES_BLOCK_SIZE))
+			return -EINVAL;
+	}
+
+	CIPHER_DBG(hace_dev, "%s\n",
+		   (cmd & HACE_CMD_ENCRYPT) ? "encrypt" : "decrypt");
+
+	cmd |= HACE_CMD_AES_SELECT | HACE_CMD_RI_WO_DATA_ENABLE |
+	       HACE_CMD_CONTEXT_LOAD_ENABLE | HACE_CMD_CONTEXT_SAVE_ENABLE;
+
+	switch (ctx->key_len) {
+	case AES_KEYSIZE_128:
+		cmd |= HACE_CMD_AES128;
+		break;
+	case AES_KEYSIZE_192:
+		cmd |= HACE_CMD_AES192;
+		break;
+	case AES_KEYSIZE_256:
+		cmd |= HACE_CMD_AES256;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	rctx->enc_cmd = cmd;
+
+	return aspeed_hace_crypto_handle_queue(hace_dev, req);
+}
+
+static int aspeed_aes_setkey(struct crypto_skcipher *cipher, const u8 *key,
+			     unsigned int keylen)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	struct crypto_aes_ctx gen_aes_key;
+
+	CIPHER_DBG(hace_dev, "keylen: %d bits\n", (keylen * 8));
+
+	if (keylen != AES_KEYSIZE_128 && keylen != AES_KEYSIZE_192 &&
+	    keylen != AES_KEYSIZE_256)
+		return -EINVAL;
+
+	if (ctx->hace_dev->version == AST2500_VERSION) {
+		aes_expandkey(&gen_aes_key, key, keylen);
+		memcpy(ctx->key, gen_aes_key.key_enc, AES_MAX_KEYLENGTH);
+
+	} else {
+		memcpy(ctx->key, key, keylen);
+	}
+
+	ctx->key_len = keylen;
+
+	crypto_skcipher_clear_flags(ctx->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(ctx->fallback_tfm, cipher->base.crt_flags &
+				  CRYPTO_TFM_REQ_MASK);
+
+	return crypto_skcipher_setkey(ctx->fallback_tfm, key, keylen);
+}
+
+static int aspeed_aes_ctr_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CTR);
+}
+
+static int aspeed_aes_ctr_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CTR);
+}
+
+static int aspeed_aes_ofb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_OFB);
+}
+
+static int aspeed_aes_ofb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_OFB);
+}
+
+static int aspeed_aes_cfb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CFB);
+}
+
+static int aspeed_aes_cfb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CFB);
+}
+
+static int aspeed_aes_cbc_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CBC);
+}
+
+static int aspeed_aes_cbc_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CBC);
+}
+
+static int aspeed_aes_ecb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_ECB);
+}
+
+static int aspeed_aes_ecb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_ECB);
+}
+
+static int aspeed_crypto_cra_init(struct crypto_skcipher *tfm)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
+	const char *name = crypto_tfm_alg_name(&tfm->base);
+	struct aspeed_hace_alg *crypto_alg;
+
+
+	crypto_alg = container_of(alg, struct aspeed_hace_alg, alg.skcipher);
+	ctx->hace_dev = crypto_alg->hace_dev;
+	ctx->start = aspeed_hace_skcipher_trigger;
+
+	CIPHER_DBG(ctx->hace_dev, "%s\n", name);
+
+	ctx->fallback_tfm = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_ASYNC |
+						  CRYPTO_ALG_NEED_FALLBACK);
+	if (IS_ERR(ctx->fallback_tfm)) {
+		dev_err(ctx->hace_dev->dev, "ERROR: Cannot allocate fallback for %s %ld\n",
+			name, PTR_ERR(ctx->fallback_tfm));
+		return PTR_ERR(ctx->fallback_tfm);
+	}
+
+	crypto_skcipher_set_reqsize(tfm, sizeof(struct aspeed_cipher_reqctx) +
+			 crypto_skcipher_reqsize(ctx->fallback_tfm));
+
+	ctx->enginectx.op.do_one_request = aspeed_crypto_do_request;
+	ctx->enginectx.op.prepare_request = NULL;
+	ctx->enginectx.op.unprepare_request = NULL;
+
+	return 0;
+}
+
+static void aspeed_crypto_cra_exit(struct crypto_skcipher *tfm)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+
+	CIPHER_DBG(hace_dev, "%s\n", crypto_tfm_alg_name(&tfm->base));
+	crypto_free_skcipher(ctx->fallback_tfm);
+}
+
+struct aspeed_hace_alg aspeed_crypto_algs[] = {
+	{
+		.alg.skcipher = {
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_ecb_encrypt,
+			.decrypt	= aspeed_aes_ecb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ecb(aes)",
+				.cra_driver_name	= "aspeed-ecb-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= AES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_cbc_encrypt,
+			.decrypt	= aspeed_aes_cbc_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cbc(aes)",
+				.cra_driver_name	= "aspeed-cbc-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= AES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_cfb_encrypt,
+			.decrypt	= aspeed_aes_cfb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cfb(aes)",
+				.cra_driver_name	= "aspeed-cfb-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_ofb_encrypt,
+			.decrypt	= aspeed_aes_ofb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ofb(aes)",
+				.cra_driver_name	= "aspeed-ofb-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_ecb_encrypt,
+			.decrypt	= aspeed_des_ecb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ecb(des)",
+				.cra_driver_name	= "aspeed-ecb-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_cbc_encrypt,
+			.decrypt	= aspeed_des_cbc_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cbc(des)",
+				.cra_driver_name	= "aspeed-cbc-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_cfb_encrypt,
+			.decrypt	= aspeed_des_cfb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cfb(des)",
+				.cra_driver_name	= "aspeed-cfb-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_ofb_encrypt,
+			.decrypt	= aspeed_des_ofb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ofb(des)",
+				.cra_driver_name	= "aspeed-ofb-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_ecb_encrypt,
+			.decrypt	= aspeed_tdes_ecb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ecb(des3_ede)",
+				.cra_driver_name	= "aspeed-ecb-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_cbc_encrypt,
+			.decrypt	= aspeed_tdes_cbc_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cbc(des3_ede)",
+				.cra_driver_name	= "aspeed-cbc-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_cfb_encrypt,
+			.decrypt	= aspeed_tdes_cfb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cfb(des3_ede)",
+				.cra_driver_name	= "aspeed-cfb-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_ofb_encrypt,
+			.decrypt	= aspeed_tdes_ofb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ofb(des3_ede)",
+				.cra_driver_name	= "aspeed-ofb-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+};
+
+struct aspeed_hace_alg aspeed_crypto_algs_g6[] = {
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_ctr_encrypt,
+			.decrypt	= aspeed_aes_ctr_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ctr(aes)",
+				.cra_driver_name	= "aspeed-ctr-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_ctr_encrypt,
+			.decrypt	= aspeed_des_ctr_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ctr(des)",
+				.cra_driver_name	= "aspeed-ctr-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_ctr_encrypt,
+			.decrypt	= aspeed_tdes_ctr_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ctr(des3_ede)",
+				.cra_driver_name	= "aspeed-ctr-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+
+};
+
+void aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
+{
+	int rc, i;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs); i++) {
+		aspeed_crypto_algs[i].hace_dev = hace_dev;
+		rc = crypto_register_skcipher(&aspeed_crypto_algs[i].alg.skcipher);
+		if (rc) {
+			CIPHER_DBG(hace_dev, "Failed to register %s\n",
+				   aspeed_crypto_algs[i].alg.skcipher.base.cra_name);
+		}
+	}
+
+	if (hace_dev->version != AST2600_VERSION)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs_g6); i++) {
+		aspeed_crypto_algs_g6[i].hace_dev = hace_dev;
+		rc = crypto_register_skcipher(&aspeed_crypto_algs_g6[i].alg.skcipher);
+		if (rc) {
+			CIPHER_DBG(hace_dev, "Failed to register %s\n",
+				   aspeed_crypto_algs_g6[i].alg.skcipher.base.cra_name);
+		}
+	}
+}
diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
index 89b1585d72e2..efc0725ebf98 100644
--- a/drivers/crypto/aspeed/aspeed-hace.c
+++ b/drivers/crypto/aspeed/aspeed-hace.c
@@ -32,10 +32,22 @@ void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
 	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
 }
 
+/* Weak function for HACE crypto */
+void __weak aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
+void __weak aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
 /* HACE interrupt service routine */
 static irqreturn_t aspeed_hace_irq(int irq, void *dev)
 {
 	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
 	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
 	u32 sts;
 
@@ -51,9 +63,24 @@ static irqreturn_t aspeed_hace_irq(int irq, void *dev)
 			dev_warn(hace_dev->dev, "HASH no active requests.\n");
 	}
 
+	if (sts & HACE_CRYPTO_ISR) {
+		if (crypto_engine->flags & CRYPTO_FLAGS_BUSY)
+			tasklet_schedule(&crypto_engine->done_task);
+		else
+			dev_warn(hace_dev->dev, "CRYPTO no active requests.\n");
+	}
+
 	return IRQ_HANDLED;
 }
 
+static void aspeed_hace_crypto_done_task(unsigned long data)
+{
+	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+
+	crypto_engine->resume(hace_dev);
+}
+
 static void aspeed_hace_hash_done_task(unsigned long data)
 {
 	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
@@ -65,11 +92,13 @@ static void aspeed_hace_hash_done_task(unsigned long data)
 static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
 {
 	aspeed_register_hace_hash_algs(hace_dev);
+	aspeed_register_hace_crypto_algs(hace_dev);
 }
 
 static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
 {
 	aspeed_unregister_hace_hash_algs(hace_dev);
+	aspeed_unregister_hace_crypto_algs(hace_dev);
 }
 
 static const struct of_device_id aspeed_hace_of_matches[] = {
@@ -80,6 +109,7 @@ static const struct of_device_id aspeed_hace_of_matches[] = {
 
 static int aspeed_hace_probe(struct platform_device *pdev)
 {
+	struct aspeed_engine_crypto *crypto_engine;
 	const struct of_device_id *hace_dev_id;
 	struct aspeed_engine_hash *hash_engine;
 	struct aspeed_hace_dev *hace_dev;
@@ -100,6 +130,7 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 	hace_dev->dev = &pdev->dev;
 	hace_dev->version = (unsigned long)hace_dev_id->data;
 	hash_engine = &hace_dev->hash_engine;
+	crypto_engine = &hace_dev->crypto_engine;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 
@@ -153,6 +184,21 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 	tasklet_init(&hash_engine->done_task, aspeed_hace_hash_done_task,
 		     (unsigned long)hace_dev);
 
+	/* Initialize crypto hardware engine structure for crypto */
+	hace_dev->crypt_engine_crypto = crypto_engine_alloc_init(hace_dev->dev,
+								 true);
+	if (!hace_dev->crypt_engine_crypto) {
+		rc = -ENOMEM;
+		goto err_engine_hash_start;
+	}
+
+	rc = crypto_engine_start(hace_dev->crypt_engine_crypto);
+	if (rc)
+		goto err_engine_crypto_start;
+
+	tasklet_init(&crypto_engine->done_task, aspeed_hace_crypto_done_task,
+		     (unsigned long)hace_dev);
+
 	/* Allocate DMA buffer for hash engine input used */
 	hash_engine->ahash_src_addr =
 		dmam_alloc_coherent(&pdev->dev,
@@ -162,7 +208,45 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 	if (!hash_engine->ahash_src_addr) {
 		dev_err(&pdev->dev, "Failed to allocate dma buffer\n");
 		rc = -ENOMEM;
-		goto err_engine_hash_start;
+		goto err_engine_crypto_start;
+	}
+
+	/* Allocate DMA buffer for crypto engine context used */
+	crypto_engine->cipher_ctx =
+		dmam_alloc_coherent(&pdev->dev,
+				    PAGE_SIZE,
+				    &crypto_engine->cipher_ctx_dma,
+				    GFP_KERNEL);
+	if (!crypto_engine->cipher_ctx) {
+		dev_err(&pdev->dev, "Failed to allocate cipher ctx dma\n");
+		rc = -ENOMEM;
+		goto err_engine_crypto_start;
+	}
+
+	/* Allocate DMA buffer for crypto engine input used */
+	crypto_engine->cipher_addr =
+		dmam_alloc_coherent(&pdev->dev,
+				    ASPEED_CRYPTO_SRC_DMA_BUF_LEN,
+				    &crypto_engine->cipher_dma_addr,
+				    GFP_KERNEL);
+	if (!crypto_engine->cipher_addr) {
+		dev_err(&pdev->dev, "Failed to allocate cipher addr dma\n");
+		rc = -ENOMEM;
+		goto err_engine_crypto_start;
+	}
+
+	/* Allocate DMA buffer for crypto engine output used */
+	if (hace_dev->version == AST2600_VERSION) {
+		crypto_engine->dst_sg_addr =
+			dmam_alloc_coherent(&pdev->dev,
+					    ASPEED_CRYPTO_DST_DMA_BUF_LEN,
+					    &crypto_engine->dst_sg_dma_addr,
+					    GFP_KERNEL);
+		if (!crypto_engine->dst_sg_addr) {
+			dev_err(&pdev->dev, "Failed to allocate dst_sg dma\n");
+			rc = -ENOMEM;
+			goto err_engine_crypto_start;
+		}
 	}
 
 	aspeed_hace_register(hace_dev);
@@ -171,6 +255,8 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 
 	return 0;
 
+err_engine_crypto_start:
+	crypto_engine_exit(hace_dev->crypt_engine_crypto);
 err_engine_hash_start:
 	crypto_engine_exit(hace_dev->crypt_engine_hash);
 clk_exit:
@@ -182,13 +268,16 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 static int aspeed_hace_remove(struct platform_device *pdev)
 {
 	struct aspeed_hace_dev *hace_dev = platform_get_drvdata(pdev);
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
 	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
 
 	aspeed_hace_unregister(hace_dev);
 
 	crypto_engine_exit(hace_dev->crypt_engine_hash);
+	crypto_engine_exit(hace_dev->crypt_engine_crypto);
 
 	tasklet_kill(&hash_engine->done_task);
+	tasklet_kill(&crypto_engine->done_task);
 
 	clk_disable_unprepare(hace_dev->clk);
 
diff --git a/drivers/crypto/aspeed/aspeed-hace.h b/drivers/crypto/aspeed/aspeed-hace.h
index 3494ff22f69d..f2cde23b56ae 100644
--- a/drivers/crypto/aspeed/aspeed-hace.h
+++ b/drivers/crypto/aspeed/aspeed-hace.h
@@ -7,9 +7,12 @@
 #include <linux/err.h>
 #include <linux/fips.h>
 #include <linux/dma-mapping.h>
+#include <crypto/aes.h>
+#include <crypto/des.h>
 #include <crypto/scatterwalk.h>
 #include <crypto/internal/aead.h>
 #include <crypto/internal/akcipher.h>
+#include <crypto/internal/des.h>
 #include <crypto/internal/hash.h>
 #include <crypto/internal/kpp.h>
 #include <crypto/internal/skcipher.h>
@@ -24,15 +27,75 @@
  * HACE register definitions *
  *                           *
  * ***************************/
+#define ASPEED_HACE_SRC			0x00	/* Crypto Data Source Base Address Register */
+#define ASPEED_HACE_DEST		0x04	/* Crypto Data Destination Base Address Register */
+#define ASPEED_HACE_CONTEXT		0x08	/* Crypto Context Buffer Base Address Register */
+#define ASPEED_HACE_DATA_LEN		0x0C	/* Crypto Data Length Register */
+#define ASPEED_HACE_CMD			0x10	/* Crypto Engine Command Register */
+
+/* G5 */
+#define ASPEED_HACE_TAG			0x18	/* HACE Tag Register */
+/* G6 */
+#define ASPEED_HACE_GCM_ADD_LEN		0x14	/* Crypto AES-GCM Additional Data Length Register */
+#define ASPEED_HACE_GCM_TAG_BASE_ADDR	0x18	/* Crypto AES-GCM Tag Write Buff Base Address Reg */
 
 #define ASPEED_HACE_STS			0x1C	/* HACE Status Register */
+
 #define ASPEED_HACE_HASH_SRC		0x20	/* Hash Data Source Base Address Register */
 #define ASPEED_HACE_HASH_DIGEST_BUFF	0x24	/* Hash Digest Write Buffer Base Address Register */
 #define ASPEED_HACE_HASH_KEY_BUFF	0x28	/* Hash HMAC Key Buffer Base Address Register */
 #define ASPEED_HACE_HASH_DATA_LEN	0x2C	/* Hash Data Length Register */
 #define ASPEED_HACE_HASH_CMD		0x30	/* Hash Engine Command Register */
 
+/* crypto cmd */
+#define  HACE_CMD_SINGLE_DES		0
+#define  HACE_CMD_TRIPLE_DES		BIT(17)
+#define  HACE_CMD_AES_SELECT		0
+#define  HACE_CMD_DES_SELECT		BIT(16)
+#define  HACE_CMD_ISR_EN		BIT(12)
+#define  HACE_CMD_CONTEXT_SAVE_ENABLE	(0)
+#define  HACE_CMD_CONTEXT_SAVE_DISABLE	BIT(9)
+#define  HACE_CMD_AES			(0)
+#define  HACE_CMD_DES			(0)
+#define  HACE_CMD_RC4			BIT(8)
+#define  HACE_CMD_DECRYPT		(0)
+#define  HACE_CMD_ENCRYPT		BIT(7)
+
+#define  HACE_CMD_ECB			(0x0 << 4)
+#define  HACE_CMD_CBC			(0x1 << 4)
+#define  HACE_CMD_CFB			(0x2 << 4)
+#define  HACE_CMD_OFB			(0x3 << 4)
+#define  HACE_CMD_CTR			(0x4 << 4)
+#define  HACE_CMD_OP_MODE_MASK		(0x7 << 4)
+
+#define  HACE_CMD_AES128		(0x0 << 2)
+#define  HACE_CMD_AES192		(0x1 << 2)
+#define  HACE_CMD_AES256		(0x2 << 2)
+#define  HACE_CMD_OP_CASCADE		(0x3)
+#define  HACE_CMD_OP_INDEPENDENT	(0x1)
+
+/* G5 */
+#define  HACE_CMD_RI_WO_DATA_ENABLE	(0)
+#define  HACE_CMD_RI_WO_DATA_DISABLE	BIT(11)
+#define  HACE_CMD_CONTEXT_LOAD_ENABLE	(0)
+#define  HACE_CMD_CONTEXT_LOAD_DISABLE	BIT(10)
+/* G6 */
+#define  HACE_CMD_AES_KEY_FROM_OTP	BIT(24)
+#define  HACE_CMD_GHASH_TAG_XOR_EN	BIT(23)
+#define  HACE_CMD_GHASH_PAD_LEN_INV	BIT(22)
+#define  HACE_CMD_GCM_TAG_ADDR_SEL	BIT(21)
+#define  HACE_CMD_MBUS_REQ_SYNC_EN	BIT(20)
+#define  HACE_CMD_DES_SG_CTRL		BIT(19)
+#define  HACE_CMD_SRC_SG_CTRL		BIT(18)
+#define  HACE_CMD_CTR_IV_AES_96		(0x1 << 14)
+#define  HACE_CMD_CTR_IV_DES_32		(0x1 << 14)
+#define  HACE_CMD_CTR_IV_AES_64		(0x2 << 14)
+#define  HACE_CMD_CTR_IV_AES_32		(0x3 << 14)
+#define  HACE_CMD_AES_KEY_HW_EXP	BIT(13)
+#define  HACE_CMD_GCM			(0x5 << 4)
+
 /* interrupt status reg */
+#define  HACE_CRYPTO_ISR		BIT(12)
 #define  HACE_HASH_ISR			BIT(9)
 #define  HACE_HASH_BUSY			BIT(0)
 
@@ -77,6 +140,9 @@
 #define ASPEED_HASH_SRC_DMA_BUF_LEN	0xa000
 #define ASPEED_HASH_QUEUE_LENGTH	50
 
+#define HACE_CMD_IV_REQUIRE		(HACE_CMD_CBC | HACE_CMD_CFB | \
+					 HACE_CMD_OFB | HACE_CMD_CTR)
+
 struct aspeed_hace_dev;
 
 typedef int (*aspeed_hace_fn_t)(struct aspeed_hace_dev *);
@@ -147,6 +213,48 @@ struct aspeed_sham_reqctx {
 	u64			digcnt[2];
 };
 
+struct aspeed_engine_crypto {
+	struct tasklet_struct		done_task;
+	unsigned long			flags;
+	struct skcipher_request		*req;
+
+	/* context buffer */
+	void				*cipher_ctx;
+	dma_addr_t			cipher_ctx_dma;
+
+	/* input buffer, could be single/scatter-gather lists */
+	void				*cipher_addr;
+	dma_addr_t			cipher_dma_addr;
+
+	/* output buffer, only used in scatter-gather lists */
+	void				*dst_sg_addr;
+	dma_addr_t			dst_sg_dma_addr;
+
+	/* callback func */
+	aspeed_hace_fn_t		resume;
+};
+
+struct aspeed_cipher_ctx {
+	struct crypto_engine_ctx	enginectx;
+
+	struct aspeed_hace_dev		*hace_dev;
+	int				key_len;
+	u8				key[AES_MAX_KEYLENGTH];
+
+	/* callback func */
+	aspeed_hace_fn_t		start;
+
+	struct crypto_skcipher          *fallback_tfm;
+};
+
+struct aspeed_cipher_reqctx {
+	int enc_cmd;
+	int src_nents;
+	int dst_nents;
+
+	struct skcipher_request         fallback_req;   /* keep at the end */
+};
+
 struct aspeed_hace_dev {
 	void __iomem			*regs;
 	struct device			*dev;
@@ -155,8 +263,10 @@ struct aspeed_hace_dev {
 	unsigned long			version;
 
 	struct crypto_engine		*crypt_engine_hash;
+	struct crypto_engine		*crypt_engine_crypto;
 
 	struct aspeed_engine_hash	hash_engine;
+	struct aspeed_engine_crypto	crypto_engine;
 };
 
 struct aspeed_hace_alg {
@@ -182,5 +292,7 @@ enum aspeed_version {
 
 void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
 void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
+void aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev);
+void aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev *hace_dev);
 
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver
@ 2022-07-26 11:34   ` Neal Liu
  0 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-26 11:34 UTC (permalink / raw)
  To: Corentin Labbe, Christophe JAILLET, Randy Dunlap, Herbert Xu,
	David S . Miller, Rob Herring, Krzysztof Kozlowski, Joel Stanley,
	Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW

Add HACE crypto driver to support symmetric-key
encryption and decryption with multiple modes of
operation.
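
These are exposed as standard skcipher algorithms, so any in-kernel user
can drive the engine through the usual crypto API. A minimal sketch of
such a caller, assuming "cbc(aes)" resolves to this driver's
aspeed-cbc-aes and that buf points at linear, DMA-capable memory (the
example_* name is illustrative only):

#include <crypto/skcipher.h>
#include <linux/scatterlist.h>

/* Encrypt a buffer in place; len must be a multiple of AES_BLOCK_SIZE. */
static int example_cbc_aes_encrypt(u8 *buf, unsigned int len,
				   const u8 *key, unsigned int keylen, u8 *iv)
{
	struct crypto_skcipher *tfm;
	struct skcipher_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int ret;

	tfm = crypto_alloc_skcipher("cbc(aes)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	ret = crypto_skcipher_setkey(tfm, key, keylen);
	if (ret)
		goto free_tfm;

	req = skcipher_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		ret = -ENOMEM;
		goto free_tfm;
	}

	sg_init_one(&sg, buf, len);
	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
				      CRYPTO_TFM_REQ_MAY_SLEEP,
				      crypto_req_done, &wait);
	skcipher_request_set_crypt(req, &sg, &sg, len, iv);

	/* HACE completes asynchronously; block until the engine is done. */
	ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);

	skcipher_request_free(req);
free_tfm:
	crypto_free_skcipher(tfm);
	return ret;
}

On AST2500, zero-length or non-block-aligned requests are handed to the
software fallback by aspeed_crypto_need_fallback(); block-aligned
requests such as the one above are queued to the hardware engine.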

Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
---
 drivers/crypto/aspeed/Kconfig              |   26 +
 drivers/crypto/aspeed/Makefile             |    7 +-
 drivers/crypto/aspeed/aspeed-hace-crypto.c | 1121 ++++++++++++++++++++
 drivers/crypto/aspeed/aspeed-hace.c        |   91 +-
 drivers/crypto/aspeed/aspeed-hace.h        |  112 ++
 5 files changed, 1354 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/aspeed/aspeed-hace-crypto.c

diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
index 059e627efef8..f19994915a5e 100644
--- a/drivers/crypto/aspeed/Kconfig
+++ b/drivers/crypto/aspeed/Kconfig
@@ -30,3 +30,29 @@ config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
 	  to ask for those messages.
 	  Avoid enabling this option for production build to
 	  minimize driver timing.
+
+config CRYPTO_DEV_ASPEED_HACE_CRYPTO
+	bool "Enable Aspeed Hash & Crypto Engine (HACE) crypto"
+	depends on CRYPTO_DEV_ASPEED
+	select CRYPTO_ENGINE
+	select CRYPTO_AES
+	select CRYPTO_DES
+	select CRYPTO_ECB
+	select CRYPTO_CBC
+	select CRYPTO_CFB
+	select CRYPTO_OFB
+	select CRYPTO_CTR
+	help
+	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
+	  crypto driver.
+	  Supports AES/DES symmetric-key encryption and decryption
+	  with ECB/CBC/CFB/OFB/CTR options.
+
+config CRYPTO_DEV_ASPEED_HACE_CRYPTO_DEBUG
+	bool "Enable HACE crypto debug messages"
+	depends on CRYPTO_DEV_ASPEED_HACE_CRYPTO
+	help
+	  Print HACE crypto debugging messages if you use this option
+	  to ask for those messages.
+	  Avoid enabling this option in production builds to
+	  minimize driver overhead.
diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
index 8bc8d4fed5a9..421e2ca9c53e 100644
--- a/drivers/crypto/aspeed/Makefile
+++ b/drivers/crypto/aspeed/Makefile
@@ -1,6 +1,9 @@
 obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
-aspeed_crypto-objs := aspeed-hace.o \
-		      $(hace-hash-y)
+aspeed_crypto-objs := aspeed-hace.o	\
+		      $(hace-hash-y)	\
+		      $(hace-crypto-y)
 
 obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
 hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
+obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) += aspeed-hace-crypto.o
+hace-crypto-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) := aspeed-hace-crypto.o
diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c b/drivers/crypto/aspeed/aspeed-hace-crypto.c
new file mode 100644
index 000000000000..1d019b523399
--- /dev/null
+++ b/drivers/crypto/aspeed/aspeed-hace-crypto.c
@@ -0,0 +1,1121 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 Aspeed Technology Inc.
+ */
+
+#include "aspeed-hace.h"
+
+#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO_DEBUG
+#define CIPHER_DBG(h, fmt, ...)	\
+	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#else
+#define CIPHER_DBG(h, fmt, ...)	\
+	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
+#endif
+
+static int aspeed_crypto_do_fallback(struct skcipher_request *areq)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(areq);
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
+	int err;
+
+	skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback_tfm);
+	skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+				      areq->base.complete, areq->base.data);
+	skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
+				   areq->cryptlen, areq->iv);
+
+	if (rctx->enc_cmd & HACE_CMD_ENCRYPT)
+		err = crypto_skcipher_encrypt(&rctx->fallback_req);
+	else
+		err = crypto_skcipher_decrypt(&rctx->fallback_req);
+
+	return err;
+}
+
+static bool aspeed_crypto_need_fallback(struct skcipher_request *areq)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(areq);
+
+	if (areq->cryptlen == 0)
+		return true;
+
+	if ((rctx->enc_cmd & HACE_CMD_DES_SELECT) &&
+	    !IS_ALIGNED(areq->cryptlen, DES_BLOCK_SIZE))
+		return true;
+
+	if ((!(rctx->enc_cmd & HACE_CMD_DES_SELECT)) &&
+	    !IS_ALIGNED(areq->cryptlen, AES_BLOCK_SIZE))
+		return true;
+
+	return false;
+}
+
+static int aspeed_hace_crypto_handle_queue(struct aspeed_hace_dev *hace_dev,
+					   struct skcipher_request *req)
+{
+	if (hace_dev->version == AST2500_VERSION &&
+	    aspeed_crypto_need_fallback(req)) {
+		CIPHER_DBG(hace_dev, "SW fallback\n");
+		return aspeed_crypto_do_fallback(req);
+	}
+
+	return crypto_transfer_skcipher_request_to_engine(
+			hace_dev->crypt_engine_crypto, req);
+}
+
+static int aspeed_crypto_do_request(struct crypto_engine *engine, void *areq)
+{
+	struct skcipher_request *req = skcipher_request_cast(areq);
+	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	struct aspeed_engine_crypto *crypto_engine;
+	int rc;
+
+	crypto_engine = &hace_dev->crypto_engine;
+	crypto_engine->req = req;
+	crypto_engine->flags |= CRYPTO_FLAGS_BUSY;
+
+	rc = ctx->start(hace_dev);
+
+	if (rc != -EINPROGRESS)
+		return -EIO;
+
+	return 0;
+}
+
+static int aspeed_sk_complete(struct aspeed_hace_dev *hace_dev, int err)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+
+	if (rctx->enc_cmd & HACE_CMD_IV_REQUIRE) {
+		if (rctx->enc_cmd & HACE_CMD_DES_SELECT)
+			memcpy(req->iv, crypto_engine->cipher_ctx +
+			       DES_KEY_SIZE, DES_KEY_SIZE);
+		else
+			memcpy(req->iv, crypto_engine->cipher_ctx,
+			       AES_BLOCK_SIZE);
+	}
+
+	crypto_engine->flags &= ~CRYPTO_FLAGS_BUSY;
+
+	crypto_finalize_skcipher_request(hace_dev->crypt_engine_crypto, req,
+					 err);
+
+	return err;
+}
+
+static int aspeed_sk_transfer_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct device *dev = hace_dev->dev;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+
+	if (req->src == req->dst) {
+		dma_unmap_sg(dev, req->src, rctx->src_nents, DMA_BIDIRECTIONAL);
+	} else {
+		dma_unmap_sg(dev, req->src, rctx->src_nents, DMA_TO_DEVICE);
+		dma_unmap_sg(dev, req->dst, rctx->dst_nents, DMA_FROM_DEVICE);
+	}
+
+	return aspeed_sk_complete(hace_dev, 0);
+}
+
+static int aspeed_sk_transfer(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+	struct scatterlist *out_sg;
+	int nbytes = 0;
+	int rc = 0;
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+	out_sg = req->dst;
+
+	/* Copy output buffer to dst scatter-gather lists */
+	nbytes = sg_copy_from_buffer(out_sg, rctx->dst_nents,
+				     crypto_engine->cipher_addr, req->cryptlen);
+	if (!nbytes) {
+		dev_warn(hace_dev->dev, "invalid sg copy, %s:0x%x, %s:0x%x\n",
+			 "nbytes", nbytes, "cryptlen", req->cryptlen);
+		rc = -EINVAL;
+	}
+
+	CIPHER_DBG(hace_dev, "%s:%d, %s:%d, %s:%d, %s:%p\n",
+		   "nbytes", nbytes, "req->cryptlen", req->cryptlen,
+		   "nb_out_sg", rctx->dst_nents,
+		   "cipher addr", crypto_engine->cipher_addr);
+
+	return aspeed_sk_complete(hace_dev, rc);
+}
+
+static int aspeed_sk_start(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+	struct scatterlist *in_sg;
+	int nbytes;
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+	in_sg = req->src;
+
+	nbytes = sg_copy_to_buffer(in_sg, rctx->src_nents,
+				   crypto_engine->cipher_addr, req->cryptlen);
+
+	CIPHER_DBG(hace_dev, "%s:%d, %s:%d, %s:%d, %s:%p\n",
+		   "nbytes", nbytes, "req->cryptlen", req->cryptlen,
+		   "nb_in_sg", rctx->src_nents,
+		   "cipher addr", crypto_engine->cipher_addr);
+
+	if (!nbytes) {
+		dev_warn(hace_dev->dev, "invalid sg copy, %s:0x%x, %s:0x%x\n",
+			 "nbytes", nbytes, "cryptlen", req->cryptlen);
+		return -EINVAL;
+	}
+
+	crypto_engine->resume = aspeed_sk_transfer;
+
+	/* Trigger engines */
+	ast_hace_write(hace_dev, crypto_engine->cipher_dma_addr,
+		       ASPEED_HACE_SRC);
+	ast_hace_write(hace_dev, crypto_engine->cipher_dma_addr,
+		       ASPEED_HACE_DEST);
+	ast_hace_write(hace_dev, req->cryptlen, ASPEED_HACE_DATA_LEN);
+	ast_hace_write(hace_dev, rctx->enc_cmd, ASPEED_HACE_CMD);
+
+	return -EINPROGRESS;
+}
+
+static int aspeed_sk_start_sg(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_sg_list *src_list, *dst_list;
+	dma_addr_t src_dma_addr, dst_dma_addr;
+	struct aspeed_cipher_reqctx *rctx;
+	struct skcipher_request *req;
+	struct scatterlist *s;
+	int src_sg_len;
+	int dst_sg_len;
+	int total, i;
+	int rc;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+
+	rctx->enc_cmd |= HACE_CMD_DES_SG_CTRL | HACE_CMD_SRC_SG_CTRL |
+			 HACE_CMD_AES_KEY_HW_EXP | HACE_CMD_MBUS_REQ_SYNC_EN;
+
+	/* BIDIRECTIONAL */
+	if (req->dst == req->src) {
+		src_sg_len = dma_map_sg(hace_dev->dev, req->src,
+					rctx->src_nents, DMA_BIDIRECTIONAL);
+		dst_sg_len = src_sg_len;
+		if (!src_sg_len) {
+			dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
+			return -EINVAL;
+		}
+
+	} else {
+		src_sg_len = dma_map_sg(hace_dev->dev, req->src,
+					rctx->src_nents, DMA_TO_DEVICE);
+		if (!src_sg_len) {
+			dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
+			return -EINVAL;
+		}
+
+		dst_sg_len = dma_map_sg(hace_dev->dev, req->dst,
+					rctx->dst_nents, DMA_FROM_DEVICE);
+		if (!dst_sg_len) {
+			dev_warn(hace_dev->dev, "dma_map_sg() dst error\n");
+			rc = -EINVAL;
+			goto free_req_src;
+		}
+	}
+
+	src_list = (struct aspeed_sg_list *)crypto_engine->cipher_addr;
+	src_dma_addr = crypto_engine->cipher_dma_addr;
+	total = req->cryptlen;
+
+	for_each_sg(req->src, s, src_sg_len, i) {
+		src_list[i].phy_addr = sg_dma_address(s);
+
+		if (total > sg_dma_len(s)) {
+			src_list[i].len = sg_dma_len(s);
+			total -= src_list[i].len;
+
+		} else {
+			/* last sg list */
+			src_list[i].len = total;
+			src_list[i].len |= BIT(31);
+			total = 0;
+		}
+
+		src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
+		src_list[i].len = cpu_to_le32(src_list[i].len);
+	}
+
+	if (total != 0) {
+		rc = -EINVAL;
+		goto free_req;
+	}
+
+	if (req->dst == req->src) {
+		dst_list = src_list;
+		dst_dma_addr = src_dma_addr;
+
+	} else {
+		dst_list = (struct aspeed_sg_list *)crypto_engine->dst_sg_addr;
+		dst_dma_addr = crypto_engine->dst_sg_dma_addr;
+		total = req->cryptlen;
+
+		for_each_sg(req->dst, s, dst_sg_len, i) {
+			dst_list[i].phy_addr = sg_dma_address(s);
+
+			if (total > sg_dma_len(s)) {
+				dst_list[i].len = sg_dma_len(s);
+				total -= dst_list[i].len;
+
+			} else {
+				/* last sg list */
+				dst_list[i].len = total;
+				dst_list[i].len |= BIT(31);
+				total = 0;
+			}
+
+			dst_list[i].phy_addr = cpu_to_le32(dst_list[i].phy_addr);
+			dst_list[i].len = cpu_to_le32(dst_list[i].len);
+
+		}
+
+		dst_list[dst_sg_len].phy_addr = 0;
+		dst_list[dst_sg_len].len = 0;
+	}
+
+	if (total != 0) {
+		rc = -EINVAL;
+		goto free_req;
+	}
+
+	crypto_engine->resume = aspeed_sk_transfer_sg;
+
+	/* Memory barrier to ensure all data setup before engine starts */
+	mb();
+
+	/* Trigger engines */
+	ast_hace_write(hace_dev, src_dma_addr, ASPEED_HACE_SRC);
+	ast_hace_write(hace_dev, dst_dma_addr, ASPEED_HACE_DEST);
+	ast_hace_write(hace_dev, req->cryptlen, ASPEED_HACE_DATA_LEN);
+	ast_hace_write(hace_dev, rctx->enc_cmd, ASPEED_HACE_CMD);
+
+	return -EINPROGRESS;
+
+free_req:
+	if (req->dst == req->src) {
+		dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
+			     DMA_BIDIRECTIONAL);
+
+	} else {
+		dma_unmap_sg(hace_dev->dev, req->dst, rctx->dst_nents,
+			     DMA_TO_DEVICE);
+		dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
+			     DMA_TO_DEVICE);
+	}
+
+	return rc;
+
+free_req_src:
+	dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents, DMA_TO_DEVICE);
+
+	return rc;
+}
+
+static int aspeed_hace_skcipher_trigger(struct aspeed_hace_dev *hace_dev)
+{
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+	struct aspeed_cipher_reqctx *rctx;
+	struct crypto_skcipher *cipher;
+	struct aspeed_cipher_ctx *ctx;
+	struct skcipher_request *req;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	req = crypto_engine->req;
+	rctx = skcipher_request_ctx(req);
+	cipher = crypto_skcipher_reqtfm(req);
+	ctx = crypto_skcipher_ctx(cipher);
+
+	/* enable interrupt */
+	rctx->enc_cmd |= HACE_CMD_ISR_EN;
+
+	rctx->dst_nents = sg_nents(req->dst);
+	rctx->src_nents = sg_nents(req->src);
+
+	ast_hace_write(hace_dev, crypto_engine->cipher_ctx_dma,
+		       ASPEED_HACE_CONTEXT);
+
+	if (rctx->enc_cmd & HACE_CMD_IV_REQUIRE) {
+		if (rctx->enc_cmd & HACE_CMD_DES_SELECT)
+			memcpy(crypto_engine->cipher_ctx + DES_BLOCK_SIZE,
+			       req->iv, DES_BLOCK_SIZE);
+		else
+			memcpy(crypto_engine->cipher_ctx, req->iv,
+			       AES_BLOCK_SIZE);
+	}
+
+	if (hace_dev->version == AST2600_VERSION) {
+		memcpy(crypto_engine->cipher_ctx + 16, ctx->key, ctx->key_len);
+
+		return aspeed_sk_start_sg(hace_dev);
+	}
+
+	memcpy(crypto_engine->cipher_ctx + 16, ctx->key, AES_MAX_KEYLENGTH);
+
+	return aspeed_sk_start(hace_dev);
+}
+
+static int aspeed_des_crypt(struct skcipher_request *req, u32 cmd)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(req);
+	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	u32 crypto_alg = cmd & HACE_CMD_OP_MODE_MASK;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	if (crypto_alg == HACE_CMD_CBC || crypto_alg == HACE_CMD_ECB) {
+		if (!IS_ALIGNED(req->cryptlen, DES_BLOCK_SIZE))
+			return -EINVAL;
+	}
+
+	rctx->enc_cmd = cmd | HACE_CMD_DES_SELECT | HACE_CMD_RI_WO_DATA_ENABLE |
+			HACE_CMD_DES | HACE_CMD_CONTEXT_LOAD_ENABLE |
+			HACE_CMD_CONTEXT_SAVE_ENABLE;
+
+	return aspeed_hace_crypto_handle_queue(hace_dev, req);
+}
+
+static int aspeed_des_setkey(struct crypto_skcipher *cipher, const u8 *key,
+			     unsigned int keylen)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct crypto_tfm *tfm = crypto_skcipher_tfm(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	int rc;
+
+	CIPHER_DBG(hace_dev, "keylen: %d bits\n", (keylen * 8));
+
+	if (keylen != DES_KEY_SIZE && keylen != DES3_EDE_KEY_SIZE) {
+		dev_warn(hace_dev->dev, "invalid keylen: %d bits\n", (keylen * 8));
+		return -EINVAL;
+	}
+
+	if (keylen == DES_KEY_SIZE) {
+		rc = crypto_des_verify_key(tfm, key);
+		if (rc)
+			return rc;
+
+	} else if (keylen == DES3_EDE_KEY_SIZE) {
+		rc = crypto_des3_ede_verify_key(tfm, key);
+		if (rc)
+			return rc;
+	}
+
+	memcpy(ctx->key, key, keylen);
+	ctx->key_len = keylen;
+
+	crypto_skcipher_clear_flags(ctx->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(ctx->fallback_tfm, cipher->base.crt_flags &
+				  CRYPTO_TFM_REQ_MASK);
+
+	return crypto_skcipher_setkey(ctx->fallback_tfm, key, keylen);
+}
+
+static int aspeed_tdes_ctr_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CTR |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ctr_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CTR |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ofb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_OFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ofb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_OFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cfb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cfb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CFB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cbc_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CBC |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_cbc_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CBC |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ecb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_ECB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_tdes_ecb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_ECB |
+				HACE_CMD_TRIPLE_DES);
+}
+
+static int aspeed_des_ctr_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CTR |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ctr_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CTR |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ofb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_OFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ofb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_OFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cfb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cfb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CFB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cbc_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CBC |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_cbc_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CBC |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ecb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_ECB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_des_ecb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_des_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_ECB |
+				HACE_CMD_SINGLE_DES);
+}
+
+static int aspeed_aes_crypt(struct skcipher_request *req, u32 cmd)
+{
+	struct aspeed_cipher_reqctx *rctx = skcipher_request_ctx(req);
+	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req);
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	u32 crypto_alg = cmd & HACE_CMD_OP_MODE_MASK;
+
+	if (crypto_alg == HACE_CMD_CBC || crypto_alg == HACE_CMD_ECB) {
+		if (!IS_ALIGNED(req->cryptlen, AES_BLOCK_SIZE))
+			return -EINVAL;
+	}
+
+	CIPHER_DBG(hace_dev, "%s\n",
+		   (cmd & HACE_CMD_ENCRYPT) ? "encrypt" : "decrypt");
+
+	cmd |= HACE_CMD_AES_SELECT | HACE_CMD_RI_WO_DATA_ENABLE |
+	       HACE_CMD_CONTEXT_LOAD_ENABLE | HACE_CMD_CONTEXT_SAVE_ENABLE;
+
+	switch (ctx->key_len) {
+	case AES_KEYSIZE_128:
+		cmd |= HACE_CMD_AES128;
+		break;
+	case AES_KEYSIZE_192:
+		cmd |= HACE_CMD_AES192;
+		break;
+	case AES_KEYSIZE_256:
+		cmd |= HACE_CMD_AES256;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	rctx->enc_cmd = cmd;
+
+	return aspeed_hace_crypto_handle_queue(hace_dev, req);
+}
+
+static int aspeed_aes_setkey(struct crypto_skcipher *cipher, const u8 *key,
+			     unsigned int keylen)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(cipher);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+	struct crypto_aes_ctx gen_aes_key;
+
+	CIPHER_DBG(hace_dev, "keylen: %d bits\n", (keylen * 8));
+
+	if (keylen != AES_KEYSIZE_128 && keylen != AES_KEYSIZE_192 &&
+	    keylen != AES_KEYSIZE_256)
+		return -EINVAL;
+
+	if (ctx->hace_dev->version == AST2500_VERSION) {
+		aes_expandkey(&gen_aes_key, key, keylen);
+		memcpy(ctx->key, gen_aes_key.key_enc, AES_MAX_KEYLENGTH);
+
+	} else {
+		memcpy(ctx->key, key, keylen);
+	}
+
+	ctx->key_len = keylen;
+
+	crypto_skcipher_clear_flags(ctx->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(ctx->fallback_tfm, cipher->base.crt_flags &
+				  CRYPTO_TFM_REQ_MASK);
+
+	return crypto_skcipher_setkey(ctx->fallback_tfm, key, keylen);
+}
+
+static int aspeed_aes_ctr_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CTR);
+}
+
+static int aspeed_aes_ctr_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CTR);
+}
+
+static int aspeed_aes_ofb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_OFB);
+}
+
+static int aspeed_aes_ofb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_OFB);
+}
+
+static int aspeed_aes_cfb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CFB);
+}
+
+static int aspeed_aes_cfb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CFB);
+}
+
+static int aspeed_aes_cbc_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_CBC);
+}
+
+static int aspeed_aes_cbc_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_CBC);
+}
+
+static int aspeed_aes_ecb_decrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_DECRYPT | HACE_CMD_ECB);
+}
+
+static int aspeed_aes_ecb_encrypt(struct skcipher_request *req)
+{
+	return aspeed_aes_crypt(req, HACE_CMD_ENCRYPT | HACE_CMD_ECB);
+}
+
+static int aspeed_crypto_cra_init(struct crypto_skcipher *tfm)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
+	const char *name = crypto_tfm_alg_name(&tfm->base);
+	struct aspeed_hace_alg *crypto_alg;
+
+
+	crypto_alg = container_of(alg, struct aspeed_hace_alg, alg.skcipher);
+	ctx->hace_dev = crypto_alg->hace_dev;
+	ctx->start = aspeed_hace_skcipher_trigger;
+
+	CIPHER_DBG(ctx->hace_dev, "%s\n", name);
+
+	ctx->fallback_tfm = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_ASYNC |
+						  CRYPTO_ALG_NEED_FALLBACK);
+	if (IS_ERR(ctx->fallback_tfm)) {
+		dev_err(ctx->hace_dev->dev, "ERROR: Cannot allocate fallback for %s %ld\n",
+			name, PTR_ERR(ctx->fallback_tfm));
+		return PTR_ERR(ctx->fallback_tfm);
+	}
+
+	crypto_skcipher_set_reqsize(tfm, sizeof(struct aspeed_cipher_reqctx) +
+			 crypto_skcipher_reqsize(ctx->fallback_tfm));
+
+	ctx->enginectx.op.do_one_request = aspeed_crypto_do_request;
+	ctx->enginectx.op.prepare_request = NULL;
+	ctx->enginectx.op.unprepare_request = NULL;
+
+	return 0;
+}
+
+static void aspeed_crypto_cra_exit(struct crypto_skcipher *tfm)
+{
+	struct aspeed_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct aspeed_hace_dev *hace_dev = ctx->hace_dev;
+
+	CIPHER_DBG(hace_dev, "%s\n", crypto_tfm_alg_name(&tfm->base));
+	crypto_free_skcipher(ctx->fallback_tfm);
+}
+
+struct aspeed_hace_alg aspeed_crypto_algs[] = {
+	{
+		.alg.skcipher = {
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_ecb_encrypt,
+			.decrypt	= aspeed_aes_ecb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ecb(aes)",
+				.cra_driver_name	= "aspeed-ecb-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= AES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_cbc_encrypt,
+			.decrypt	= aspeed_aes_cbc_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cbc(aes)",
+				.cra_driver_name	= "aspeed-cbc-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= AES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_cfb_encrypt,
+			.decrypt	= aspeed_aes_cfb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cfb(aes)",
+				.cra_driver_name	= "aspeed-cfb-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_ofb_encrypt,
+			.decrypt	= aspeed_aes_ofb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ofb(aes)",
+				.cra_driver_name	= "aspeed-ofb-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_ecb_encrypt,
+			.decrypt	= aspeed_des_ecb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ecb(des)",
+				.cra_driver_name	= "aspeed-ecb-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_cbc_encrypt,
+			.decrypt	= aspeed_des_cbc_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cbc(des)",
+				.cra_driver_name	= "aspeed-cbc-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_cfb_encrypt,
+			.decrypt	= aspeed_des_cfb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cfb(des)",
+				.cra_driver_name	= "aspeed-cfb-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_ofb_encrypt,
+			.decrypt	= aspeed_des_ofb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ofb(des)",
+				.cra_driver_name	= "aspeed-ofb-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_ecb_encrypt,
+			.decrypt	= aspeed_tdes_ecb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ecb(des3_ede)",
+				.cra_driver_name	= "aspeed-ecb-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_cbc_encrypt,
+			.decrypt	= aspeed_tdes_cbc_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cbc(des3_ede)",
+				.cra_driver_name	= "aspeed-cbc-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_cfb_encrypt,
+			.decrypt	= aspeed_tdes_cfb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "cfb(des3_ede)",
+				.cra_driver_name	= "aspeed-cfb-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_ofb_encrypt,
+			.decrypt	= aspeed_tdes_ofb_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ofb(des3_ede)",
+				.cra_driver_name	= "aspeed-ofb-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC |
+							  CRYPTO_ALG_NEED_FALLBACK,
+				.cra_blocksize		= DES_BLOCK_SIZE,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+};
+
+struct aspeed_hace_alg aspeed_crypto_algs_g6[] = {
+	{
+		.alg.skcipher = {
+			.ivsize		= AES_BLOCK_SIZE,
+			.min_keysize	= AES_MIN_KEY_SIZE,
+			.max_keysize	= AES_MAX_KEY_SIZE,
+			.setkey		= aspeed_aes_setkey,
+			.encrypt	= aspeed_aes_ctr_encrypt,
+			.decrypt	= aspeed_aes_ctr_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ctr(aes)",
+				.cra_driver_name	= "aspeed-ctr-aes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES_KEY_SIZE,
+			.max_keysize	= DES_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_des_ctr_encrypt,
+			.decrypt	= aspeed_des_ctr_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ctr(des)",
+				.cra_driver_name	= "aspeed-ctr-des",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+	{
+		.alg.skcipher = {
+			.ivsize		= DES_BLOCK_SIZE,
+			.min_keysize	= DES3_EDE_KEY_SIZE,
+			.max_keysize	= DES3_EDE_KEY_SIZE,
+			.setkey		= aspeed_des_setkey,
+			.encrypt	= aspeed_tdes_ctr_encrypt,
+			.decrypt	= aspeed_tdes_ctr_decrypt,
+			.init		= aspeed_crypto_cra_init,
+			.exit		= aspeed_crypto_cra_exit,
+			.base = {
+				.cra_name		= "ctr(des3_ede)",
+				.cra_driver_name	= "aspeed-ctr-tdes",
+				.cra_priority		= 300,
+				.cra_flags		= CRYPTO_ALG_KERN_DRIVER_ONLY |
+							  CRYPTO_ALG_ASYNC,
+				.cra_blocksize		= 1,
+				.cra_ctxsize		= sizeof(struct aspeed_cipher_ctx),
+				.cra_alignmask		= 0x0f,
+				.cra_module		= THIS_MODULE,
+			}
+		}
+	},
+
+};
+
+void aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
+{
+	int rc, i;
+
+	CIPHER_DBG(hace_dev, "\n");
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs); i++) {
+		aspeed_crypto_algs[i].hace_dev = hace_dev;
+		rc = crypto_register_skcipher(&aspeed_crypto_algs[i].alg.skcipher);
+		if (rc) {
+			CIPHER_DBG(hace_dev, "Failed to register %s\n",
+				   aspeed_crypto_algs[i].alg.skcipher.base.cra_name);
+		}
+	}
+
+	if (hace_dev->version != AST2600_VERSION)
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs_g6); i++) {
+		aspeed_crypto_algs_g6[i].hace_dev = hace_dev;
+		rc = crypto_register_skcipher(&aspeed_crypto_algs_g6[i].alg.skcipher);
+		if (rc) {
+			CIPHER_DBG(hace_dev, "Failed to register %s\n",
+				   aspeed_crypto_algs_g6[i].alg.skcipher.base.cra_name);
+		}
+	}
+}
diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
index 89b1585d72e2..efc0725ebf98 100644
--- a/drivers/crypto/aspeed/aspeed-hace.c
+++ b/drivers/crypto/aspeed/aspeed-hace.c
@@ -32,10 +32,22 @@ void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
 	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
 }
 
+/* Weak function for HACE crypto */
+void __weak aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
+void __weak aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
+{
+	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
+}
+
 /* HACE interrupt service routine */
 static irqreturn_t aspeed_hace_irq(int irq, void *dev)
 {
 	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
 	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
 	u32 sts;
 
@@ -51,9 +63,24 @@ static irqreturn_t aspeed_hace_irq(int irq, void *dev)
 			dev_warn(hace_dev->dev, "HASH no active requests.\n");
 	}
 
+	if (sts & HACE_CRYPTO_ISR) {
+		if (crypto_engine->flags & CRYPTO_FLAGS_BUSY)
+			tasklet_schedule(&crypto_engine->done_task);
+		else
+			dev_warn(hace_dev->dev, "CRYPTO no active requests.\n");
+	}
+
 	return IRQ_HANDLED;
 }
 
+static void aspeed_hace_crypto_done_task(unsigned long data)
+{
+	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
+
+	crypto_engine->resume(hace_dev);
+}
+
 static void aspeed_hace_hash_done_task(unsigned long data)
 {
 	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
@@ -65,11 +92,13 @@ static void aspeed_hace_hash_done_task(unsigned long data)
 static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
 {
 	aspeed_register_hace_hash_algs(hace_dev);
+	aspeed_register_hace_crypto_algs(hace_dev);
 }
 
 static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
 {
 	aspeed_unregister_hace_hash_algs(hace_dev);
+	aspeed_unregister_hace_crypto_algs(hace_dev);
 }
 
 static const struct of_device_id aspeed_hace_of_matches[] = {
@@ -80,6 +109,7 @@ static const struct of_device_id aspeed_hace_of_matches[] = {
 
 static int aspeed_hace_probe(struct platform_device *pdev)
 {
+	struct aspeed_engine_crypto *crypto_engine;
 	const struct of_device_id *hace_dev_id;
 	struct aspeed_engine_hash *hash_engine;
 	struct aspeed_hace_dev *hace_dev;
@@ -100,6 +130,7 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 	hace_dev->dev = &pdev->dev;
 	hace_dev->version = (unsigned long)hace_dev_id->data;
 	hash_engine = &hace_dev->hash_engine;
+	crypto_engine = &hace_dev->crypto_engine;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 
@@ -153,6 +184,21 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 	tasklet_init(&hash_engine->done_task, aspeed_hace_hash_done_task,
 		     (unsigned long)hace_dev);
 
+	/* Initialize crypto hardware engine structure for crypto */
+	hace_dev->crypt_engine_crypto = crypto_engine_alloc_init(hace_dev->dev,
+								 true);
+	if (!hace_dev->crypt_engine_crypto) {
+		rc = -ENOMEM;
+		goto err_engine_hash_start;
+	}
+
+	rc = crypto_engine_start(hace_dev->crypt_engine_crypto);
+	if (rc)
+		goto err_engine_crypto_start;
+
+	tasklet_init(&crypto_engine->done_task, aspeed_hace_crypto_done_task,
+		     (unsigned long)hace_dev);
+
 	/* Allocate DMA buffer for hash engine input used */
 	hash_engine->ahash_src_addr =
 		dmam_alloc_coherent(&pdev->dev,
@@ -162,7 +208,45 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 	if (!hash_engine->ahash_src_addr) {
 		dev_err(&pdev->dev, "Failed to allocate dma buffer\n");
 		rc = -ENOMEM;
-		goto err_engine_hash_start;
+		goto err_engine_crypto_start;
+	}
+
+	/* Allocate DMA buffer for crypto engine context used */
+	crypto_engine->cipher_ctx =
+		dmam_alloc_coherent(&pdev->dev,
+				    PAGE_SIZE,
+				    &crypto_engine->cipher_ctx_dma,
+				    GFP_KERNEL);
+	if (!crypto_engine->cipher_ctx) {
+		dev_err(&pdev->dev, "Failed to allocate cipher ctx dma\n");
+		rc = -ENOMEM;
+		goto err_engine_crypto_start;
+	}
+
+	/* Allocate DMA buffer for crypto engine input used */
+	crypto_engine->cipher_addr =
+		dmam_alloc_coherent(&pdev->dev,
+				    ASPEED_CRYPTO_SRC_DMA_BUF_LEN,
+				    &crypto_engine->cipher_dma_addr,
+				    GFP_KERNEL);
+	if (!crypto_engine->cipher_addr) {
+		dev_err(&pdev->dev, "Failed to allocate cipher addr dma\n");
+		rc = -ENOMEM;
+		goto err_engine_crypto_start;
+	}
+
+	/* Allocate DMA buffer for crypto engine output used */
+	if (hace_dev->version == AST2600_VERSION) {
+		crypto_engine->dst_sg_addr =
+			dmam_alloc_coherent(&pdev->dev,
+					    ASPEED_CRYPTO_DST_DMA_BUF_LEN,
+					    &crypto_engine->dst_sg_dma_addr,
+					    GFP_KERNEL);
+		if (!crypto_engine->dst_sg_addr) {
+			dev_err(&pdev->dev, "Failed to allocate dst_sg dma\n");
+			rc = -ENOMEM;
+			goto err_engine_crypto_start;
+		}
 	}
 
 	aspeed_hace_register(hace_dev);
@@ -171,6 +255,8 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 
 	return 0;
 
+err_engine_crypto_start:
+	crypto_engine_exit(hace_dev->crypt_engine_crypto);
 err_engine_hash_start:
 	crypto_engine_exit(hace_dev->crypt_engine_hash);
 clk_exit:
@@ -182,13 +268,16 @@ static int aspeed_hace_probe(struct platform_device *pdev)
 static int aspeed_hace_remove(struct platform_device *pdev)
 {
 	struct aspeed_hace_dev *hace_dev = platform_get_drvdata(pdev);
+	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
 	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
 
 	aspeed_hace_unregister(hace_dev);
 
 	crypto_engine_exit(hace_dev->crypt_engine_hash);
+	crypto_engine_exit(hace_dev->crypt_engine_crypto);
 
 	tasklet_kill(&hash_engine->done_task);
+	tasklet_kill(&crypto_engine->done_task);
 
 	clk_disable_unprepare(hace_dev->clk);
 
diff --git a/drivers/crypto/aspeed/aspeed-hace.h b/drivers/crypto/aspeed/aspeed-hace.h
index 3494ff22f69d..f2cde23b56ae 100644
--- a/drivers/crypto/aspeed/aspeed-hace.h
+++ b/drivers/crypto/aspeed/aspeed-hace.h
@@ -7,9 +7,12 @@
 #include <linux/err.h>
 #include <linux/fips.h>
 #include <linux/dma-mapping.h>
+#include <crypto/aes.h>
+#include <crypto/des.h>
 #include <crypto/scatterwalk.h>
 #include <crypto/internal/aead.h>
 #include <crypto/internal/akcipher.h>
+#include <crypto/internal/des.h>
 #include <crypto/internal/hash.h>
 #include <crypto/internal/kpp.h>
 #include <crypto/internal/skcipher.h>
@@ -24,15 +27,75 @@
  * HACE register definitions *
  *                           *
  * ***************************/
+#define ASPEED_HACE_SRC			0x00	/* Crypto Data Source Base Address Register */
+#define ASPEED_HACE_DEST		0x04	/* Crypto Data Destination Base Address Register */
+#define ASPEED_HACE_CONTEXT		0x08	/* Crypto Context Buffer Base Address Register */
+#define ASPEED_HACE_DATA_LEN		0x0C	/* Crypto Data Length Register */
+#define ASPEED_HACE_CMD			0x10	/* Crypto Engine Command Register */
+
+/* G5 */
+#define ASPEED_HACE_TAG			0x18	/* HACE Tag Register */
+/* G6 */
+#define ASPEED_HACE_GCM_ADD_LEN		0x14	/* Crypto AES-GCM Additional Data Length Register */
+#define ASPEED_HACE_GCM_TAG_BASE_ADDR	0x18	/* Crypto AES-GCM Tag Write Buff Base Address Reg */
 
 #define ASPEED_HACE_STS			0x1C	/* HACE Status Register */
+
 #define ASPEED_HACE_HASH_SRC		0x20	/* Hash Data Source Base Address Register */
 #define ASPEED_HACE_HASH_DIGEST_BUFF	0x24	/* Hash Digest Write Buffer Base Address Register */
 #define ASPEED_HACE_HASH_KEY_BUFF	0x28	/* Hash HMAC Key Buffer Base Address Register */
 #define ASPEED_HACE_HASH_DATA_LEN	0x2C	/* Hash Data Length Register */
 #define ASPEED_HACE_HASH_CMD		0x30	/* Hash Engine Command Register */
 
+/* crypto cmd */
+#define  HACE_CMD_SINGLE_DES		0
+#define  HACE_CMD_TRIPLE_DES		BIT(17)
+#define  HACE_CMD_AES_SELECT		0
+#define  HACE_CMD_DES_SELECT		BIT(16)
+#define  HACE_CMD_ISR_EN		BIT(12)
+#define  HACE_CMD_CONTEXT_SAVE_ENABLE	(0)
+#define  HACE_CMD_CONTEXT_SAVE_DISABLE	BIT(9)
+#define  HACE_CMD_AES			(0)
+#define  HACE_CMD_DES			(0)
+#define  HACE_CMD_RC4			BIT(8)
+#define  HACE_CMD_DECRYPT		(0)
+#define  HACE_CMD_ENCRYPT		BIT(7)
+
+#define  HACE_CMD_ECB			(0x0 << 4)
+#define  HACE_CMD_CBC			(0x1 << 4)
+#define  HACE_CMD_CFB			(0x2 << 4)
+#define  HACE_CMD_OFB			(0x3 << 4)
+#define  HACE_CMD_CTR			(0x4 << 4)
+#define  HACE_CMD_OP_MODE_MASK		(0x7 << 4)
+
+#define  HACE_CMD_AES128		(0x0 << 2)
+#define  HACE_CMD_AES192		(0x1 << 2)
+#define  HACE_CMD_AES256		(0x2 << 2)
+#define  HACE_CMD_OP_CASCADE		(0x3)
+#define  HACE_CMD_OP_INDEPENDENT	(0x1)
+
+/* G5 */
+#define  HACE_CMD_RI_WO_DATA_ENABLE	(0)
+#define  HACE_CMD_RI_WO_DATA_DISABLE	BIT(11)
+#define  HACE_CMD_CONTEXT_LOAD_ENABLE	(0)
+#define  HACE_CMD_CONTEXT_LOAD_DISABLE	BIT(10)
+/* G6 */
+#define  HACE_CMD_AES_KEY_FROM_OTP	BIT(24)
+#define  HACE_CMD_GHASH_TAG_XOR_EN	BIT(23)
+#define  HACE_CMD_GHASH_PAD_LEN_INV	BIT(22)
+#define  HACE_CMD_GCM_TAG_ADDR_SEL	BIT(21)
+#define  HACE_CMD_MBUS_REQ_SYNC_EN	BIT(20)
+#define  HACE_CMD_DES_SG_CTRL		BIT(19)
+#define  HACE_CMD_SRC_SG_CTRL		BIT(18)
+#define  HACE_CMD_CTR_IV_AES_96		(0x1 << 14)
+#define  HACE_CMD_CTR_IV_DES_32		(0x1 << 14)
+#define  HACE_CMD_CTR_IV_AES_64		(0x2 << 14)
+#define  HACE_CMD_CTR_IV_AES_32		(0x3 << 14)
+#define  HACE_CMD_AES_KEY_HW_EXP	BIT(13)
+#define  HACE_CMD_GCM			(0x5 << 4)
+
 /* interrupt status reg */
+#define  HACE_CRYPTO_ISR		BIT(12)
 #define  HACE_HASH_ISR			BIT(9)
 #define  HACE_HASH_BUSY			BIT(0)
 
@@ -77,6 +140,9 @@
 #define ASPEED_HASH_SRC_DMA_BUF_LEN	0xa000
 #define ASPEED_HASH_QUEUE_LENGTH	50
 
+#define HACE_CMD_IV_REQUIRE		(HACE_CMD_CBC | HACE_CMD_CFB | \
+					 HACE_CMD_OFB | HACE_CMD_CTR)
+
 struct aspeed_hace_dev;
 
 typedef int (*aspeed_hace_fn_t)(struct aspeed_hace_dev *);
@@ -147,6 +213,48 @@ struct aspeed_sham_reqctx {
 	u64			digcnt[2];
 };
 
+struct aspeed_engine_crypto {
+	struct tasklet_struct		done_task;
+	unsigned long			flags;
+	struct skcipher_request		*req;
+
+	/* context buffer */
+	void				*cipher_ctx;
+	dma_addr_t			cipher_ctx_dma;
+
+	/* input buffer, could be single/scatter-gather lists */
+	void				*cipher_addr;
+	dma_addr_t			cipher_dma_addr;
+
+	/* output buffer, only used in scatter-gather lists */
+	void				*dst_sg_addr;
+	dma_addr_t			dst_sg_dma_addr;
+
+	/* callback func */
+	aspeed_hace_fn_t		resume;
+};
+
+struct aspeed_cipher_ctx {
+	struct crypto_engine_ctx	enginectx;
+
+	struct aspeed_hace_dev		*hace_dev;
+	int				key_len;
+	u8				key[AES_MAX_KEYLENGTH];
+
+	/* callback func */
+	aspeed_hace_fn_t		start;
+
+	struct crypto_skcipher          *fallback_tfm;
+};
+
+struct aspeed_cipher_reqctx {
+	int enc_cmd;
+	int src_nents;
+	int dst_nents;
+
+	struct skcipher_request         fallback_req;   /* keep at the end */
+};
+
 struct aspeed_hace_dev {
 	void __iomem			*regs;
 	struct device			*dev;
@@ -155,8 +263,10 @@ struct aspeed_hace_dev {
 	unsigned long			version;
 
 	struct crypto_engine		*crypt_engine_hash;
+	struct crypto_engine		*crypt_engine_crypto;
 
 	struct aspeed_engine_hash	hash_engine;
+	struct aspeed_engine_crypto	crypto_engine;
 };
 
 struct aspeed_hace_alg {
@@ -182,5 +292,7 @@ enum aspeed_version {
 
 void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
 void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
+void aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev);
+void aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev *hace_dev);
 
 #endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver
  2022-07-26 11:34   ` Neal Liu
@ 2022-07-26 20:41     ` Dhananjay Phadke
  -1 siblings, 0 replies; 32+ messages in thread
From: Dhananjay Phadke @ 2022-07-26 20:41 UTC (permalink / raw)
  To: Neal Liu, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW

Hi Neal,

Thanks for addressing v7 review comments, few more below.

On 7/26/2022 4:34 AM, Neal Liu wrote:
> Add HACE crypto driver to support symmetric-key
> encryption and decryption with multiple modes of
> operation.
> 
> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> ---
>   drivers/crypto/aspeed/Kconfig              |   26 +
>   drivers/crypto/aspeed/Makefile             |    7 +-
>   drivers/crypto/aspeed/aspeed-hace-crypto.c | 1121 ++++++++++++++++++++
>   drivers/crypto/aspeed/aspeed-hace.c        |   91 +-
>   drivers/crypto/aspeed/aspeed-hace.h        |  112 ++
>   5 files changed, 1354 insertions(+), 3 deletions(-)
>   create mode 100644 drivers/crypto/aspeed/aspeed-hace-crypto.c
> 
> diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> index 059e627efef8..f19994915a5e 100644
> --- a/drivers/crypto/aspeed/Kconfig
> +++ b/drivers/crypto/aspeed/Kconfig
> @@ -30,3 +30,29 @@ config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
>   	  to ask for those messages.
>   	  Avoid enabling this option for production build to
>   	  minimize driver timing.
> +
> +config CRYPTO_DEV_ASPEED_HACE_CRYPTO
> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) crypto"
> +	depends on CRYPTO_DEV_ASPEED
> +	select CRYPTO_ENGINE
> +	select CRYPTO_AES
> +	select CRYPTO_DES
> +	select CRYPTO_ECB
> +	select CRYPTO_CBC
> +	select CRYPTO_CFB
> +	select CRYPTO_OFB
> +	select CRYPTO_CTR
> +	help
> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> +	  crypto driver.
> +	  Supports AES/DES symmetric-key encryption and decryption
> +	  with ECB/CBC/CFB/OFB/CTR options.
> +
> +config CRYPTO_DEV_ASPEED_HACE_CRYPTO_DEBUG
> +	bool "Enable HACE crypto debug messages"
> +	depends on CRYPTO_DEV_ASPEED_HACE_CRYPTO
> +	help
> +	  Print HACE crypto debugging messages if you use this option
> +	  to ask for those messages.
> +	  Avoid enabling this option for production build to
> +	  minimize driver timing.

Why are separate options required for the hash and crypto algorithms, if
HACE is the only hw crypto engine on these SoCs?

Looks like that requires the unnecessary __weak register / unregister
functions [see below].

Wouldn't just two options, CONFIG_CRYPTO_DEV_ASPEED and
CONFIG_CRYPTO_DEV_ASPEED_DEBUG, be simpler to set for downstream defconfigs?

> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
> index 8bc8d4fed5a9..421e2ca9c53e 100644
> --- a/drivers/crypto/aspeed/Makefile
> +++ b/drivers/crypto/aspeed/Makefile
> @@ -1,6 +1,9 @@
>   obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> -aspeed_crypto-objs := aspeed-hace.o \
> -		      $(hace-hash-y)
> +aspeed_crypto-objs := aspeed-hace.o	\
> +		      $(hace-hash-y)	\
> +		      $(hace-crypto-y)
>   
>   obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
>   hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) += aspeed-hace-crypto.o
> +hace-crypto-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) := aspeed-hace-crypto.o
> diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c b/drivers/crypto/aspeed/aspeed-hace-crypto.c
> new file mode 100644

[...]

> +
> +void aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	int rc, i;
> +
> +	CIPHER_DBG(hace_dev, "\n");
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs); i++) {
> +		aspeed_crypto_algs[i].hace_dev = hace_dev;
> +		rc = crypto_register_skcipher(&aspeed_crypto_algs[i].alg.skcipher);
> +		if (rc) {
> +			CIPHER_DBG(hace_dev, "Failed to register %s\n",
> +				   aspeed_crypto_algs[i].alg.skcipher.base.cra_name);
> +		}
> +	}
> +
> +	if (hace_dev->version != AST2600_VERSION)
> +		return;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs_g6); i++) {
> +		aspeed_crypto_algs_g6[i].hace_dev = hace_dev;
> +		rc = crypto_register_skcipher(&aspeed_crypto_algs_g6[i].alg.skcipher);
> +		if (rc) {
> +			CIPHER_DBG(hace_dev, "Failed to register %s\n",
> +				   aspeed_crypto_algs_g6[i].alg.skcipher.base.cra_name);
> +		}
> +	}
> +}
> diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
> index 89b1585d72e2..efc0725ebf98 100644
> --- a/drivers/crypto/aspeed/aspeed-hace.c
> +++ b/drivers/crypto/aspeed/aspeed-hace.c
> @@ -32,10 +32,22 @@ void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
>   	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
>   }
>   
> +/* Weak function for HACE crypto */
> +void __weak aspeed_register_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> +}
> +
> +void __weak aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> +}
> +

aspeed_unregister_hace_crypto_algs() is not implemented in 
aspeed-hace-crypto.c, so those algorithms are not unregistered during 
unload.

This was missed because of the __weak function.
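
For reference, a minimal sketch of what the missing unregister path could
look like, mirroring the register loops in this patch (the
aspeed_crypto_algs / aspeed_crypto_algs_g6 tables and the AST2600 check are
taken from the register function above; treat this as an illustration, not
the final code):

void aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev *hace_dev)
{
        int i;

        /* Unregister the skciphers common to all supported SoCs */
        for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs); i++)
                crypto_unregister_skcipher(&aspeed_crypto_algs[i].alg.skcipher);

        if (hace_dev->version != AST2600_VERSION)
                return;

        /* Unregister the AST2600-only skciphers */
        for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs_g6); i++)
                crypto_unregister_skcipher(&aspeed_crypto_algs_g6[i].alg.skcipher);
}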

>   /* HACE interrupt service routine */
>   static irqreturn_t aspeed_hace_irq(int irq, void *dev)
>   {
>   	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
> +	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
>   	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>   	u32 sts;
>   
> @@ -51,9 +63,24 @@ static irqreturn_t aspeed_hace_irq(int irq, void *dev)
>   			dev_warn(hace_dev->dev, "HASH no active requests.\n");
>   	}
>   
> +	if (sts & HACE_CRYPTO_ISR) {
> +		if (crypto_engine->flags & CRYPTO_FLAGS_BUSY)
> +			tasklet_schedule(&crypto_engine->done_task);
> +		else
> +			dev_warn(hace_dev->dev, "CRYPTO no active requests.\n");
> +	}
> +
>   	return IRQ_HANDLED;
>   }
>   
> +static void aspeed_hace_crypto_done_task(unsigned long data)
> +{
> +	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
> +	struct aspeed_engine_crypto *crypto_engine = &hace_dev->crypto_engine;
> +
> +	crypto_engine->resume(hace_dev);
> +}
> +
>   static void aspeed_hace_hash_done_task(unsigned long data)
>   {
>   	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
> @@ -65,11 +92,13 @@ static void aspeed_hace_hash_done_task(unsigned long data)
>   static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
>   {
>   	aspeed_register_hace_hash_algs(hace_dev);
> +	aspeed_register_hace_crypto_algs(hace_dev);
>   }
>   
>   static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
>   {
>   	aspeed_unregister_hace_hash_algs(hace_dev);
> +	aspeed_unregister_hace_crypto_algs(hace_dev);
>   }

Could just wrap these calls instead of using weak functions.

static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
{
#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH
   	aspeed_unregister_hace_hash_algs(hace_dev);
#endif
#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO
	aspeed_unregister_hace_crypto_algs(hace_dev);
#endif
}

>   
>   static const struct of_device_id aspeed_hace_of_matches[] = {
> @@ -80,6 +109,7 @@ static const struct of_device_id aspeed_hace_of_matches[] = {
>   
>   static int aspeed_hace_probe(struct platform_device *pdev)
>   {
> +	struct aspeed_engine_crypto *crypto_engine;
>   	const struct of_device_id *hace_dev_id;
>   	struct aspeed_engine_hash *hash_engine;
>   	struct aspeed_hace_dev *hace_dev;
> @@ -100,6 +130,7 @@ static int aspeed_hace_probe(struct platform_device *pdev)
>   	hace_dev->dev = &pdev->dev;
>   	hace_dev->version = (unsigned long)hace_dev_id->data;
>   	hash_engine = &hace_dev->hash_engine;
> +	crypto_engine = &hace_dev->crypto_engine;
>   
>   	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);

Thanks,
Dhananjay

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver
  2022-07-26 20:41     ` Dhananjay Phadke
@ 2022-07-27  5:31       ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-27  5:31 UTC (permalink / raw)
  To: Dhananjay Phadke, Corentin Labbe, Christophe JAILLET,
	Randy Dunlap, Herbert Xu, David S . Miller, Rob Herring,
	Krzysztof Kozlowski, Joel Stanley, Andrew Jeffery,
	Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> Hi Neal,
> 
> Thanks for addressing v7 review comments, few more below.
> 
> On 7/26/2022 4:34 AM, Neal Liu wrote:
> > Add HACE crypto driver to support symmetric-key encryption and
> > decryption with multiple modes of operation.
> >
> > Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> > Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> > ---
> >   drivers/crypto/aspeed/Kconfig              |   26 +
> >   drivers/crypto/aspeed/Makefile             |    7 +-
> >   drivers/crypto/aspeed/aspeed-hace-crypto.c | 1121
> ++++++++++++++++++++
> >   drivers/crypto/aspeed/aspeed-hace.c        |   91 +-
> >   drivers/crypto/aspeed/aspeed-hace.h        |  112 ++
> >   5 files changed, 1354 insertions(+), 3 deletions(-)
> >   create mode 100644 drivers/crypto/aspeed/aspeed-hace-crypto.c
> >
> > diff --git a/drivers/crypto/aspeed/Kconfig
> > b/drivers/crypto/aspeed/Kconfig index 059e627efef8..f19994915a5e
> > 100644
> > --- a/drivers/crypto/aspeed/Kconfig
> > +++ b/drivers/crypto/aspeed/Kconfig
> > @@ -30,3 +30,29 @@ config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> >   	  to ask for those messages.
> >   	  Avoid enabling this option for production build to
> >   	  minimize driver timing.
> > +
> > +config CRYPTO_DEV_ASPEED_HACE_CRYPTO
> > +	bool "Enable Aspeed Hash & Crypto Engine (HACE) crypto"
> > +	depends on CRYPTO_DEV_ASPEED
> > +	select CRYPTO_ENGINE
> > +	select CRYPTO_AES
> > +	select CRYPTO_DES
> > +	select CRYPTO_ECB
> > +	select CRYPTO_CBC
> > +	select CRYPTO_CFB
> > +	select CRYPTO_OFB
> > +	select CRYPTO_CTR
> > +	help
> > +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> > +	  crypto driver.
> > +	  Supports AES/DES symmetric-key encryption and decryption
> > +	  with ECB/CBC/CFB/OFB/CTR options.
> > +
> > +config CRYPTO_DEV_ASPEED_HACE_CRYPTO_DEBUG
> > +	bool "Enable HACE crypto debug messages"
> > +	depends on CRYPTO_DEV_ASPEED_HACE_CRYPTO
> > +	help
> > +	  Print HACE crypto debugging messages if you use this option
> > +	  to ask for those messages.
> > +	  Avoid enabling this option for production build to
> > +	  minimize driver timing.
> 
> Why are separate options required for hash and crypto algorithms, if hace is
> only hw crypto on the SoCs?
> 
> Looks like that's requiring unnecessary __weak register / unregister functions
> [see below].
> 
> Couldn't just two options CONFIG_CRYPTO_DEV_ASPEED and
> CONFIG_CRYPTO_DEV_ASPEED_DEBUG be simpler to set for downstream
> defconfigs?

I would like to separate the different algorithms into different options to make further use and debugging more convenient.
We also have an RSA engine named ACRY, which we plan to upstream once the hash & crypto drivers are accepted.
So combining them into one option does not seem like a good choice for multiple hw crypto engines, do you agree?

> > diff --git a/drivers/crypto/aspeed/Makefile
> > b/drivers/crypto/aspeed/Makefile index 8bc8d4fed5a9..421e2ca9c53e
> > 100644
> > --- a/drivers/crypto/aspeed/Makefile
> > +++ b/drivers/crypto/aspeed/Makefile
> > @@ -1,6 +1,9 @@
> >   obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> > -aspeed_crypto-objs := aspeed-hace.o \
> > -		      $(hace-hash-y)
> > +aspeed_crypto-objs := aspeed-hace.o	\
> > +		      $(hace-hash-y)	\
> > +		      $(hace-crypto-y)
> >
> >   obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) +=
> aspeed-hace-hash.o
> >   hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) :=
> > aspeed-hace-hash.o
> > +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) +=
> aspeed-hace-crypto.o
> > +hace-crypto-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO) :=
> > +aspeed-hace-crypto.o
> > diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c
> > b/drivers/crypto/aspeed/aspeed-hace-crypto.c
> > new file mode 100644
> 
> [...]
> 
> > +
> > +void aspeed_register_hace_crypto_algs(struct aspeed_hace_dev
> > +*hace_dev) {
> > +	int rc, i;
> > +
> > +	CIPHER_DBG(hace_dev, "\n");
> > +
> > +	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs); i++) {
> > +		aspeed_crypto_algs[i].hace_dev = hace_dev;
> > +		rc = crypto_register_skcipher(&aspeed_crypto_algs[i].alg.skcipher);
> > +		if (rc) {
> > +			CIPHER_DBG(hace_dev, "Failed to register %s\n",
> > +				   aspeed_crypto_algs[i].alg.skcipher.base.cra_name);
> > +		}
> > +	}
> > +
> > +	if (hace_dev->version != AST2600_VERSION)
> > +		return;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(aspeed_crypto_algs_g6); i++) {
> > +		aspeed_crypto_algs_g6[i].hace_dev = hace_dev;
> > +		rc =
> crypto_register_skcipher(&aspeed_crypto_algs_g6[i].alg.skcipher);
> > +		if (rc) {
> > +			CIPHER_DBG(hace_dev, "Failed to register %s\n",
> > +				   aspeed_crypto_algs_g6[i].alg.skcipher.base.cra_name);
> > +		}
> > +	}
> > +}
> > diff --git a/drivers/crypto/aspeed/aspeed-hace.c
> > b/drivers/crypto/aspeed/aspeed-hace.c
> > index 89b1585d72e2..efc0725ebf98 100644
> > --- a/drivers/crypto/aspeed/aspeed-hace.c
> > +++ b/drivers/crypto/aspeed/aspeed-hace.c
> > @@ -32,10 +32,22 @@ void __weak
> aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> >   	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> >   }
> >
> > +/* Weak function for HACE crypto */
> > +void __weak aspeed_register_hace_crypto_algs(struct aspeed_hace_dev
> > +*hace_dev) {
> > +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__); }
> > +
> > +void __weak aspeed_unregister_hace_crypto_algs(struct aspeed_hace_dev
> > +*hace_dev) {
> > +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__); }
> > +
> 
> aspeed_unregister_hace_crypto_algs() is not implemented in
> aspeed-hace-crypto.c, so those algorithms are not unregistered during unload.
> 
> This was missed because of __weak function.

I missed this part, thanks for pointing it out.
I'll add it in the next patch.

> >   /* HACE interrupt service routine */
> >   static irqreturn_t aspeed_hace_irq(int irq, void *dev)
> >   {
> >   	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev
> *)dev;
> > +	struct aspeed_engine_crypto *crypto_engine =
> > +&hace_dev->crypto_engine;
> >   	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >   	u32 sts;
> >
> > @@ -51,9 +63,24 @@ static irqreturn_t aspeed_hace_irq(int irq, void *dev)
> >   			dev_warn(hace_dev->dev, "HASH no active requests.\n");
> >   	}
> >
> > +	if (sts & HACE_CRYPTO_ISR) {
> > +		if (crypto_engine->flags & CRYPTO_FLAGS_BUSY)
> > +			tasklet_schedule(&crypto_engine->done_task);
> > +		else
> > +			dev_warn(hace_dev->dev, "CRYPTO no active requests.\n");
> > +	}
> > +
> >   	return IRQ_HANDLED;
> >   }
> >
> > +static void aspeed_hace_crypto_done_task(unsigned long data) {
> > +	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
> > +	struct aspeed_engine_crypto *crypto_engine =
> > +&hace_dev->crypto_engine;
> > +
> > +	crypto_engine->resume(hace_dev);
> > +}
> > +
> >   static void aspeed_hace_hash_done_task(unsigned long data)
> >   {
> >   	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev
> *)data;
> > @@ -65,11 +92,13 @@ static void aspeed_hace_hash_done_task(unsigned
> long data)
> >   static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
> >   {
> >   	aspeed_register_hace_hash_algs(hace_dev);
> > +	aspeed_register_hace_crypto_algs(hace_dev);
> >   }
> >
> >   static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
> >   {
> >   	aspeed_unregister_hace_hash_algs(hace_dev);
> > +	aspeed_unregister_hace_crypto_algs(hace_dev);
> >   }
> 
> Could just wrap these calls instead of weak functions.
> 
> static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
> { #ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH
>    	aspeed_unregister_hace_hash_algs(hace_dev);
> #endif
> #ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO
> 	aspeed_unregister_hace_crypto_algs(hace_dev);
> #endif
> }

Okay, I'll revise it as you suggested.
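
As a sketch, assuming the CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH and
CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO symbols from this series, the register
side would be wrapped the same way (illustrative only; the unregister side
follows your example above):

static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
{
#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH
        aspeed_register_hace_hash_algs(hace_dev);
#endif
#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO
        aspeed_register_hace_crypto_algs(hace_dev);
#endif
}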

> >
> >   static const struct of_device_id aspeed_hace_of_matches[] = { @@
> > -80,6 +109,7 @@ static const struct of_device_id
> > aspeed_hace_of_matches[] = {
> >
> >   static int aspeed_hace_probe(struct platform_device *pdev)
> >   {
> > +	struct aspeed_engine_crypto *crypto_engine;
> >   	const struct of_device_id *hace_dev_id;
> >   	struct aspeed_engine_hash *hash_engine;
> >   	struct aspeed_hace_dev *hace_dev;
> > @@ -100,6 +130,7 @@ static int aspeed_hace_probe(struct platform_device
> *pdev)
> >   	hace_dev->dev = &pdev->dev;
> >   	hace_dev->version = (unsigned long)hace_dev_id->data;
> >   	hash_engine = &hace_dev->hash_engine;
> > +	crypto_engine = &hace_dev->crypto_engine;
> >
> >   	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> 
> Thanks,
> Dhananjay

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver
  2022-07-27  5:31       ` Neal Liu
@ 2022-07-28  6:18         ` Dhananjay Phadke
  -1 siblings, 0 replies; 32+ messages in thread
From: Dhananjay Phadke @ 2022-07-28  6:18 UTC (permalink / raw)
  To: Neal Liu, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

On 7/26/2022 10:31 PM, Neal Liu wrote:
>> Why are separate options required for hash and crypto algorithms, if hace is
>> only hw crypto on the SoCs?
>>
>> Looks like that's requiring unnecessary __weak register / unregister functions
>> [see below].
>>
>> Couldn't just two options CONFIG_CRYPTO_DEV_ASPEED and
>> CONFIG_CRYPTO_DEV_ASPEED_DEBUG be simpler to set for downstream
>> defconfigs?
> I would like to separate different algorithms by different options for more convenient for further use and debug.
> We also have RSA engine named ACRY, and would upstream once hash & crypto being accepted.
> So combined them into one option seems not a good choice for multiple hw crypto, do you agree?

Not sure what the use case is for enabling crypto or hash separately on the
same HW engine, especially when there's no alternative accelerator
available, but that's fine.

If ACRY is a different HW engine (interface), then having a separate config
sounds logical.

Multiplying DEBUG configs seems unnecessary though. With dynamic debug, any
of the dev_dbg() calls can be turned on at runtime. Suggest using a single
one for the module, if not dropping it altogether. The following code is
still not covered by Kconfig; it is in common code.

 > +#ifdef ASPEED_HACE_DEBUG
 > +#define HACE_DBG(d, fmt, ...)	\
 > +	dev_info((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
 > +#else
 > +#define HACE_DBG(d, fmt, ...)	\
 > +	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
 > +#endif
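
As a sketch of the "drop it altogether" option (illustrative only), the
macro could always go through dev_dbg(), which dynamic debug can enable
per-callsite at runtime when CONFIG_DYNAMIC_DEBUG is set:

/* Always use dev_dbg(); enable specific callsites via dynamic debug */
#define HACE_DBG(d, fmt, ...)	\
	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)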

Regards,
Dhananjay




^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver
  2022-07-28  6:18         ` Dhananjay Phadke
@ 2022-07-28  8:58           ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-07-28  8:58 UTC (permalink / raw)
  To: Dhananjay Phadke, Corentin Labbe, Christophe JAILLET,
	Randy Dunlap, Herbert Xu, David S . Miller, Rob Herring,
	Krzysztof Kozlowski, Joel Stanley, Andrew Jeffery, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> On 7/26/2022 10:31 PM, Neal Liu wrote:
> >> Why are separate options required for hash and crypto algorithms, if
> >> hace is only hw crypto on the SoCs?
> >>
> >> Looks like that's requiring unnecessary __weak register / unregister
> >> functions [see below].
> >>
> >> Couldn't just two options CONFIG_CRYPTO_DEV_ASPEED and
> >> CONFIG_CRYPTO_DEV_ASPEED_DEBUG be simpler to set for downstream
> >> defconfigs?
> > I would like to separate different algorithms by different options for more
> convenient for further use and debug.
> > We also have RSA engine named ACRY, and would upstream once hash &
> crypto being accepted.
> > So combined them into one option seems not a good choice for multiple hw
> crypto, do you agree?
> 
> Not sure what the use case is for enabling crypto or hash separately on the
> same HW engine, especially when there's no alternative accelerator available,
> but that's fine.
> 
> If ACRY is a different HW engine (interface), then having a separate config
> sounds logical.
> 
> Multiplying DEBUG configs seems unnecessary though. With dynamic debug,
> any of the dev_dbg() calls can be turned on at runtime. Suggest using a single
> one for the module, if not dropping it altogether. The following code is still
> not covered by Kconfig; it is in common code.
> 
>  > +#ifdef ASPEED_HACE_DEBUG
>  > +#define HACE_DBG(d, fmt, ...)	\
>  > +	dev_info((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
>  > +#else
>  > +#define HACE_DBG(d, fmt, ...)	\
>  > +	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
>  > +#endif
> 
> Regards,
> Dhananjay

Okay, I'll follow your suggestion: keep separate options for the different crypto algorithms and use a single debug option for all modules.
Thanks!


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-07-26 11:34   ` Neal Liu
@ 2022-08-08  2:53     ` liulongfang
  -1 siblings, 0 replies; 32+ messages in thread
From: liulongfang @ 2022-08-08  2:53 UTC (permalink / raw)
  To: Neal Liu, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW


On 2022/7/26 19:34, Neal Liu wrote:
> Hash and Crypto Engine (HACE) is designed to accelerate the
> throughput of hash data digest, encryption, and decryption.
> 
> Basically, HACE can be divided into two independent engines
> - Hash Engine and Crypto Engine. This patch aims to add HACE
> hash engine driver for hash accelerator.
> 
> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> ---
>  MAINTAINERS                              |    7 +
>  drivers/crypto/Kconfig                   |    1 +
>  drivers/crypto/Makefile                  |    1 +
>  drivers/crypto/aspeed/Kconfig            |   32 +
>  drivers/crypto/aspeed/Makefile           |    6 +
>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
>  8 files changed, 1835 insertions(+)
>  create mode 100644 drivers/crypto/aspeed/Kconfig
>  create mode 100644 drivers/crypto/aspeed/Makefile
>  create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f55aea311af5..23a0215b7e42 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3140,6 +3140,13 @@ S:	Maintained
>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
>  F:	drivers/media/platform/aspeed/
>  
> +ASPEED CRYPTO DRIVER
> +M:	Neal Liu <neal_liu@aspeedtech.com>
> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> +S:	Maintained
> +F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> +F:	drivers/crypto/aspeed/
> +
>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>  M:	Corentin Chary <corentin.chary@gmail.com>
>  L:	acpi4asus-user@lists.sourceforge.net
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index ee99c02c84e8..b9f5ee126881 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
>  	  acceleration for cryptographic algorithms on these devices.
>  
>  source "drivers/crypto/keembay/Kconfig"
> +source "drivers/crypto/aspeed/Kconfig"
>  
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index f81703a86b98..116de173a66c 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
> diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> new file mode 100644
> index 000000000000..059e627efef8
> --- /dev/null
> +++ b/drivers/crypto/aspeed/Kconfig
> @@ -0,0 +1,32 @@
> +config CRYPTO_DEV_ASPEED
> +	tristate "Support for Aspeed cryptographic engine driver"
> +	depends on ARCH_ASPEED
> +	help
> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> +	  throughput of hash data digest, encryption and decryption.
> +
> +	  Select y here to have support for the cryptographic driver
> +	  available on Aspeed SoC.
> +
> +config CRYPTO_DEV_ASPEED_HACE_HASH
> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> +	depends on CRYPTO_DEV_ASPEED
> +	select CRYPTO_ENGINE
> +	select CRYPTO_SHA1
> +	select CRYPTO_SHA256
> +	select CRYPTO_SHA512
> +	select CRYPTO_HMAC
> +	help
> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> +	  hash driver.
> +	  Supports multiple message digest standards, including
> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> +
> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> +	bool "Enable HACE hash debug messages"
> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> +	help
> +	  Print HACE hash debugging messages if you use this option
> +	  to ask for those messages.
> +	  Avoid enabling this option for production build to
> +	  minimize driver timing.
> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
> new file mode 100644
> index 000000000000..8bc8d4fed5a9
> --- /dev/null
> +++ b/drivers/crypto/aspeed/Makefile
> @@ -0,0 +1,6 @@
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> +aspeed_crypto-objs := aspeed-hace.o \
> +		      $(hace-hash-y)
> +
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
> new file mode 100644
> index 000000000000..63a8ad694996
> --- /dev/null
> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> @@ -0,0 +1,1389 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (c) 2021 Aspeed Technology Inc.
> + */
> +
> +#include "aspeed-hace.h"
> +
> +#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> +#define AHASH_DBG(h, fmt, ...)	\
> +	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#else
> +#define AHASH_DBG(h, fmt, ...)	\
> +	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#endif
> +
> +/* Initialization Vectors for SHA-family */
> +static const __be32 sha1_iv[8] = {
> +	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
> +	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
> +	cpu_to_be32(SHA1_H4), 0, 0, 0
> +};
> +
> +static const __be32 sha224_iv[8] = {
> +	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
> +	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
> +	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
> +	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
> +};
> +
> +static const __be32 sha256_iv[8] = {
> +	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
> +	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
> +	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
> +	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
> +};
> +
> +static const __be64 sha384_iv[8] = {
> +	cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
> +	cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
> +	cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
> +	cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
> +};
> +
> +static const __be64 sha512_iv[8] = {
> +	cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
> +	cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
> +	cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
> +	cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
> +};
> +
> +static const __be32 sha512_224_iv[16] = {
> +	cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
> +	cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
> +	cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
> +	cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
> +	cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
> +	cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
> +	cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
> +	cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
> +};
> +
> +static const __be32 sha512_256_iv[16] = {
> +	cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
> +	cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
> +	cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
> +	cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
> +	cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
> +	cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
> +	cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
> +	cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
> +};
> +
> +/* The purpose of this padding is to ensure that the padded message is a
> + * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
> + * The bit "1" is appended at the end of the message, followed by
> + * "padlen-1" zero bits. Then a 64-bit block (SHA1/SHA224/SHA256) or
> + * a 128-bit block (SHA384/SHA512) containing the message length in
> + * bits is appended.
> + *
> + * For SHA1/SHA224/SHA256, padlen is calculated as follows:
> + *  - if message length < 56 bytes then padlen = 56 - message length
> + *  - else padlen = 64 + 56 - message length
> + *
> + * For SHA384/SHA512, padlen is calculated as follows:
> + *  - if message length < 112 bytes then padlen = 112 - message length
> + *  - else padlen = 128 + 112 - message length
> + */
> +static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
> +				      struct aspeed_sham_reqctx *rctx)
> +{
> +	unsigned int index, padlen;
> +	__be64 bits[2];
> +
> +	AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
> +
> +	switch (rctx->flags & SHA_FLAGS_MASK) {
> +	case SHA_FLAGS_SHA1:
> +	case SHA_FLAGS_SHA224:
> +	case SHA_FLAGS_SHA256:
> +		bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
> +		index = rctx->bufcnt & 0x3f;
> +		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
> +		*(rctx->buffer + rctx->bufcnt) = 0x80;
> +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
> +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
> +		rctx->bufcnt += padlen + 8;
> +		break;
> +	default:
> +		bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
> +		bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
> +				      rctx->digcnt[0] >> 61);
> +		index = rctx->bufcnt & 0x7f;
> +		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
> +		*(rctx->buffer + rctx->bufcnt) = 0x80;
> +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
> +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
> +		rctx->bufcnt += padlen + 16;
> +		break;
> +	}
> +}
> +
> +/*
> + * Prepare DMA buffer before hardware engine
> + * processing.
> + */
> +static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	int length, remain;
> +
> +	length = rctx->total + rctx->bufcnt;
> +	remain = length % rctx->block_size;
> +
> +	AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
> +
> +	if (rctx->bufcnt)
> +		memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
> +
> +	if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
> +		scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
> +					 rctx->bufcnt, rctx->src_sg,
> +					 rctx->offset, rctx->total - remain, 0);
> +		rctx->offset += rctx->total - remain;
> +
> +	} else {
> +		dev_warn(hace_dev->dev, "Hash data length is too large\n");
> +		return -EINVAL;
> +	}
> +
> +	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
> +				 rctx->offset, remain, 0);
> +
> +	rctx->bufcnt = remain;
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		return -ENOMEM;
> +	}
> +
> +	hash_engine->src_length = length - remain;
> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	return 0;
> +}
> +
> +/*
> + * Prepare DMA buffer as SG list buffer before
> + * hardware engine processing.
> + */
> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct aspeed_sg_list *src_list;
> +	struct scatterlist *s;
> +	int length, remain, sg_len, i;
> +	int rc = 0;
> +
> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> +	length = rctx->total + rctx->bufcnt - remain;
> +
> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> +		  "length", length, "remain", remain);
> +
> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> +			    DMA_TO_DEVICE);
> +	if (!sg_len) {
> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> +		rc = -ENOMEM;
> +		goto end;
> +	}
> +
> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		rc = -ENOMEM;
> +		goto free_src_sg;
> +	}
> +
> +	if (rctx->bufcnt != 0) {
> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> +						       rctx->buffer,
> +						       rctx->block_size * 2,
> +						       DMA_TO_DEVICE);
> +		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> +			rc = -ENOMEM;
> +			goto free_rctx_digest;
> +		}
> +
> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> +		src_list[0].len = rctx->bufcnt;
> +		length -= src_list[0].len;
> +
> +		/* Last sg list */
> +		if (length == 0)
> +			src_list[0].len |= HASH_SG_LAST_LIST;
> +
> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> +		src_list++;
> +	}
> +
> +	if (length != 0) {
> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> +			src_list[i].phy_addr = sg_dma_address(s);
> +
> +			if (length > sg_dma_len(s)) {
> +				src_list[i].len = sg_dma_len(s);
> +				length -= sg_dma_len(s);
> +
> +			} else {
> +				/* Last sg list */
> +				src_list[i].len = length;
> +				src_list[i].len |= HASH_SG_LAST_LIST;
> +				length = 0;
> +			}
> +
> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> +		}
> +	}
> +
> +	if (length != 0) {
> +		rc = -EINVAL;
> +		goto free_rctx_buffer;
> +	}
> +
> +	rctx->offset = rctx->total - remain;
> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	goto end;
Exiting the success path via "goto end" is not recommended here; it takes
two jumps where a direct "return 0" would do, and the direct return is
clearer and slightly more efficient.
This pattern appears many times throughout the driver, so please consider
reworking it.
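For example, the tail of aspeed_ahash_dma_prepare_sg() could be reworked
along these lines (rough, untested sketch only; the earlier error path
that jumps to "end" would then simply "return -ENOMEM;" directly):

	rctx->offset = rctx->total - remain;
	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
	hash_engine->digest_dma = rctx->digest_dma_addr;

	/* Success: return directly instead of jumping to a shared "end" label */
	return 0;

free_rctx_buffer:
	if (rctx->bufcnt != 0)
		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
				 rctx->block_size * 2, DMA_TO_DEVICE);
free_rctx_digest:
	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
free_src_sg:
	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
		     DMA_TO_DEVICE);

	return rc;
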
> +
> +free_rctx_buffer:
> +	if (rctx->bufcnt != 0)
> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> +free_rctx_digest:
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +free_src_sg:
> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> +		     DMA_TO_DEVICE);
> +end:
> +	return rc;
> +}
> +
> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> +
> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> +
> +	return 0;
> +}
> +
> +/*
> + * Copy digest to the corresponding request result.
> + * This function will be called at final() stage.
> + */
> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> +
> +	memcpy(req->result, rctx->digest, rctx->digsize);
> +
> +	return aspeed_ahash_complete(hace_dev);
> +}
> +
> +/*
> + * Trigger hardware engines to do the math.
> + */
> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
> +				     aspeed_hace_fn_t resume)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
> +		  hash_engine->src_dma, hash_engine->digest_dma,
> +		  hash_engine->src_length);
> +
> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> +	hash_engine->resume = resume;
> +
> +	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> +		       ASPEED_HACE_HASH_KEY_BUFF);
> +	ast_hace_write(hace_dev, hash_engine->src_length,
> +		       ASPEED_HACE_HASH_DATA_LEN);
> +
> +	/* Memory barrier to ensure all data setup before engine starts */
> +	mb();
> +
> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
Submitting one request to the hardware takes five register writes to complete.
In a concurrent scenario, how is the ordering of these writes guaranteed?
(If two processes submit hardware tasks at the same time, how does the
hardware know which task a given register write belongs to?)
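To make the concern concrete, this is the kind of serialization I would
expect around the whole programming sequence if nothing else already
guarantees that only one request is in flight (illustrative sketch only;
"hw_lock" is a hypothetical per-device spinlock, not something in this
patch):

	unsigned long flags;

	/* hw_lock: hypothetical per-device lock, not in this patch */
	spin_lock_irqsave(&hace_dev->hw_lock, flags);

	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
	ast_hace_write(hace_dev, hash_engine->digest_dma,
		       ASPEED_HACE_HASH_DIGEST_BUFF);
	ast_hace_write(hace_dev, hash_engine->digest_dma,
		       ASPEED_HACE_HASH_KEY_BUFF);
	ast_hace_write(hace_dev, hash_engine->src_length,
		       ASPEED_HACE_HASH_DATA_LEN);

	/* Memory barrier to ensure all data setup before engine starts */
	mb();

	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);

	spin_unlock_irqrestore(&hace_dev->hw_lock, flags);

(If the crypto engine queue already guarantees that only one request is in
flight per device, stating that in a comment would address the concern.)
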
> +
> +	return -EINPROGRESS;
> +}
> +
> +/*
> + * HMAC resume aims to do the second pass produces
> + * the final HMAC code derived from the inner hash
> + * result and the outer key.
> + */
> +static int aspeed_ahash_hmac_resume(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +	int rc = 0;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> +
> +	/* o key pad + hash sum 1 */
> +	memcpy(rctx->buffer, bctx->opad, rctx->block_size);
> +	memcpy(rctx->buffer + rctx->block_size, rctx->digest, rctx->digsize);
> +
> +	rctx->bufcnt = rctx->block_size + rctx->digsize;
> +	rctx->digcnt[0] = rctx->block_size + rctx->digsize;
> +
> +	aspeed_ahash_fill_padding(hace_dev, rctx);
> +	memcpy(rctx->digest, rctx->sha_iv, rctx->ivsize);
> +
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		rc = -ENOMEM;
> +		goto end;
> +	}
> +
> +	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev, rctx->buffer,
> +					       rctx->block_size * 2,
> +					       DMA_TO_DEVICE);
> +	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> +		rc = -ENOMEM;
> +		goto free_rctx_digest;
> +	}
> +
> +	hash_engine->src_dma = rctx->buffer_dma_addr;
> +	hash_engine->src_length = rctx->bufcnt;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
> +
> +free_rctx_digest:
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +end:
> +	return rc;
> +}
> +
> +static int aspeed_ahash_req_final(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	int rc = 0;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	aspeed_ahash_fill_padding(hace_dev, rctx);
> +
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
> +					       rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		rc = -ENOMEM;
> +		goto end;
> +	}
> +
> +	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> +					       rctx->buffer,
> +					       rctx->block_size * 2,
> +					       DMA_TO_DEVICE);
> +	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> +		rc = -ENOMEM;
> +		goto free_rctx_digest;
> +	}
> +
> +	hash_engine->src_dma = rctx->buffer_dma_addr;
> +	hash_engine->src_length = rctx->bufcnt;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	if (rctx->flags & SHA_FLAGS_HMAC)
> +		return aspeed_hace_ahash_trigger(hace_dev,
> +						 aspeed_ahash_hmac_resume);
> +
> +	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
> +
> +free_rctx_digest:
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +end:
> +	return rc;
> +}
> +
> +static int aspeed_ahash_update_resume_sg(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> +		     DMA_TO_DEVICE);
> +
> +	if (rctx->bufcnt != 0)
> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +				 rctx->block_size * 2,
> +				 DMA_TO_DEVICE);
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg, rctx->offset,
> +				 rctx->total - rctx->offset, 0);
> +
> +	rctx->bufcnt = rctx->total - rctx->offset;
> +	rctx->cmd &= ~HASH_CMD_HASH_SRC_SG_CTRL;
> +
> +	if (rctx->flags & SHA_FLAGS_FINUP)
> +		return aspeed_ahash_req_final(hace_dev);
> +
> +	return aspeed_ahash_complete(hace_dev);
> +}
> +
> +static int aspeed_ahash_update_resume(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	if (rctx->flags & SHA_FLAGS_FINUP)
> +		return aspeed_ahash_req_final(hace_dev);
> +
> +	return aspeed_ahash_complete(hace_dev);
> +}
> +
> +static int aspeed_ahash_req_update(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	aspeed_hace_fn_t resume;
> +	int ret;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	if (hace_dev->version == AST2600_VERSION) {
> +		rctx->cmd |= HASH_CMD_HASH_SRC_SG_CTRL;
> +		resume = aspeed_ahash_update_resume_sg;
> +
> +	} else {
> +		resume = aspeed_ahash_update_resume;
> +	}
> +
> +	ret = hash_engine->dma_prepare(hace_dev);
> +	if (ret)
> +		return ret;
> +
> +	return aspeed_hace_ahash_trigger(hace_dev, resume);
> +}
> +
> +static int aspeed_hace_hash_handle_queue(struct aspeed_hace_dev *hace_dev,
> +				  struct ahash_request *req)
> +{
> +	return crypto_transfer_hash_request_to_engine(
> +			hace_dev->crypt_engine_hash, req);
> +}
> +
> +static int aspeed_ahash_do_request(struct crypto_engine *engine, void *areq)
> +{
> +	struct ahash_request *req = ahash_request_cast(areq);
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_engine_hash *hash_engine;
> +	int ret = 0;
> +
> +	hash_engine = &hace_dev->hash_engine;
> +	hash_engine->flags |= CRYPTO_FLAGS_BUSY;
> +
> +	if (rctx->op == SHA_OP_UPDATE)
> +		ret = aspeed_ahash_req_update(hace_dev);
> +	else if (rctx->op == SHA_OP_FINAL)
> +		ret = aspeed_ahash_req_final(hace_dev);
> +
> +	if (ret != -EINPROGRESS)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int aspeed_ahash_prepare_request(struct crypto_engine *engine,
> +					void *areq)
> +{
> +	struct ahash_request *req = ahash_request_cast(areq);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_engine_hash *hash_engine;
> +
> +	hash_engine = &hace_dev->hash_engine;
> +	hash_engine->req = req;
> +
> +	if (hace_dev->version == AST2600_VERSION)
> +		hash_engine->dma_prepare = aspeed_ahash_dma_prepare_sg;
> +	else
> +		hash_engine->dma_prepare = aspeed_ahash_dma_prepare;
> +
> +	return 0;
> +}
> +
> +static int aspeed_sham_update(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +
> +	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
> +
> +	rctx->total = req->nbytes;
> +	rctx->src_sg = req->src;
> +	rctx->offset = 0;
> +	rctx->src_nents = sg_nents(req->src);
> +	rctx->op = SHA_OP_UPDATE;
> +
> +	rctx->digcnt[0] += rctx->total;
> +	if (rctx->digcnt[0] < rctx->total)
> +		rctx->digcnt[1]++;
> +
> +	if (rctx->bufcnt + rctx->total < rctx->block_size) {
> +		scatterwalk_map_and_copy(rctx->buffer + rctx->bufcnt,
> +					 rctx->src_sg, rctx->offset,
> +					 rctx->total, 0);
> +		rctx->bufcnt += rctx->total;
> +
> +		return 0;
> +	}
> +
> +	return aspeed_hace_hash_handle_queue(hace_dev, req);
> +}
> +
> +static int aspeed_sham_shash_digest(struct crypto_shash *tfm, u32 flags,
> +				    const u8 *data, unsigned int len, u8 *out)
> +{
> +	SHASH_DESC_ON_STACK(shash, tfm);
> +
> +	shash->tfm = tfm;
> +
> +	return crypto_shash_digest(shash, data, len, out);
> +}
> +
> +static int aspeed_sham_final(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +
> +	AHASH_DBG(hace_dev, "req->nbytes:%d, rctx->total:%d\n",
> +		  req->nbytes, rctx->total);
> +	rctx->op = SHA_OP_FINAL;
> +
> +	return aspeed_hace_hash_handle_queue(hace_dev, req);
> +}
> +
> +static int aspeed_sham_finup(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	int rc1, rc2;
> +
> +	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
> +
> +	rctx->flags |= SHA_FLAGS_FINUP;
> +
> +	rc1 = aspeed_sham_update(req);
> +	if (rc1 == -EINPROGRESS || rc1 == -EBUSY)
> +		return rc1;
> +
> +	/*
> +	 * final() has to be always called to cleanup resources
> +	 * even if update() failed, except EINPROGRESS
> +	 */
> +	rc2 = aspeed_sham_final(req);
> +
> +	return rc1 ? : rc2;
> +}
> +
> +static int aspeed_sham_init(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +	AHASH_DBG(hace_dev, "%s: digest size:%d\n",
> +		  crypto_tfm_alg_name(&tfm->base),
> +		  crypto_ahash_digestsize(tfm));
> +
> +	rctx->cmd = HASH_CMD_ACC_MODE;
> +	rctx->flags = 0;
> +
> +	switch (crypto_ahash_digestsize(tfm)) {
> +	case SHA1_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA1 | HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA1;
> +		rctx->digsize = SHA1_DIGEST_SIZE;
> +		rctx->block_size = SHA1_BLOCK_SIZE;
> +		rctx->sha_iv = sha1_iv;
> +		rctx->ivsize = 32;
> +		memcpy(rctx->digest, sha1_iv, rctx->ivsize);
> +		break;
> +	case SHA224_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA224 | HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA224;
> +		rctx->digsize = SHA224_DIGEST_SIZE;
> +		rctx->block_size = SHA224_BLOCK_SIZE;
> +		rctx->sha_iv = sha224_iv;
> +		rctx->ivsize = 32;
> +		memcpy(rctx->digest, sha224_iv, rctx->ivsize);
> +		break;
> +	case SHA256_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA256 | HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA256;
> +		rctx->digsize = SHA256_DIGEST_SIZE;
> +		rctx->block_size = SHA256_BLOCK_SIZE;
> +		rctx->sha_iv = sha256_iv;
> +		rctx->ivsize = 32;
> +		memcpy(rctx->digest, sha256_iv, rctx->ivsize);
> +		break;
> +	case SHA384_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA384 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA384;
> +		rctx->digsize = SHA384_DIGEST_SIZE;
> +		rctx->block_size = SHA384_BLOCK_SIZE;
> +		rctx->sha_iv = (const __be32 *)sha384_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha384_iv, rctx->ivsize);
> +		break;
> +	case SHA512_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA512;
> +		rctx->digsize = SHA512_DIGEST_SIZE;
> +		rctx->block_size = SHA512_BLOCK_SIZE;
> +		rctx->sha_iv = (const __be32 *)sha512_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha512_iv, rctx->ivsize);
> +		break;
> +	default:
> +		dev_warn(tctx->hace_dev->dev, "digest size %d not supported\n",
> +			 crypto_ahash_digestsize(tfm));
> +		return -EINVAL;
> +	}
> +
> +	rctx->bufcnt = 0;
> +	rctx->total = 0;
> +	rctx->digcnt[0] = 0;
> +	rctx->digcnt[1] = 0;
> +
> +	/* HMAC init */
> +	if (tctx->flags & SHA_FLAGS_HMAC) {
> +		rctx->digcnt[0] = rctx->block_size;
> +		rctx->bufcnt = rctx->block_size;
> +		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
> +		rctx->flags |= SHA_FLAGS_HMAC;
> +	}
> +
> +	return 0;
> +}
> +
> +static int aspeed_sha512s_init(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +	AHASH_DBG(hace_dev, "digest size: %d\n", crypto_ahash_digestsize(tfm));
> +
> +	rctx->cmd = HASH_CMD_ACC_MODE;
> +	rctx->flags = 0;
> +
> +	switch (crypto_ahash_digestsize(tfm)) {
> +	case SHA224_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_224 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA512_224;
> +		rctx->digsize = SHA224_DIGEST_SIZE;
> +		rctx->block_size = SHA512_BLOCK_SIZE;
> +		rctx->sha_iv = sha512_224_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha512_224_iv, rctx->ivsize);
> +		break;
> +	case SHA256_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_256 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA512_256;
> +		rctx->digsize = SHA256_DIGEST_SIZE;
> +		rctx->block_size = SHA512_BLOCK_SIZE;
> +		rctx->sha_iv = sha512_256_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha512_256_iv, rctx->ivsize);
> +		break;
> +	default:
> +		dev_warn(tctx->hace_dev->dev, "digest size %d not supported\n",
> +			 crypto_ahash_digestsize(tfm));
> +		return -EINVAL;
> +	}
> +
> +	rctx->bufcnt = 0;
> +	rctx->total = 0;
> +	rctx->digcnt[0] = 0;
> +	rctx->digcnt[1] = 0;
> +
> +	/* HMAC init */
> +	if (tctx->flags & SHA_FLAGS_HMAC) {
> +		rctx->digcnt[0] = rctx->block_size;
> +		rctx->bufcnt = rctx->block_size;
> +		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
> +		rctx->flags |= SHA_FLAGS_HMAC;
> +	}
> +
> +	return 0;
> +}
> +
> +static int aspeed_sham_digest(struct ahash_request *req)
> +{
> +	return aspeed_sham_init(req) ? : aspeed_sham_finup(req);
> +}
> +
> +static int aspeed_sham_setkey(struct crypto_ahash *tfm, const u8 *key,
> +			      unsigned int keylen)
> +{
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +	int ds = crypto_shash_digestsize(bctx->shash);
> +	int bs = crypto_shash_blocksize(bctx->shash);
> +	int err = 0;
> +	int i;
> +
> +	AHASH_DBG(hace_dev, "%s: keylen:%d\n", crypto_tfm_alg_name(&tfm->base),
> +		  keylen);
> +
> +	if (keylen > bs) {
> +		err = aspeed_sham_shash_digest(bctx->shash,
> +					       crypto_shash_get_flags(bctx->shash),
> +					       key, keylen, bctx->ipad);
> +		if (err)
> +			return err;
> +		keylen = ds;
> +
> +	} else {
> +		memcpy(bctx->ipad, key, keylen);
> +	}
> +
> +	memset(bctx->ipad + keylen, 0, bs - keylen);
> +	memcpy(bctx->opad, bctx->ipad, bs);
> +
> +	for (i = 0; i < bs; i++) {
> +		bctx->ipad[i] ^= HMAC_IPAD_VALUE;
> +		bctx->opad[i] ^= HMAC_OPAD_VALUE;
> +	}
> +
> +	return err;
> +}
> +
> +static int aspeed_sham_cra_init(struct crypto_tfm *tfm)
> +{
> +	struct ahash_alg *alg = __crypto_ahash_alg(tfm->__crt_alg);
> +	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
> +	struct aspeed_hace_alg *ast_alg;
> +
> +	ast_alg = container_of(alg, struct aspeed_hace_alg, alg.ahash);
> +	tctx->hace_dev = ast_alg->hace_dev;
> +	tctx->flags = 0;
> +
> +	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
> +				 sizeof(struct aspeed_sham_reqctx));
> +
> +	if (ast_alg->alg_base) {
> +		/* hmac related */
> +		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +		tctx->flags |= SHA_FLAGS_HMAC;
> +		bctx->shash = crypto_alloc_shash(ast_alg->alg_base, 0,
> +						 CRYPTO_ALG_NEED_FALLBACK);
> +		if (IS_ERR(bctx->shash)) {
> +			dev_warn(ast_alg->hace_dev->dev,
> +				 "base driver '%s' could not be loaded.\n",
> +				 ast_alg->alg_base);
> +			return PTR_ERR(bctx->shash);
> +		}
> +	}
> +
> +	tctx->enginectx.op.do_one_request = aspeed_ahash_do_request;
> +	tctx->enginectx.op.prepare_request = aspeed_ahash_prepare_request;
> +	tctx->enginectx.op.unprepare_request = NULL;
> +
> +	return 0;
> +}
> +
> +static void aspeed_sham_cra_exit(struct crypto_tfm *tfm)
> +{
> +	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +
> +	AHASH_DBG(hace_dev, "%s\n", crypto_tfm_alg_name(tfm));
> +
> +	if (tctx->flags & SHA_FLAGS_HMAC) {
> +		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +		crypto_free_shash(bctx->shash);
> +	}
> +}
> +
> +static int aspeed_sham_export(struct ahash_request *req, void *out)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	memcpy(out, rctx, sizeof(*rctx));
> +
> +	return 0;
> +}
> +
> +static int aspeed_sham_import(struct ahash_request *req, const void *in)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	memcpy(rctx, in, sizeof(*rctx));
> +
> +	return 0;
> +}
> +
> +struct aspeed_hace_alg aspeed_ahash_algs[] = {
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA1_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha1",
> +					.cra_driver_name	= "aspeed-sha1",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA1_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha256",
> +					.cra_driver_name	= "aspeed-sha256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA256_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha224",
> +					.cra_driver_name	= "aspeed-sha224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA224_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha1",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA1_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha1)",
> +					.cra_driver_name	= "aspeed-hmac-sha1",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA1_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha224",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha224)",
> +					.cra_driver_name	= "aspeed-hmac-sha224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA224_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha256",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha256)",
> +					.cra_driver_name	= "aspeed-hmac-sha256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA256_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +};
> +
> +struct aspeed_hace_alg aspeed_ahash_algs_g6[] = {
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA384_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha384",
> +					.cra_driver_name	= "aspeed-sha384",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA384_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA512_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha512",
> +					.cra_driver_name	= "aspeed-sha512",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha512_224",
> +					.cra_driver_name	= "aspeed-sha512_224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha512_256",
> +					.cra_driver_name	= "aspeed-sha512_256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha384",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA384_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha384)",
> +					.cra_driver_name	= "aspeed-hmac-sha384",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA384_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha512",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA512_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha512)",
> +					.cra_driver_name	= "aspeed-hmac-sha512",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha512_224",
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha512_224)",
> +					.cra_driver_name	= "aspeed-hmac-sha512_224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha512_256",
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha512_256)",
> +					.cra_driver_name	= "aspeed-hmac-sha512_256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +};
> +
> +void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++)
> +		crypto_unregister_ahash(&aspeed_ahash_algs[i].alg.ahash);
> +
> +	if (hace_dev->version != AST2600_VERSION)
> +		return;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++)
> +		crypto_unregister_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
> +}
> +
> +void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	int rc, i;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++) {
> +		aspeed_ahash_algs[i].hace_dev = hace_dev;
> +		rc = crypto_register_ahash(&aspeed_ahash_algs[i].alg.ahash);
> +		if (rc) {
> +			AHASH_DBG(hace_dev, "Failed to register %s\n",
> +				  aspeed_ahash_algs[i].alg.ahash.halg.base.cra_name);
> +		}
> +	}
> +
> +	if (hace_dev->version != AST2600_VERSION)
> +		return;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++) {
> +		aspeed_ahash_algs_g6[i].hace_dev = hace_dev;
> +		rc = crypto_register_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
> +		if (rc) {
> +			AHASH_DBG(hace_dev, "Failed to register %s\n",
> +				  aspeed_ahash_algs_g6[i].alg.ahash.halg.base.cra_name);
> +		}
> +	}
> +}
> diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
> new file mode 100644
> index 000000000000..89b1585d72e2
> --- /dev/null
> +++ b/drivers/crypto/aspeed/aspeed-hace.c
> @@ -0,0 +1,213 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (c) 2021 Aspeed Technology Inc.
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/of_irq.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +
> +#include "aspeed-hace.h"
> +
> +#ifdef ASPEED_HACE_DEBUG
> +#define HACE_DBG(d, fmt, ...)	\
> +	dev_info((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#else
> +#define HACE_DBG(d, fmt, ...)	\
> +	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#endif
> +
> +/* Weak function for HACE hash */
> +void __weak aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> +}
> +
> +void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> +}
> +
> +/* HACE interrupt service routine */
> +static irqreturn_t aspeed_hace_irq(int irq, void *dev)
> +{
> +	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	u32 sts;
> +
> +	sts = ast_hace_read(hace_dev, ASPEED_HACE_STS);
> +	ast_hace_write(hace_dev, sts, ASPEED_HACE_STS);
> +
> +	HACE_DBG(hace_dev, "irq status: 0x%x\n", sts);
> +
> +	if (sts & HACE_HASH_ISR) {
> +		if (hash_engine->flags & CRYPTO_FLAGS_BUSY)
> +			tasklet_schedule(&hash_engine->done_task);
> +		else
> +			dev_warn(hace_dev->dev, "HASH no active requests.\n");
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static void aspeed_hace_hash_done_task(unsigned long data)
> +{
> +	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +
> +	hash_engine->resume(hace_dev);
> +}
> +
> +static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
> +{
> +	aspeed_register_hace_hash_algs(hace_dev);
> +}
> +
> +static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
> +{
> +	aspeed_unregister_hace_hash_algs(hace_dev);
> +}
> +
> +static const struct of_device_id aspeed_hace_of_matches[] = {
> +	{ .compatible = "aspeed,ast2500-hace", .data = (void *)5, },
> +	{ .compatible = "aspeed,ast2600-hace", .data = (void *)6, },
> +	{},
> +};
> +
> +static int aspeed_hace_probe(struct platform_device *pdev)
> +{
> +	const struct of_device_id *hace_dev_id;
> +	struct aspeed_engine_hash *hash_engine;
> +	struct aspeed_hace_dev *hace_dev;
> +	struct resource *res;
> +	int rc;
> +
> +	hace_dev = devm_kzalloc(&pdev->dev, sizeof(struct aspeed_hace_dev),
> +				GFP_KERNEL);
> +	if (!hace_dev)
> +		return -ENOMEM;
> +
> +	hace_dev_id = of_match_device(aspeed_hace_of_matches, &pdev->dev);
> +	if (!hace_dev_id) {
> +		dev_err(&pdev->dev, "Failed to match hace dev id\n");
> +		return -EINVAL;
> +	}
> +
> +	hace_dev->dev = &pdev->dev;
> +	hace_dev->version = (unsigned long)hace_dev_id->data;
> +	hash_engine = &hace_dev->hash_engine;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +
> +	platform_set_drvdata(pdev, hace_dev);
> +
> +	hace_dev->regs = devm_ioremap_resource(&pdev->dev, res);
> +	if (!hace_dev->regs) {
> +		dev_err(&pdev->dev, "Failed to map resources\n");
> +		return -ENOMEM;
> +	}
> +
> +	/* Get irq number and register it */
> +	hace_dev->irq = platform_get_irq(pdev, 0);
> +	if (!hace_dev->irq) {
> +		dev_err(&pdev->dev, "Failed to get interrupt\n");
> +		return -ENXIO;
> +	}
> +
> +	rc = devm_request_irq(&pdev->dev, hace_dev->irq, aspeed_hace_irq, 0,
> +			      dev_name(&pdev->dev), hace_dev);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Failed to request interrupt\n");
> +		return rc;
> +	}
> +
> +	/* Get clk and enable it */
> +	hace_dev->clk = devm_clk_get(&pdev->dev, NULL);
> +	if (IS_ERR(hace_dev->clk)) {
> +		dev_err(&pdev->dev, "Failed to get clk\n");
> +		return -ENODEV;
> +	}
> +
> +	rc = clk_prepare_enable(hace_dev->clk);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Failed to enable clock 0x%x\n", rc);
> +		return rc;
> +	}
> +
> +	/* Initialize crypto hardware engine structure for hash */
> +	hace_dev->crypt_engine_hash = crypto_engine_alloc_init(hace_dev->dev,
> +							       true);
> +	if (!hace_dev->crypt_engine_hash) {
> +		rc = -ENOMEM;
> +		goto clk_exit;
> +	}
> +
> +	rc = crypto_engine_start(hace_dev->crypt_engine_hash);
> +	if (rc)
> +		goto err_engine_hash_start;
> +
> +	tasklet_init(&hash_engine->done_task, aspeed_hace_hash_done_task,
> +		     (unsigned long)hace_dev);
> +
> +	/* Allocate DMA buffer for hash engine input used */
> +	hash_engine->ahash_src_addr =
> +		dmam_alloc_coherent(&pdev->dev,
> +				    ASPEED_HASH_SRC_DMA_BUF_LEN,
> +				    &hash_engine->ahash_src_dma_addr,
> +				    GFP_KERNEL);
> +	if (!hash_engine->ahash_src_addr) {
> +		dev_err(&pdev->dev, "Failed to allocate dma buffer\n");
> +		rc = -ENOMEM;
> +		goto err_engine_hash_start;
> +	}
> +
> +	aspeed_hace_register(hace_dev);
> +
> +	dev_info(&pdev->dev, "Aspeed Crypto Accelerator successfully registered\n");
> +
> +	return 0;
> +
> +err_engine_hash_start:
> +	crypto_engine_exit(hace_dev->crypt_engine_hash);
> +clk_exit:
> +	clk_disable_unprepare(hace_dev->clk);
> +
> +	return rc;
> +}
> +
> +static int aspeed_hace_remove(struct platform_device *pdev)
> +{
> +	struct aspeed_hace_dev *hace_dev = platform_get_drvdata(pdev);
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +
> +	aspeed_hace_unregister(hace_dev);
> +
> +	crypto_engine_exit(hace_dev->crypt_engine_hash);
> +
> +	tasklet_kill(&hash_engine->done_task);
> +
> +	clk_disable_unprepare(hace_dev->clk);
> +
> +	return 0;
> +}
> +
> +MODULE_DEVICE_TABLE(of, aspeed_hace_of_matches);
> +
> +static struct platform_driver aspeed_hace_driver = {
> +	.probe		= aspeed_hace_probe,
> +	.remove		= aspeed_hace_remove,
> +	.driver         = {
> +		.name   = KBUILD_MODNAME,
> +		.of_match_table = aspeed_hace_of_matches,
> +	},
> +};
> +
> +module_platform_driver(aspeed_hace_driver);
> +
> +MODULE_AUTHOR("Neal Liu <neal_liu@aspeedtech.com>");
> +MODULE_DESCRIPTION("Aspeed HACE Crypto Accelerator driver");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/crypto/aspeed/aspeed-hace.h b/drivers/crypto/aspeed/aspeed-hace.h
> new file mode 100644
> index 000000000000..3494ff22f69d
> --- /dev/null
> +++ b/drivers/crypto/aspeed/aspeed-hace.h
> @@ -0,0 +1,186 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +#ifndef __ASPEED_HACE_H__
> +#define __ASPEED_HACE_H__
> +
> +#include <linux/interrupt.h>
> +#include <linux/delay.h>
> +#include <linux/err.h>
> +#include <linux/fips.h>
> +#include <linux/dma-mapping.h>
> +#include <crypto/scatterwalk.h>
> +#include <crypto/internal/aead.h>
> +#include <crypto/internal/akcipher.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/internal/kpp.h>
> +#include <crypto/internal/skcipher.h>
> +#include <crypto/algapi.h>
> +#include <crypto/engine.h>
> +#include <crypto/hmac.h>
> +#include <crypto/sha1.h>
> +#include <crypto/sha2.h>
> +
> +/*****************************
> + *                           *
> + * HACE register definitions *
> + *                           *
> + * ***************************/
> +
> +#define ASPEED_HACE_STS			0x1C	/* HACE Status Register */
> +#define ASPEED_HACE_HASH_SRC		0x20	/* Hash Data Source Base Address Register */
> +#define ASPEED_HACE_HASH_DIGEST_BUFF	0x24	/* Hash Digest Write Buffer Base Address Register */
> +#define ASPEED_HACE_HASH_KEY_BUFF	0x28	/* Hash HMAC Key Buffer Base Address Register */
> +#define ASPEED_HACE_HASH_DATA_LEN	0x2C	/* Hash Data Length Register */
> +#define ASPEED_HACE_HASH_CMD		0x30	/* Hash Engine Command Register */
> +
> +/* interrupt status reg */
> +#define  HACE_HASH_ISR			BIT(9)
> +#define  HACE_HASH_BUSY			BIT(0)
> +
> +/* hash cmd reg */
> +#define  HASH_CMD_MBUS_REQ_SYNC_EN	BIT(20)
> +#define  HASH_CMD_HASH_SRC_SG_CTRL	BIT(18)
> +#define  HASH_CMD_SHA512_224		(0x3 << 10)
> +#define  HASH_CMD_SHA512_256		(0x2 << 10)
> +#define  HASH_CMD_SHA384		(0x1 << 10)
> +#define  HASH_CMD_SHA512		(0)
> +#define  HASH_CMD_INT_ENABLE		BIT(9)
> +#define  HASH_CMD_HMAC			(0x1 << 7)
> +#define  HASH_CMD_ACC_MODE		(0x2 << 7)
> +#define  HASH_CMD_HMAC_KEY		(0x3 << 7)
> +#define  HASH_CMD_SHA1			(0x2 << 4)
> +#define  HASH_CMD_SHA224		(0x4 << 4)
> +#define  HASH_CMD_SHA256		(0x5 << 4)
> +#define  HASH_CMD_SHA512_SER		(0x6 << 4)
> +#define  HASH_CMD_SHA_SWAP		(0x2 << 2)
> +
> +#define HASH_SG_LAST_LIST		BIT(31)
> +
> +#define CRYPTO_FLAGS_BUSY		BIT(1)
> +
> +#define SHA_OP_UPDATE			1
> +#define SHA_OP_FINAL			2
> +
> +#define SHA_FLAGS_SHA1			BIT(0)
> +#define SHA_FLAGS_SHA224		BIT(1)
> +#define SHA_FLAGS_SHA256		BIT(2)
> +#define SHA_FLAGS_SHA384		BIT(3)
> +#define SHA_FLAGS_SHA512		BIT(4)
> +#define SHA_FLAGS_SHA512_224		BIT(5)
> +#define SHA_FLAGS_SHA512_256		BIT(6)
> +#define SHA_FLAGS_HMAC			BIT(8)
> +#define SHA_FLAGS_FINUP			BIT(9)
> +#define SHA_FLAGS_MASK			(0xff)
> +
> +#define ASPEED_CRYPTO_SRC_DMA_BUF_LEN	0xa000
> +#define ASPEED_CRYPTO_DST_DMA_BUF_LEN	0xa000
> +#define ASPEED_CRYPTO_GCM_TAG_OFFSET	0x9ff0
> +#define ASPEED_HASH_SRC_DMA_BUF_LEN	0xa000
> +#define ASPEED_HASH_QUEUE_LENGTH	50
> +
> +struct aspeed_hace_dev;
> +
> +typedef int (*aspeed_hace_fn_t)(struct aspeed_hace_dev *);
> +
> +struct aspeed_sg_list {
> +	__le32 len;
> +	__le32 phy_addr;
> +};
> +
> +struct aspeed_engine_hash {
> +	struct tasklet_struct		done_task;
> +	unsigned long			flags;
> +	struct ahash_request		*req;
> +
> +	/* input buffer */
> +	void				*ahash_src_addr;
> +	dma_addr_t			ahash_src_dma_addr;
> +
> +	dma_addr_t			src_dma;
> +	dma_addr_t			digest_dma;
> +
> +	size_t				src_length;
> +
> +	/* callback func */
> +	aspeed_hace_fn_t		resume;
> +	aspeed_hace_fn_t		dma_prepare;
> +};
> +
> +struct aspeed_sha_hmac_ctx {
> +	struct crypto_shash *shash;
> +	u8 ipad[SHA512_BLOCK_SIZE];
> +	u8 opad[SHA512_BLOCK_SIZE];
> +};
> +
> +struct aspeed_sham_ctx {
> +	struct crypto_engine_ctx	enginectx;
> +
> +	struct aspeed_hace_dev		*hace_dev;
> +	unsigned long			flags;	/* hmac flag */
> +
> +	struct aspeed_sha_hmac_ctx	base[0];
> +};
> +
> +struct aspeed_sham_reqctx {
> +	unsigned long		flags;		/* SHA flags */
> +	unsigned long		op;		/* final or update */
> +	u32			cmd;		/* trigger cmd */
> +
> +	/* walk state */
> +	struct scatterlist	*src_sg;
> +	int			src_nents;
> +	unsigned int		offset;		/* offset in current sg */
> +	unsigned int		total;		/* per update length */
> +
> +	size_t			digsize;
> +	size_t			block_size;
> +	size_t			ivsize;
> +	const __be32		*sha_iv;
> +
> +	/* remain data buffer */
> +	u8			buffer[SHA512_BLOCK_SIZE * 2];
> +	dma_addr_t		buffer_dma_addr;
> +	size_t			bufcnt;		/* buffer counter */
> +
> +	/* output buffer */
> +	u8			digest[SHA512_DIGEST_SIZE] __aligned(64);
> +	dma_addr_t		digest_dma_addr;
> +	u64			digcnt[2];
> +};
> +
> +struct aspeed_hace_dev {
> +	void __iomem			*regs;
> +	struct device			*dev;
> +	int				irq;
> +	struct clk			*clk;
> +	unsigned long			version;
> +
> +	struct crypto_engine		*crypt_engine_hash;
> +
> +	struct aspeed_engine_hash	hash_engine;
> +};
> +
> +struct aspeed_hace_alg {
> +	struct aspeed_hace_dev		*hace_dev;
> +
> +	const char			*alg_base;
> +
> +	union {
> +		struct skcipher_alg	skcipher;
> +		struct ahash_alg	ahash;
> +	} alg;
> +};
> +
> +enum aspeed_version {
> +	AST2500_VERSION = 5,
> +	AST2600_VERSION
> +};
> +
> +#define ast_hace_write(hace, val, offset)	\
> +	writel((val), (hace)->regs + (offset))
> +#define ast_hace_read(hace, offset)		\
> +	readl((hace)->regs + (offset))
> +
> +void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
> +void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
> +
> +#endif
> 
Thanks.
Longfang.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
@ 2022-08-08  2:53     ` liulongfang
  0 siblings, 0 replies; 32+ messages in thread
From: liulongfang @ 2022-08-08  2:53 UTC (permalink / raw)
  To: Neal Liu, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed, linux-crypto, devicetree, linux-arm-kernel,
	linux-kernel, BMC-SW


On 2022/7/26 19:34, Neal Liu wrote:
> Hash and Crypto Engine (HACE) is designed to accelerate the
> throughput of hash data digest, encryption, and decryption.
> 
> Basically, HACE can be divided into two independent engines
> - Hash Engine and Crypto Engine. This patch aims to add HACE
> hash engine driver for hash accelerator.
> 
> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> ---
>  MAINTAINERS                              |    7 +
>  drivers/crypto/Kconfig                   |    1 +
>  drivers/crypto/Makefile                  |    1 +
>  drivers/crypto/aspeed/Kconfig            |   32 +
>  drivers/crypto/aspeed/Makefile           |    6 +
>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
>  8 files changed, 1835 insertions(+)
>  create mode 100644 drivers/crypto/aspeed/Kconfig
>  create mode 100644 drivers/crypto/aspeed/Makefile
>  create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f55aea311af5..23a0215b7e42 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3140,6 +3140,13 @@ S:	Maintained
>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
>  F:	drivers/media/platform/aspeed/
>  
> +ASPEED CRYPTO DRIVER
> +M:	Neal Liu <neal_liu@aspeedtech.com>
> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> +S:	Maintained
> +F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> +F:	drivers/crypto/aspeed/
> +
>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>  M:	Corentin Chary <corentin.chary@gmail.com>
>  L:	acpi4asus-user@lists.sourceforge.net
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index ee99c02c84e8..b9f5ee126881 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
>  	  acceleration for cryptographic algorithms on these devices.
>  
>  source "drivers/crypto/keembay/Kconfig"
> +source "drivers/crypto/aspeed/Kconfig"
>  
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index f81703a86b98..116de173a66c 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
> diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> new file mode 100644
> index 000000000000..059e627efef8
> --- /dev/null
> +++ b/drivers/crypto/aspeed/Kconfig
> @@ -0,0 +1,32 @@
> +config CRYPTO_DEV_ASPEED
> +	tristate "Support for Aspeed cryptographic engine driver"
> +	depends on ARCH_ASPEED
> +	help
> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> +	  throughput of hash data digest, encryption and decryption.
> +
> +	  Select y here to have support for the cryptographic driver
> +	  available on Aspeed SoC.
> +
> +config CRYPTO_DEV_ASPEED_HACE_HASH
> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> +	depends on CRYPTO_DEV_ASPEED
> +	select CRYPTO_ENGINE
> +	select CRYPTO_SHA1
> +	select CRYPTO_SHA256
> +	select CRYPTO_SHA512
> +	select CRYPTO_HMAC
> +	help
> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> +	  hash driver.
> +	  Supports multiple message digest standards, including
> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> +
> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> +	bool "Enable HACE hash debug messages"
> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> +	help
> +	  Print HACE hash debugging messages if you use this option.
> +	  Avoid enabling this option for production builds to
> +	  minimize the driver's timing overhead.
> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
> new file mode 100644
> index 000000000000..8bc8d4fed5a9
> --- /dev/null
> +++ b/drivers/crypto/aspeed/Makefile
> @@ -0,0 +1,6 @@
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> +aspeed_crypto-objs := aspeed-hace.o \
> +		      $(hace-hash-y)
> +
> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
> new file mode 100644
> index 000000000000..63a8ad694996
> --- /dev/null
> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> @@ -0,0 +1,1389 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (c) 2021 Aspeed Technology Inc.
> + */
> +
> +#include "aspeed-hace.h"
> +
> +#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> +#define AHASH_DBG(h, fmt, ...)	\
> +	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#else
> +#define AHASH_DBG(h, fmt, ...)	\
> +	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#endif
> +
> +/* Initialization Vectors for SHA-family */
> +static const __be32 sha1_iv[8] = {
> +	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
> +	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
> +	cpu_to_be32(SHA1_H4), 0, 0, 0
> +};
> +
> +static const __be32 sha224_iv[8] = {
> +	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
> +	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
> +	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
> +	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
> +};
> +
> +static const __be32 sha256_iv[8] = {
> +	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
> +	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
> +	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
> +	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
> +};
> +
> +static const __be64 sha384_iv[8] = {
> +	cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
> +	cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
> +	cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
> +	cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
> +};
> +
> +static const __be64 sha512_iv[8] = {
> +	cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
> +	cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
> +	cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
> +	cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
> +};
> +
> +static const __be32 sha512_224_iv[16] = {
> +	cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
> +	cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
> +	cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
> +	cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
> +	cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
> +	cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
> +	cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
> +	cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
> +};
> +
> +static const __be32 sha512_256_iv[16] = {
> +	cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
> +	cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
> +	cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
> +	cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
> +	cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
> +	cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
> +	cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
> +	cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
> +};
> +
> +/* The purpose of this padding is to ensure that the padded message is a
> + * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
> + * The bit "1" is appended at the end of the message, followed by
> + * "padlen-1" zero bits. Then a 64-bit block (SHA1/SHA224/SHA256) or a
> + * 128-bit block (SHA384/SHA512) containing the message length in bits
> + * is appended.
> + *
> + * For SHA1/SHA224/SHA256, padlen is calculated as follows:
> + *  - if message length < 56 bytes then padlen = 56 - message length
> + *  - else padlen = 64 + 56 - message length
> + *
> + * For SHA384/SHA512, padlen is calculated as follows:
> + *  - if message length < 112 bytes then padlen = 112 - message length
> + *  - else padlen = 128 + 112 - message length
> + */
> +static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
> +				      struct aspeed_sham_reqctx *rctx)
> +{
> +	unsigned int index, padlen;
> +	__be64 bits[2];
> +
> +	AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
> +
> +	switch (rctx->flags & SHA_FLAGS_MASK) {
> +	case SHA_FLAGS_SHA1:
> +	case SHA_FLAGS_SHA224:
> +	case SHA_FLAGS_SHA256:
> +		bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
> +		index = rctx->bufcnt & 0x3f;
> +		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
> +		*(rctx->buffer + rctx->bufcnt) = 0x80;
> +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
> +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
> +		rctx->bufcnt += padlen + 8;
> +		break;
> +	default:
> +		bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
> +		bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
> +				      rctx->digcnt[0] >> 61);
> +		index = rctx->bufcnt & 0x7f;
> +		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
> +		*(rctx->buffer + rctx->bufcnt) = 0x80;
> +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
> +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
> +		rctx->bufcnt += padlen + 16;
> +		break;
> +	}
> +}
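A quick worked example of the padding arithmetic above, just for
illustration: for SHA-256 with bufcnt = 60, index = 60 >= 56, so
padlen = 64 + 56 - 60 = 60; the 0x80 byte plus the zero fill account
for those 60 bytes, the length field adds 8 more, and bufcnt becomes
60 + 60 + 8 = 128, i.e. exactly two 64-byte blocks.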
> +
> +/*
> + * Prepare DMA buffer before hardware engine
> + * processing.
> + */
> +static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	int length, remain;
> +
> +	length = rctx->total + rctx->bufcnt;
> +	remain = length % rctx->block_size;
> +
> +	AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
> +
> +	if (rctx->bufcnt)
> +		memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
> +
> +	if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
> +		scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
> +					 rctx->bufcnt, rctx->src_sg,
> +					 rctx->offset, rctx->total - remain, 0);
> +		rctx->offset += rctx->total - remain;
> +
> +	} else {
> +		dev_warn(hace_dev->dev, "Hash data length is too large\n");
> +		return -EINVAL;
> +	}
> +
> +	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
> +				 rctx->offset, remain, 0);
> +
> +	rctx->bufcnt = remain;
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		return -ENOMEM;
> +	}
> +
> +	hash_engine->src_length = length - remain;
> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	return 0;
> +}
> +
> +/*
> + * Prepare DMA buffer as SG list buffer before
> + * hardware engine processing.
> + */
> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct aspeed_sg_list *src_list;
> +	struct scatterlist *s;
> +	int length, remain, sg_len, i;
> +	int rc = 0;
> +
> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> +	length = rctx->total + rctx->bufcnt - remain;
> +
> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> +		  "length", length, "remain", remain);
> +
> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> +			    DMA_TO_DEVICE);
> +	if (!sg_len) {
> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> +		rc = -ENOMEM;
> +		goto end;
> +	}
> +
> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		rc = -ENOMEM;
> +		goto free_src_sg;
> +	}
> +
> +	if (rctx->bufcnt != 0) {
> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> +						       rctx->buffer,
> +						       rctx->block_size * 2,
> +						       DMA_TO_DEVICE);
> +		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> +			rc = -ENOMEM;
> +			goto free_rctx_digest;
> +		}
> +
> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> +		src_list[0].len = rctx->bufcnt;
> +		length -= src_list[0].len;
> +
> +		/* Last sg list */
> +		if (length == 0)
> +			src_list[0].len |= HASH_SG_LAST_LIST;
> +
> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> +		src_list++;
> +	}
> +
> +	if (length != 0) {
> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> +			src_list[i].phy_addr = sg_dma_address(s);
> +
> +			if (length > sg_dma_len(s)) {
> +				src_list[i].len = sg_dma_len(s);
> +				length -= sg_dma_len(s);
> +
> +			} else {
> +				/* Last sg list */
> +				src_list[i].len = length;
> +				src_list[i].len |= HASH_SG_LAST_LIST;
> +				length = 0;
> +			}
> +
> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> +		}
> +	}
> +
> +	if (length != 0) {
> +		rc = -EINVAL;
> +		goto free_rctx_buffer;
> +	}
> +
> +	rctx->offset = rctx->total - remain;
> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	goto end;
Exiting the normal (success) path via "goto xx" is not recommended, since it takes two jumps;
returning 0 directly is more efficient.
This pattern appears many times throughout the driver, so please consider reworking it.
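For example, the tail of this function could become (untested, just to
sketch the idea):

	rctx->offset = rctx->total - remain;
	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
	hash_engine->digest_dma = rctx->digest_dma_addr;

	return 0;

free_rctx_buffer:
	if (rctx->bufcnt != 0)
		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
				 rctx->block_size * 2, DMA_TO_DEVICE);
free_rctx_digest:
	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
free_src_sg:
	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
		     DMA_TO_DEVICE);
	return rc;

with the earlier "goto end" on the dma_map_sg() failure turned into a
direct "return -ENOMEM", so the "end:" label and the extra jump on the
success path both go away.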
> +
> +free_rctx_buffer:
> +	if (rctx->bufcnt != 0)
> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> +free_rctx_digest:
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +free_src_sg:
> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> +		     DMA_TO_DEVICE);
> +end:
> +	return rc;
> +}
> +
> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> +
> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> +
> +	return 0;
> +}
> +
> +/*
> + * Copy digest to the corresponding request result.
> + * This function will be called at final() stage.
> + */
> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> +
> +	memcpy(req->result, rctx->digest, rctx->digsize);
> +
> +	return aspeed_ahash_complete(hace_dev);
> +}
> +
> +/*
> + * Trigger hardware engines to do the math.
> + */
> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
> +				     aspeed_hace_fn_t resume)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
> +		  hash_engine->src_dma, hash_engine->digest_dma,
> +		  hash_engine->src_length);
> +
> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> +	hash_engine->resume = resume;
> +
> +	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> +		       ASPEED_HACE_HASH_KEY_BUFF);
> +	ast_hace_write(hace_dev, hash_engine->src_length,
> +		       ASPEED_HACE_HASH_DATA_LEN);
> +
> +	/* Memory barrier to ensure all data setup before engine starts */
> +	mb();
> +
> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
Submitting one hardware request takes five register writes to complete.
In a multi-concurrency scenario, how is the ordering of these writes guaranteed?
(If two processes submit hardware tasks at the same time, how does the
hardware know which task the current write belongs to?)
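To make the concern concrete, a purely hypothetical interleaving of two
submitters would look like:

	CPU A: write HASH_SRC                              (request 1)
	CPU B: write HASH_SRC                              (request 2)
	CPU A: write DIGEST_BUFF, KEY_BUFF, DATA_LEN, CMD  (request 1)
	CPU B: write DIGEST_BUFF, KEY_BUFF, DATA_LEN, CMD  (request 2)

in which case the engine would start request 1 with request 2's source
address. Is something in the submission path guaranteed to prevent this?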
> +
> +	return -EINPROGRESS;
> +}
> +
> +/*
> + * HMAC resume performs the second pass, which produces
> + * the final HMAC code from the inner hash result and
> + * the outer key.
> + */
> +static int aspeed_ahash_hmac_resume(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +	int rc = 0;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> +
> +	/* o key pad + hash sum 1 */
> +	memcpy(rctx->buffer, bctx->opad, rctx->block_size);
> +	memcpy(rctx->buffer + rctx->block_size, rctx->digest, rctx->digsize);
> +
> +	rctx->bufcnt = rctx->block_size + rctx->digsize;
> +	rctx->digcnt[0] = rctx->block_size + rctx->digsize;
> +
> +	aspeed_ahash_fill_padding(hace_dev, rctx);
> +	memcpy(rctx->digest, rctx->sha_iv, rctx->ivsize);
> +
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		rc = -ENOMEM;
> +		goto end;
> +	}
> +
> +	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev, rctx->buffer,
> +					       rctx->block_size * 2,
> +					       DMA_TO_DEVICE);
> +	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> +		rc = -ENOMEM;
> +		goto free_rctx_digest;
> +	}
> +
> +	hash_engine->src_dma = rctx->buffer_dma_addr;
> +	hash_engine->src_length = rctx->bufcnt;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
> +
> +free_rctx_digest:
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +end:
> +	return rc;
> +}
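For reference, this is the outer pass of HMAC(K, m) =
H((K xor opad) || H((K xor ipad) || m)): at this point rctx->digest
holds the inner hash, bctx->opad already holds K xor opad from
setkey(), and the buffer is rebuilt as opad || inner-hash before being
run through the engine once more.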
> +
> +static int aspeed_ahash_req_final(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	int rc = 0;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	aspeed_ahash_fill_padding(hace_dev, rctx);
> +
> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
> +					       rctx->digest,
> +					       SHA512_DIGEST_SIZE,
> +					       DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> +		rc = -ENOMEM;
> +		goto end;
> +	}
> +
> +	rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> +					       rctx->buffer,
> +					       rctx->block_size * 2,
> +					       DMA_TO_DEVICE);
> +	if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> +		dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> +		rc = -ENOMEM;
> +		goto free_rctx_digest;
> +	}
> +
> +	hash_engine->src_dma = rctx->buffer_dma_addr;
> +	hash_engine->src_length = rctx->bufcnt;
> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> +
> +	if (rctx->flags & SHA_FLAGS_HMAC)
> +		return aspeed_hace_ahash_trigger(hace_dev,
> +						 aspeed_ahash_hmac_resume);
> +
> +	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
> +
> +free_rctx_digest:
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +end:
> +	return rc;
> +}
> +
> +static int aspeed_ahash_update_resume_sg(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> +		     DMA_TO_DEVICE);
> +
> +	if (rctx->bufcnt != 0)
> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> +				 rctx->block_size * 2,
> +				 DMA_TO_DEVICE);
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg, rctx->offset,
> +				 rctx->total - rctx->offset, 0);
> +
> +	rctx->bufcnt = rctx->total - rctx->offset;
> +	rctx->cmd &= ~HASH_CMD_HASH_SRC_SG_CTRL;
> +
> +	if (rctx->flags & SHA_FLAGS_FINUP)
> +		return aspeed_ahash_req_final(hace_dev);
> +
> +	return aspeed_ahash_complete(hace_dev);
> +}
> +
> +static int aspeed_ahash_update_resume(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> +
> +	if (rctx->flags & SHA_FLAGS_FINUP)
> +		return aspeed_ahash_req_final(hace_dev);
> +
> +	return aspeed_ahash_complete(hace_dev);
> +}
> +
> +static int aspeed_ahash_req_update(struct aspeed_hace_dev *hace_dev)
> +{
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	struct ahash_request *req = hash_engine->req;
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	aspeed_hace_fn_t resume;
> +	int ret;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	if (hace_dev->version == AST2600_VERSION) {
> +		rctx->cmd |= HASH_CMD_HASH_SRC_SG_CTRL;
> +		resume = aspeed_ahash_update_resume_sg;
> +
> +	} else {
> +		resume = aspeed_ahash_update_resume;
> +	}
> +
> +	ret = hash_engine->dma_prepare(hace_dev);
> +	if (ret)
> +		return ret;
> +
> +	return aspeed_hace_ahash_trigger(hace_dev, resume);
> +}
> +
> +static int aspeed_hace_hash_handle_queue(struct aspeed_hace_dev *hace_dev,
> +				  struct ahash_request *req)
> +{
> +	return crypto_transfer_hash_request_to_engine(
> +			hace_dev->crypt_engine_hash, req);
> +}
> +
> +static int aspeed_ahash_do_request(struct crypto_engine *engine, void *areq)
> +{
> +	struct ahash_request *req = ahash_request_cast(areq);
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_engine_hash *hash_engine;
> +	int ret = 0;
> +
> +	hash_engine = &hace_dev->hash_engine;
> +	hash_engine->flags |= CRYPTO_FLAGS_BUSY;
> +
> +	if (rctx->op == SHA_OP_UPDATE)
> +		ret = aspeed_ahash_req_update(hace_dev);
> +	else if (rctx->op == SHA_OP_FINAL)
> +		ret = aspeed_ahash_req_final(hace_dev);
> +
> +	if (ret != -EINPROGRESS)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int aspeed_ahash_prepare_request(struct crypto_engine *engine,
> +					void *areq)
> +{
> +	struct ahash_request *req = ahash_request_cast(areq);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_engine_hash *hash_engine;
> +
> +	hash_engine = &hace_dev->hash_engine;
> +	hash_engine->req = req;
> +
> +	if (hace_dev->version == AST2600_VERSION)
> +		hash_engine->dma_prepare = aspeed_ahash_dma_prepare_sg;
> +	else
> +		hash_engine->dma_prepare = aspeed_ahash_dma_prepare;
> +
> +	return 0;
> +}
> +
> +static int aspeed_sham_update(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +
> +	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
> +
> +	rctx->total = req->nbytes;
> +	rctx->src_sg = req->src;
> +	rctx->offset = 0;
> +	rctx->src_nents = sg_nents(req->src);
> +	rctx->op = SHA_OP_UPDATE;
> +
> +	rctx->digcnt[0] += rctx->total;
> +	if (rctx->digcnt[0] < rctx->total)
> +		rctx->digcnt[1]++;
> +
> +	if (rctx->bufcnt + rctx->total < rctx->block_size) {
> +		scatterwalk_map_and_copy(rctx->buffer + rctx->bufcnt,
> +					 rctx->src_sg, rctx->offset,
> +					 rctx->total, 0);
> +		rctx->bufcnt += rctx->total;
> +
> +		return 0;
> +	}
> +
> +	return aspeed_hace_hash_handle_queue(hace_dev, req);
> +}
> +
> +static int aspeed_sham_shash_digest(struct crypto_shash *tfm, u32 flags,
> +				    const u8 *data, unsigned int len, u8 *out)
> +{
> +	SHASH_DESC_ON_STACK(shash, tfm);
> +
> +	shash->tfm = tfm;
> +
> +	return crypto_shash_digest(shash, data, len, out);
> +}
> +
> +static int aspeed_sham_final(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +
> +	AHASH_DBG(hace_dev, "req->nbytes:%d, rctx->total:%d\n",
> +		  req->nbytes, rctx->total);
> +	rctx->op = SHA_OP_FINAL;
> +
> +	return aspeed_hace_hash_handle_queue(hace_dev, req);
> +}
> +
> +static int aspeed_sham_finup(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	int rc1, rc2;
> +
> +	AHASH_DBG(hace_dev, "req->nbytes: %d\n", req->nbytes);
> +
> +	rctx->flags |= SHA_FLAGS_FINUP;
> +
> +	rc1 = aspeed_sham_update(req);
> +	if (rc1 == -EINPROGRESS || rc1 == -EBUSY)
> +		return rc1;
> +
> +	/*
> +	 * final() always has to be called to clean up resources,
> +	 * even if update() failed, except for -EINPROGRESS
> +	 */
> +	rc2 = aspeed_sham_final(req);
> +
> +	return rc1 ? : rc2;
> +}
> +
> +static int aspeed_sham_init(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +	AHASH_DBG(hace_dev, "%s: digest size:%d\n",
> +		  crypto_tfm_alg_name(&tfm->base),
> +		  crypto_ahash_digestsize(tfm));
> +
> +	rctx->cmd = HASH_CMD_ACC_MODE;
> +	rctx->flags = 0;
> +
> +	switch (crypto_ahash_digestsize(tfm)) {
> +	case SHA1_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA1 | HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA1;
> +		rctx->digsize = SHA1_DIGEST_SIZE;
> +		rctx->block_size = SHA1_BLOCK_SIZE;
> +		rctx->sha_iv = sha1_iv;
> +		rctx->ivsize = 32;
> +		memcpy(rctx->digest, sha1_iv, rctx->ivsize);
> +		break;
> +	case SHA224_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA224 | HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA224;
> +		rctx->digsize = SHA224_DIGEST_SIZE;
> +		rctx->block_size = SHA224_BLOCK_SIZE;
> +		rctx->sha_iv = sha224_iv;
> +		rctx->ivsize = 32;
> +		memcpy(rctx->digest, sha224_iv, rctx->ivsize);
> +		break;
> +	case SHA256_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA256 | HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA256;
> +		rctx->digsize = SHA256_DIGEST_SIZE;
> +		rctx->block_size = SHA256_BLOCK_SIZE;
> +		rctx->sha_iv = sha256_iv;
> +		rctx->ivsize = 32;
> +		memcpy(rctx->digest, sha256_iv, rctx->ivsize);
> +		break;
> +	case SHA384_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA384 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA384;
> +		rctx->digsize = SHA384_DIGEST_SIZE;
> +		rctx->block_size = SHA384_BLOCK_SIZE;
> +		rctx->sha_iv = (const __be32 *)sha384_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha384_iv, rctx->ivsize);
> +		break;
> +	case SHA512_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA512;
> +		rctx->digsize = SHA512_DIGEST_SIZE;
> +		rctx->block_size = SHA512_BLOCK_SIZE;
> +		rctx->sha_iv = (const __be32 *)sha512_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha512_iv, rctx->ivsize);
> +		break;
> +	default:
> +		dev_warn(tctx->hace_dev->dev, "digest size %d not support\n",
> +			 crypto_ahash_digestsize(tfm));
> +		return -EINVAL;
> +	}
> +
> +	rctx->bufcnt = 0;
> +	rctx->total = 0;
> +	rctx->digcnt[0] = 0;
> +	rctx->digcnt[1] = 0;
> +
> +	/* HMAC init */
> +	if (tctx->flags & SHA_FLAGS_HMAC) {
> +		rctx->digcnt[0] = rctx->block_size;
> +		rctx->bufcnt = rctx->block_size;
> +		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
> +		rctx->flags |= SHA_FLAGS_HMAC;
> +	}
> +
> +	return 0;
> +}
> +
> +static int aspeed_sha512s_init(struct ahash_request *req)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +	AHASH_DBG(hace_dev, "digest size: %d\n", crypto_ahash_digestsize(tfm));
> +
> +	rctx->cmd = HASH_CMD_ACC_MODE;
> +	rctx->flags = 0;
> +
> +	switch (crypto_ahash_digestsize(tfm)) {
> +	case SHA224_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_224 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA512_224;
> +		rctx->digsize = SHA224_DIGEST_SIZE;
> +		rctx->block_size = SHA512_BLOCK_SIZE;
> +		rctx->sha_iv = sha512_224_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha512_224_iv, rctx->ivsize);
> +		break;
> +	case SHA256_DIGEST_SIZE:
> +		rctx->cmd |= HASH_CMD_SHA512_SER | HASH_CMD_SHA512_256 |
> +			     HASH_CMD_SHA_SWAP;
> +		rctx->flags |= SHA_FLAGS_SHA512_256;
> +		rctx->digsize = SHA256_DIGEST_SIZE;
> +		rctx->block_size = SHA512_BLOCK_SIZE;
> +		rctx->sha_iv = sha512_256_iv;
> +		rctx->ivsize = 64;
> +		memcpy(rctx->digest, sha512_256_iv, rctx->ivsize);
> +		break;
> +	default:
> +		dev_warn(tctx->hace_dev->dev, "digest size %d not support\n",
> +			 crypto_ahash_digestsize(tfm));
> +		return -EINVAL;
> +	}
> +
> +	rctx->bufcnt = 0;
> +	rctx->total = 0;
> +	rctx->digcnt[0] = 0;
> +	rctx->digcnt[1] = 0;
> +
> +	/* HMAC init */
> +	if (tctx->flags & SHA_FLAGS_HMAC) {
> +		rctx->digcnt[0] = rctx->block_size;
> +		rctx->bufcnt = rctx->block_size;
> +		memcpy(rctx->buffer, bctx->ipad, rctx->block_size);
> +		rctx->flags |= SHA_FLAGS_HMAC;
> +	}
> +
> +	return 0;
> +}
> +
> +static int aspeed_sham_digest(struct ahash_request *req)
> +{
> +	return aspeed_sham_init(req) ? : aspeed_sham_finup(req);
> +}
> +
> +static int aspeed_sham_setkey(struct crypto_ahash *tfm, const u8 *key,
> +			      unsigned int keylen)
> +{
> +	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +	struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +	int ds = crypto_shash_digestsize(bctx->shash);
> +	int bs = crypto_shash_blocksize(bctx->shash);
> +	int err = 0;
> +	int i;
> +
> +	AHASH_DBG(hace_dev, "%s: keylen:%d\n", crypto_tfm_alg_name(&tfm->base),
> +		  keylen);
> +
> +	if (keylen > bs) {
> +		err = aspeed_sham_shash_digest(bctx->shash,
> +					       crypto_shash_get_flags(bctx->shash),
> +					       key, keylen, bctx->ipad);
> +		if (err)
> +			return err;
> +		keylen = ds;
> +
> +	} else {
> +		memcpy(bctx->ipad, key, keylen);
> +	}
> +
> +	memset(bctx->ipad + keylen, 0, bs - keylen);
> +	memcpy(bctx->opad, bctx->ipad, bs);
> +
> +	for (i = 0; i < bs; i++) {
> +		bctx->ipad[i] ^= HMAC_IPAD_VALUE;
> +		bctx->opad[i] ^= HMAC_OPAD_VALUE;
> +	}
> +
> +	return err;
> +}
> +
> +static int aspeed_sham_cra_init(struct crypto_tfm *tfm)
> +{
> +	struct ahash_alg *alg = __crypto_ahash_alg(tfm->__crt_alg);
> +	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
> +	struct aspeed_hace_alg *ast_alg;
> +
> +	ast_alg = container_of(alg, struct aspeed_hace_alg, alg.ahash);
> +	tctx->hace_dev = ast_alg->hace_dev;
> +	tctx->flags = 0;
> +
> +	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
> +				 sizeof(struct aspeed_sham_reqctx));
> +
> +	if (ast_alg->alg_base) {
> +		/* hmac related */
> +		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +		tctx->flags |= SHA_FLAGS_HMAC;
> +		bctx->shash = crypto_alloc_shash(ast_alg->alg_base, 0,
> +						 CRYPTO_ALG_NEED_FALLBACK);
> +		if (IS_ERR(bctx->shash)) {
> +			dev_warn(ast_alg->hace_dev->dev,
> +				 "base driver '%s' could not be loaded.\n",
> +				 ast_alg->alg_base);
> +			return PTR_ERR(bctx->shash);
> +		}
> +	}
> +
> +	tctx->enginectx.op.do_one_request = aspeed_ahash_do_request;
> +	tctx->enginectx.op.prepare_request = aspeed_ahash_prepare_request;
> +	tctx->enginectx.op.unprepare_request = NULL;
> +
> +	return 0;
> +}
> +
> +static void aspeed_sham_cra_exit(struct crypto_tfm *tfm)
> +{
> +	struct aspeed_sham_ctx *tctx = crypto_tfm_ctx(tfm);
> +	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
> +
> +	AHASH_DBG(hace_dev, "%s\n", crypto_tfm_alg_name(tfm));
> +
> +	if (tctx->flags & SHA_FLAGS_HMAC) {
> +		struct aspeed_sha_hmac_ctx *bctx = tctx->base;
> +
> +		crypto_free_shash(bctx->shash);
> +	}
> +}
> +
> +static int aspeed_sham_export(struct ahash_request *req, void *out)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	memcpy(out, rctx, sizeof(*rctx));
> +
> +	return 0;
> +}
> +
> +static int aspeed_sham_import(struct ahash_request *req, const void *in)
> +{
> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	memcpy(rctx, in, sizeof(*rctx));
> +
> +	return 0;
> +}
> +
> +struct aspeed_hace_alg aspeed_ahash_algs[] = {
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA1_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha1",
> +					.cra_driver_name	= "aspeed-sha1",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA1_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha256",
> +					.cra_driver_name	= "aspeed-sha256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA256_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha224",
> +					.cra_driver_name	= "aspeed-sha224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA224_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha1",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA1_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha1)",
> +					.cra_driver_name	= "aspeed-hmac-sha1",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA1_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha224",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha224)",
> +					.cra_driver_name	= "aspeed-hmac-sha224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA224_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha256",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha256)",
> +					.cra_driver_name	= "aspeed-hmac-sha256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA256_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +};
> +
> +struct aspeed_hace_alg aspeed_ahash_algs_g6[] = {
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA384_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha384",
> +					.cra_driver_name	= "aspeed-sha384",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA384_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA512_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha512",
> +					.cra_driver_name	= "aspeed-sha512",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha512_224",
> +					.cra_driver_name	= "aspeed-sha512_224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "sha512_256",
> +					.cra_driver_name	= "aspeed-sha512_256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha384",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA384_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha384)",
> +					.cra_driver_name	= "aspeed-hmac-sha384",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA384_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha512",
> +		.alg.ahash = {
> +			.init	= aspeed_sham_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA512_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha512)",
> +					.cra_driver_name	= "aspeed-hmac-sha512",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha512_224",
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA224_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha512_224)",
> +					.cra_driver_name	= "aspeed-hmac-sha512_224",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +	{
> +		.alg_base = "sha512_256",
> +		.alg.ahash = {
> +			.init	= aspeed_sha512s_init,
> +			.update	= aspeed_sham_update,
> +			.final	= aspeed_sham_final,
> +			.finup	= aspeed_sham_finup,
> +			.digest	= aspeed_sham_digest,
> +			.setkey	= aspeed_sham_setkey,
> +			.export	= aspeed_sham_export,
> +			.import	= aspeed_sham_import,
> +			.halg = {
> +				.digestsize = SHA256_DIGEST_SIZE,
> +				.statesize = sizeof(struct aspeed_sham_reqctx),
> +				.base = {
> +					.cra_name		= "hmac(sha512_256)",
> +					.cra_driver_name	= "aspeed-hmac-sha512_256",
> +					.cra_priority		= 300,
> +					.cra_flags		= CRYPTO_ALG_TYPE_AHASH |
> +								  CRYPTO_ALG_ASYNC |
> +								  CRYPTO_ALG_KERN_DRIVER_ONLY,
> +					.cra_blocksize		= SHA512_BLOCK_SIZE,
> +					.cra_ctxsize		= sizeof(struct aspeed_sham_ctx) +
> +								sizeof(struct aspeed_sha_hmac_ctx),
> +					.cra_alignmask		= 0,
> +					.cra_module		= THIS_MODULE,
> +					.cra_init		= aspeed_sham_cra_init,
> +					.cra_exit		= aspeed_sham_cra_exit,
> +				}
> +			}
> +		},
> +	},
> +};
> +
> +void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++)
> +		crypto_unregister_ahash(&aspeed_ahash_algs[i].alg.ahash);
> +
> +	if (hace_dev->version != AST2600_VERSION)
> +		return;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++)
> +		crypto_unregister_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
> +}
> +
> +void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	int rc, i;
> +
> +	AHASH_DBG(hace_dev, "\n");
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs); i++) {
> +		aspeed_ahash_algs[i].hace_dev = hace_dev;
> +		rc = crypto_register_ahash(&aspeed_ahash_algs[i].alg.ahash);
> +		if (rc) {
> +			AHASH_DBG(hace_dev, "Failed to register %s\n",
> +				  aspeed_ahash_algs[i].alg.ahash.halg.base.cra_name);
> +		}
> +	}
> +
> +	if (hace_dev->version != AST2600_VERSION)
> +		return;
> +
> +	for (i = 0; i < ARRAY_SIZE(aspeed_ahash_algs_g6); i++) {
> +		aspeed_ahash_algs_g6[i].hace_dev = hace_dev;
> +		rc = crypto_register_ahash(&aspeed_ahash_algs_g6[i].alg.ahash);
> +		if (rc) {
> +			AHASH_DBG(hace_dev, "Failed to register %s\n",
> +				  aspeed_ahash_algs_g6[i].alg.ahash.halg.base.cra_name);
> +		}
> +	}
> +}
> diff --git a/drivers/crypto/aspeed/aspeed-hace.c b/drivers/crypto/aspeed/aspeed-hace.c
> new file mode 100644
> index 000000000000..89b1585d72e2
> --- /dev/null
> +++ b/drivers/crypto/aspeed/aspeed-hace.c
> @@ -0,0 +1,213 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (c) 2021 Aspeed Technology Inc.
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/of_irq.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +
> +#include "aspeed-hace.h"
> +
> +#ifdef ASPEED_HACE_DEBUG
> +#define HACE_DBG(d, fmt, ...)	\
> +	dev_info((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#else
> +#define HACE_DBG(d, fmt, ...)	\
> +	dev_dbg((d)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> +#endif
> +
> +/* Weak function for HACE hash */
> +void __weak aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> +}
> +
> +void __weak aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev)
> +{
> +	dev_warn(hace_dev->dev, "%s: Not supported yet\n", __func__);
> +}
> +
> +/* HACE interrupt service routine */
> +static irqreturn_t aspeed_hace_irq(int irq, void *dev)
> +{
> +	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)dev;
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +	u32 sts;
> +
> +	sts = ast_hace_read(hace_dev, ASPEED_HACE_STS);
> +	ast_hace_write(hace_dev, sts, ASPEED_HACE_STS);
> +
> +	HACE_DBG(hace_dev, "irq status: 0x%x\n", sts);
> +
> +	if (sts & HACE_HASH_ISR) {
> +		if (hash_engine->flags & CRYPTO_FLAGS_BUSY)
> +			tasklet_schedule(&hash_engine->done_task);
> +		else
> +			dev_warn(hace_dev->dev, "HASH no active requests.\n");
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static void aspeed_hace_hash_done_task(unsigned long data)
> +{
> +	struct aspeed_hace_dev *hace_dev = (struct aspeed_hace_dev *)data;
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +
> +	hash_engine->resume(hace_dev);
> +}
> +
> +static void aspeed_hace_register(struct aspeed_hace_dev *hace_dev)
> +{
> +	aspeed_register_hace_hash_algs(hace_dev);
> +}
> +
> +static void aspeed_hace_unregister(struct aspeed_hace_dev *hace_dev)
> +{
> +	aspeed_unregister_hace_hash_algs(hace_dev);
> +}
> +
> +static const struct of_device_id aspeed_hace_of_matches[] = {
> +	{ .compatible = "aspeed,ast2500-hace", .data = (void *)5, },
> +	{ .compatible = "aspeed,ast2600-hace", .data = (void *)6, },
> +	{},
> +};
> +
> +static int aspeed_hace_probe(struct platform_device *pdev)
> +{
> +	const struct of_device_id *hace_dev_id;
> +	struct aspeed_engine_hash *hash_engine;
> +	struct aspeed_hace_dev *hace_dev;
> +	struct resource *res;
> +	int rc;
> +
> +	hace_dev = devm_kzalloc(&pdev->dev, sizeof(struct aspeed_hace_dev),
> +				GFP_KERNEL);
> +	if (!hace_dev)
> +		return -ENOMEM;
> +
> +	hace_dev_id = of_match_device(aspeed_hace_of_matches, &pdev->dev);
> +	if (!hace_dev_id) {
> +		dev_err(&pdev->dev, "Failed to match hace dev id\n");
> +		return -EINVAL;
> +	}
> +
> +	hace_dev->dev = &pdev->dev;
> +	hace_dev->version = (unsigned long)hace_dev_id->data;
> +	hash_engine = &hace_dev->hash_engine;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +
> +	platform_set_drvdata(pdev, hace_dev);
> +
> +	hace_dev->regs = devm_ioremap_resource(&pdev->dev, res);
> +	if (!hace_dev->regs) {
> +		dev_err(&pdev->dev, "Failed to map resources\n");
> +		return -ENOMEM;
> +	}
> +
> +	/* Get irq number and register it */
> +	hace_dev->irq = platform_get_irq(pdev, 0);
> +	if (!hace_dev->irq) {
> +		dev_err(&pdev->dev, "Failed to get interrupt\n");
> +		return -ENXIO;
> +	}
> +
> +	rc = devm_request_irq(&pdev->dev, hace_dev->irq, aspeed_hace_irq, 0,
> +			      dev_name(&pdev->dev), hace_dev);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Failed to request interrupt\n");
> +		return rc;
> +	}
> +
> +	/* Get clk and enable it */
> +	hace_dev->clk = devm_clk_get(&pdev->dev, NULL);
> +	if (IS_ERR(hace_dev->clk)) {
> +		dev_err(&pdev->dev, "Failed to get clk\n");
> +		return -ENODEV;
> +	}
> +
> +	rc = clk_prepare_enable(hace_dev->clk);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Failed to enable clock 0x%x\n", rc);
> +		return rc;
> +	}
> +
> +	/* Initialize crypto hardware engine structure for hash */
> +	hace_dev->crypt_engine_hash = crypto_engine_alloc_init(hace_dev->dev,
> +							       true);
> +	if (!hace_dev->crypt_engine_hash) {
> +		rc = -ENOMEM;
> +		goto clk_exit;
> +	}
> +
> +	rc = crypto_engine_start(hace_dev->crypt_engine_hash);
> +	if (rc)
> +		goto err_engine_hash_start;
> +
> +	tasklet_init(&hash_engine->done_task, aspeed_hace_hash_done_task,
> +		     (unsigned long)hace_dev);
> +
> +	/* Allocate DMA buffer for hash engine input used */
> +	hash_engine->ahash_src_addr =
> +		dmam_alloc_coherent(&pdev->dev,
> +				    ASPEED_HASH_SRC_DMA_BUF_LEN,
> +				    &hash_engine->ahash_src_dma_addr,
> +				    GFP_KERNEL);
> +	if (!hash_engine->ahash_src_addr) {
> +		dev_err(&pdev->dev, "Failed to allocate dma buffer\n");
> +		rc = -ENOMEM;
> +		goto err_engine_hash_start;
> +	}
> +
> +	aspeed_hace_register(hace_dev);
> +
> +	dev_info(&pdev->dev, "Aspeed Crypto Accelerator successfully registered\n");
> +
> +	return 0;
> +
> +err_engine_hash_start:
> +	crypto_engine_exit(hace_dev->crypt_engine_hash);
> +clk_exit:
> +	clk_disable_unprepare(hace_dev->clk);
> +
> +	return rc;
> +}
> +
> +static int aspeed_hace_remove(struct platform_device *pdev)
> +{
> +	struct aspeed_hace_dev *hace_dev = platform_get_drvdata(pdev);
> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> +
> +	aspeed_hace_unregister(hace_dev);
> +
> +	crypto_engine_exit(hace_dev->crypt_engine_hash);
> +
> +	tasklet_kill(&hash_engine->done_task);
> +
> +	clk_disable_unprepare(hace_dev->clk);
> +
> +	return 0;
> +}
> +
> +MODULE_DEVICE_TABLE(of, aspeed_hace_of_matches);
> +
> +static struct platform_driver aspeed_hace_driver = {
> +	.probe		= aspeed_hace_probe,
> +	.remove		= aspeed_hace_remove,
> +	.driver         = {
> +		.name   = KBUILD_MODNAME,
> +		.of_match_table = aspeed_hace_of_matches,
> +	},
> +};
> +
> +module_platform_driver(aspeed_hace_driver);
> +
> +MODULE_AUTHOR("Neal Liu <neal_liu@aspeedtech.com>");
> +MODULE_DESCRIPTION("Aspeed HACE driver Crypto Accelerator");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/crypto/aspeed/aspeed-hace.h b/drivers/crypto/aspeed/aspeed-hace.h
> new file mode 100644
> index 000000000000..3494ff22f69d
> --- /dev/null
> +++ b/drivers/crypto/aspeed/aspeed-hace.h
> @@ -0,0 +1,186 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +#ifndef __ASPEED_HACE_H__
> +#define __ASPEED_HACE_H__
> +
> +#include <linux/interrupt.h>
> +#include <linux/delay.h>
> +#include <linux/err.h>
> +#include <linux/fips.h>
> +#include <linux/dma-mapping.h>
> +#include <crypto/scatterwalk.h>
> +#include <crypto/internal/aead.h>
> +#include <crypto/internal/akcipher.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/internal/kpp.h>
> +#include <crypto/internal/skcipher.h>
> +#include <crypto/algapi.h>
> +#include <crypto/engine.h>
> +#include <crypto/hmac.h>
> +#include <crypto/sha1.h>
> +#include <crypto/sha2.h>
> +
> +/*****************************
> + *                           *
> + * HACE register definitions *
> + *                           *
> + * ***************************/
> +
> +#define ASPEED_HACE_STS			0x1C	/* HACE Status Register */
> +#define ASPEED_HACE_HASH_SRC		0x20	/* Hash Data Source Base Address Register */
> +#define ASPEED_HACE_HASH_DIGEST_BUFF	0x24	/* Hash Digest Write Buffer Base Address Register */
> +#define ASPEED_HACE_HASH_KEY_BUFF	0x28	/* Hash HMAC Key Buffer Base Address Register */
> +#define ASPEED_HACE_HASH_DATA_LEN	0x2C	/* Hash Data Length Register */
> +#define ASPEED_HACE_HASH_CMD		0x30	/* Hash Engine Command Register */
> +
> +/* interrupt status reg */
> +#define  HACE_HASH_ISR			BIT(9)
> +#define  HACE_HASH_BUSY			BIT(0)
> +
> +/* hash cmd reg */
> +#define  HASH_CMD_MBUS_REQ_SYNC_EN	BIT(20)
> +#define  HASH_CMD_HASH_SRC_SG_CTRL	BIT(18)
> +#define  HASH_CMD_SHA512_224		(0x3 << 10)
> +#define  HASH_CMD_SHA512_256		(0x2 << 10)
> +#define  HASH_CMD_SHA384		(0x1 << 10)
> +#define  HASH_CMD_SHA512		(0)
> +#define  HASH_CMD_INT_ENABLE		BIT(9)
> +#define  HASH_CMD_HMAC			(0x1 << 7)
> +#define  HASH_CMD_ACC_MODE		(0x2 << 7)
> +#define  HASH_CMD_HMAC_KEY		(0x3 << 7)
> +#define  HASH_CMD_SHA1			(0x2 << 4)
> +#define  HASH_CMD_SHA224		(0x4 << 4)
> +#define  HASH_CMD_SHA256		(0x5 << 4)
> +#define  HASH_CMD_SHA512_SER		(0x6 << 4)
> +#define  HASH_CMD_SHA_SWAP		(0x2 << 2)
> +
> +#define HASH_SG_LAST_LIST		BIT(31)
> +
> +#define CRYPTO_FLAGS_BUSY		BIT(1)
> +
> +#define SHA_OP_UPDATE			1
> +#define SHA_OP_FINAL			2
> +
> +#define SHA_FLAGS_SHA1			BIT(0)
> +#define SHA_FLAGS_SHA224		BIT(1)
> +#define SHA_FLAGS_SHA256		BIT(2)
> +#define SHA_FLAGS_SHA384		BIT(3)
> +#define SHA_FLAGS_SHA512		BIT(4)
> +#define SHA_FLAGS_SHA512_224		BIT(5)
> +#define SHA_FLAGS_SHA512_256		BIT(6)
> +#define SHA_FLAGS_HMAC			BIT(8)
> +#define SHA_FLAGS_FINUP			BIT(9)
> +#define SHA_FLAGS_MASK			(0xff)
> +
> +#define ASPEED_CRYPTO_SRC_DMA_BUF_LEN	0xa000
> +#define ASPEED_CRYPTO_DST_DMA_BUF_LEN	0xa000
> +#define ASPEED_CRYPTO_GCM_TAG_OFFSET	0x9ff0
> +#define ASPEED_HASH_SRC_DMA_BUF_LEN	0xa000
> +#define ASPEED_HASH_QUEUE_LENGTH	50
> +
> +struct aspeed_hace_dev;
> +
> +typedef int (*aspeed_hace_fn_t)(struct aspeed_hace_dev *);
> +
> +struct aspeed_sg_list {
> +	__le32 len;
> +	__le32 phy_addr;
> +};
> +
> +struct aspeed_engine_hash {
> +	struct tasklet_struct		done_task;
> +	unsigned long			flags;
> +	struct ahash_request		*req;
> +
> +	/* input buffer */
> +	void				*ahash_src_addr;
> +	dma_addr_t			ahash_src_dma_addr;
> +
> +	dma_addr_t			src_dma;
> +	dma_addr_t			digest_dma;
> +
> +	size_t				src_length;
> +
> +	/* callback func */
> +	aspeed_hace_fn_t		resume;
> +	aspeed_hace_fn_t		dma_prepare;
> +};
> +
> +struct aspeed_sha_hmac_ctx {
> +	struct crypto_shash *shash;
> +	u8 ipad[SHA512_BLOCK_SIZE];
> +	u8 opad[SHA512_BLOCK_SIZE];
> +};
> +
> +struct aspeed_sham_ctx {
> +	struct crypto_engine_ctx	enginectx;
> +
> +	struct aspeed_hace_dev		*hace_dev;
> +	unsigned long			flags;	/* hmac flag */
> +
> +	struct aspeed_sha_hmac_ctx	base[0];
> +};
> +
> +struct aspeed_sham_reqctx {
> +	unsigned long		flags;		/* final update flag should no use*/
> +	unsigned long		op;		/* final or update */
> +	u32			cmd;		/* trigger cmd */
> +
> +	/* walk state */
> +	struct scatterlist	*src_sg;
> +	int			src_nents;
> +	unsigned int		offset;		/* offset in current sg */
> +	unsigned int		total;		/* per update length */
> +
> +	size_t			digsize;
> +	size_t			block_size;
> +	size_t			ivsize;
> +	const __be32		*sha_iv;
> +
> +	/* remain data buffer */
> +	u8			buffer[SHA512_BLOCK_SIZE * 2];
> +	dma_addr_t		buffer_dma_addr;
> +	size_t			bufcnt;		/* buffer counter */
> +
> +	/* output buffer */
> +	u8			digest[SHA512_DIGEST_SIZE] __aligned(64);
> +	dma_addr_t		digest_dma_addr;
> +	u64			digcnt[2];
> +};
> +
> +struct aspeed_hace_dev {
> +	void __iomem			*regs;
> +	struct device			*dev;
> +	int				irq;
> +	struct clk			*clk;
> +	unsigned long			version;
> +
> +	struct crypto_engine		*crypt_engine_hash;
> +
> +	struct aspeed_engine_hash	hash_engine;
> +};
> +
> +struct aspeed_hace_alg {
> +	struct aspeed_hace_dev		*hace_dev;
> +
> +	const char			*alg_base;
> +
> +	union {
> +		struct skcipher_alg	skcipher;
> +		struct ahash_alg	ahash;
> +	} alg;
> +};
> +
> +enum aspeed_version {
> +	AST2500_VERSION = 5,
> +	AST2600_VERSION
> +};
> +
> +#define ast_hace_write(hace, val, offset)	\
> +	writel((val), (hace)->regs + (offset))
> +#define ast_hace_read(hace, offset)		\
> +	readl((hace)->regs + (offset))
> +
> +void aspeed_register_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
> +void aspeed_unregister_hace_hash_algs(struct aspeed_hace_dev *hace_dev);
> +
> +#endif
> 
Thanks.
Longfang.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-08-08  2:53     ` liulongfang
@ 2022-08-08  9:30       ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-08-08  9:30 UTC (permalink / raw)
  To: liulongfang, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> -----Original Message-----
> From: liulongfang <liulongfang@huawei.com>
> Sent: Monday, August 8, 2022 10:53 AM
> To: Neal Liu <neal_liu@aspeedtech.com>; Corentin Labbe
> <clabbe.montjoie@gmail.com>; Christophe JAILLET
> <christophe.jaillet@wanadoo.fr>; Randy Dunlap <rdunlap@infradead.org>;
> Herbert Xu <herbert@gondor.apana.org.au>; David S . Miller
> <davem@davemloft.net>; Rob Herring <robh+dt@kernel.org>; Krzysztof
> Kozlowski <krzysztof.kozlowski+dt@linaro.org>; Joel Stanley <joel@jms.id.au>;
> Andrew Jeffery <andrew@aj.id.au>; Dhananjay Phadke
> <dhphadke@microsoft.com>; Johnny Huang
> <johnny_huang@aspeedtech.com>
> Cc: linux-aspeed@lists.ozlabs.org; linux-crypto@vger.kernel.org;
> devicetree@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> linux-kernel@vger.kernel.org; BMC-SW <BMC-SW@aspeedtech.com>
> Subject: Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
> 
> 
> On 2022/7/26 19:34, Neal Liu wrote:
> > Hash and Crypto Engine (HACE) is designed to accelerate the
> > throughput of hash data digest, encryption, and decryption.
> >
> > Basically, HACE can be divided into two independently engines
> > - Hash Engine and Crypto Engine. This patch aims to add HACE
> > hash engine driver for hash accelerator.
> >
> > Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> > Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> > ---
> >  MAINTAINERS                              |    7 +
> >  drivers/crypto/Kconfig                   |    1 +
> >  drivers/crypto/Makefile                  |    1 +
> >  drivers/crypto/aspeed/Kconfig            |   32 +
> >  drivers/crypto/aspeed/Makefile           |    6 +
> >  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
> >  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
> >  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
> >  8 files changed, 1835 insertions(+)
> >  create mode 100644 drivers/crypto/aspeed/Kconfig
> >  create mode 100644 drivers/crypto/aspeed/Makefile
> >  create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
> >  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
> >  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index f55aea311af5..23a0215b7e42 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3140,6 +3140,13 @@ S:	Maintained
> >  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
> >  F:	drivers/media/platform/aspeed/
> >
> > +ASPEED CRYPTO DRIVER
> > +M:	Neal Liu <neal_liu@aspeedtech.com>
> > +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> > +S:	Maintained
> > +F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> > +F:	drivers/crypto/aspeed/
> > +
> >  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
> >  M:	Corentin Chary <corentin.chary@gmail.com>
> >  L:	acpi4asus-user@lists.sourceforge.net
> > diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> > index ee99c02c84e8..b9f5ee126881 100644
> > --- a/drivers/crypto/Kconfig
> > +++ b/drivers/crypto/Kconfig
> > @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
> >  	  acceleration for cryptographic algorithms on these devices.
> >
> >  source "drivers/crypto/keembay/Kconfig"
> > +source "drivers/crypto/aspeed/Kconfig"
> >
> >  endif # CRYPTO_HW
> > diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> > index f81703a86b98..116de173a66c 100644
> > --- a/drivers/crypto/Makefile
> > +++ b/drivers/crypto/Makefile
> > @@ -1,5 +1,6 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> > +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
> >  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
> >  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
> >  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
> > diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> > new file mode 100644
> > index 000000000000..059e627efef8
> > --- /dev/null
> > +++ b/drivers/crypto/aspeed/Kconfig
> > @@ -0,0 +1,32 @@
> > +config CRYPTO_DEV_ASPEED
> > +	tristate "Support for Aspeed cryptographic engine driver"
> > +	depends on ARCH_ASPEED
> > +	help
> > +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> > +	  throughput of hash data digest, encryption and decryption.
> > +
> > +	  Select y here to have support for the cryptographic driver
> > +	  available on Aspeed SoC.
> > +
> > +config CRYPTO_DEV_ASPEED_HACE_HASH
> > +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> > +	depends on CRYPTO_DEV_ASPEED
> > +	select CRYPTO_ENGINE
> > +	select CRYPTO_SHA1
> > +	select CRYPTO_SHA256
> > +	select CRYPTO_SHA512
> > +	select CRYPTO_HMAC
> > +	help
> > +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> > +	  hash driver.
> > +	  Supports multiple message digest standards, including
> > +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> > +
> > +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> > +	bool "Enable HACE hash debug messages"
> > +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> > +	help
> > +	  Print HACE hash debugging messages if you use this option
> > +	  to ask for those messages.
> > +	  Avoid enabling this option for production build to
> > +	  minimize driver timing.
> > diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
> > new file mode 100644
> > index 000000000000..8bc8d4fed5a9
> > --- /dev/null
> > +++ b/drivers/crypto/aspeed/Makefile
> > @@ -0,0 +1,6 @@
> > +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> > +aspeed_crypto-objs := aspeed-hace.o \
> > +		      $(hace-hash-y)
> > +
> > +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
> > +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
> > diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
> > new file mode 100644
> > index 000000000000..63a8ad694996
> > --- /dev/null
> > +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> > @@ -0,0 +1,1389 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +/*
> > + * Copyright (c) 2021 Aspeed Technology Inc.
> > + */
> > +
> > +#include "aspeed-hace.h"
> > +
> > +#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> > +#define AHASH_DBG(h, fmt, ...)	\
> > +	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> > +#else
> > +#define AHASH_DBG(h, fmt, ...)	\
> > +	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
> > +#endif
> > +
> > +/* Initialization Vectors for SHA-family */
> > +static const __be32 sha1_iv[8] = {
> > +	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
> > +	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
> > +	cpu_to_be32(SHA1_H4), 0, 0, 0
> > +};
> > +
> > +static const __be32 sha224_iv[8] = {
> > +	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
> > +	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
> > +	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
> > +	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
> > +};
> > +
> > +static const __be32 sha256_iv[8] = {
> > +	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
> > +	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
> > +	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
> > +	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
> > +};
> > +
> > +static const __be64 sha384_iv[8] = {
> > +	cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
> > +	cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
> > +	cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
> > +	cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
> > +};
> > +
> > +static const __be64 sha512_iv[8] = {
> > +	cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
> > +	cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
> > +	cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
> > +	cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
> > +};
> > +
> > +static const __be32 sha512_224_iv[16] = {
> > +	cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
> > +	cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
> > +	cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
> > +	cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
> > +	cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
> > +	cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
> > +	cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
> > +	cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
> > +};
> > +
> > +static const __be32 sha512_256_iv[16] = {
> > +	cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
> > +	cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
> > +	cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
> > +	cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
> > +	cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
> > +	cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
> > +	cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
> > +	cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
> > +};
> > +
> > +/* The purpose of this padding is to ensure that the padded message is a
> > + * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
> > + * The bit "1" is appended at the end of the message followed by
> > + * "padlen-1" zero bits. Then a 64 bits block (SHA1/SHA224/SHA256) or
> > + * 128 bits block (SHA384/SHA512) equals to the message length in bits
> > + * is appended.
> > + *
> > + * For SHA1/SHA224/SHA256, padlen is calculated as followed:
> > + *  - if message length < 56 bytes then padlen = 56 - message length
> > + *  - else padlen = 64 + 56 - message length
> > + *
> > + * For SHA384/SHA512, padlen is calculated as followed:
> > + *  - if message length < 112 bytes then padlen = 112 - message length
> > + *  - else padlen = 128 + 112 - message length
> > + */
> > +static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
> > +				      struct aspeed_sham_reqctx *rctx)
> > +{
> > +	unsigned int index, padlen;
> > +	__be64 bits[2];
> > +
> > +	AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
> > +
> > +	switch (rctx->flags & SHA_FLAGS_MASK) {
> > +	case SHA_FLAGS_SHA1:
> > +	case SHA_FLAGS_SHA224:
> > +	case SHA_FLAGS_SHA256:
> > +		bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
> > +		index = rctx->bufcnt & 0x3f;
> > +		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
> > +		*(rctx->buffer + rctx->bufcnt) = 0x80;
> > +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
> > +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
> > +		rctx->bufcnt += padlen + 8;
> > +		break;
> > +	default:
> > +		bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
> > +		bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
> > +				      rctx->digcnt[0] >> 61);
> > +		index = rctx->bufcnt & 0x7f;
> > +		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
> > +		*(rctx->buffer + rctx->bufcnt) = 0x80;
> > +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
> > +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
> > +		rctx->bufcnt += padlen + 16;
> > +		break;
> > +	}
> > +}
> > +
> > +/*
> > + * Prepare DMA buffer before hardware engine
> > + * processing.
> > + */
> > +static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
> > +{
> > +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> > +	struct ahash_request *req = hash_engine->req;
> > +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> > +	int length, remain;
> > +
> > +	length = rctx->total + rctx->bufcnt;
> > +	remain = length % rctx->block_size;
> > +
> > +	AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
> > +
> > +	if (rctx->bufcnt)
> > +		memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
> > +
> > +	if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
> > +		scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
> > +					 rctx->bufcnt, rctx->src_sg,
> > +					 rctx->offset, rctx->total - remain, 0);
> > +		rctx->offset += rctx->total - remain;
> > +
> > +	} else {
> > +		dev_warn(hace_dev->dev, "Hash data length is too large\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
> > +				 rctx->offset, remain, 0);
> > +
> > +	rctx->bufcnt = remain;
> > +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> > +					       SHA512_DIGEST_SIZE,
> > +					       DMA_BIDIRECTIONAL);
> > +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> > +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	hash_engine->src_length = length - remain;
> > +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> > +	hash_engine->digest_dma = rctx->digest_dma_addr;
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Prepare DMA buffer as SG list buffer before
> > + * hardware engine processing.
> > + */
> > +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
> > +{
> > +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> > +	struct ahash_request *req = hash_engine->req;
> > +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> > +	struct aspeed_sg_list *src_list;
> > +	struct scatterlist *s;
> > +	int length, remain, sg_len, i;
> > +	int rc = 0;
> > +
> > +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> > +	length = rctx->total + rctx->bufcnt - remain;
> > +
> > +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> > +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> > +		  "length", length, "remain", remain);
> > +
> > +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> > +			    DMA_TO_DEVICE);
> > +	if (!sg_len) {
> > +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> > +		rc = -ENOMEM;
> > +		goto end;
> > +	}
> > +
> > +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> > +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> > +					       SHA512_DIGEST_SIZE,
> > +					       DMA_BIDIRECTIONAL);
> > +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> > +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> > +		rc = -ENOMEM;
> > +		goto free_src_sg;
> > +	}
> > +
> > +	if (rctx->bufcnt != 0) {
> > +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> > +						       rctx->buffer,
> > +						       rctx->block_size * 2,
> > +						       DMA_TO_DEVICE);
> > +		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> > +			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> > +			rc = -ENOMEM;
> > +			goto free_rctx_digest;
> > +		}
> > +
> > +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> > +		src_list[0].len = rctx->bufcnt;
> > +		length -= src_list[0].len;
> > +
> > +		/* Last sg list */
> > +		if (length == 0)
> > +			src_list[0].len |= HASH_SG_LAST_LIST;
> > +
> > +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> > +		src_list[0].len = cpu_to_le32(src_list[0].len);
> > +		src_list++;
> > +	}
> > +
> > +	if (length != 0) {
> > +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> > +			src_list[i].phy_addr = sg_dma_address(s);
> > +
> > +			if (length > sg_dma_len(s)) {
> > +				src_list[i].len = sg_dma_len(s);
> > +				length -= sg_dma_len(s);
> > +
> > +			} else {
> > +				/* Last sg list */
> > +				src_list[i].len = length;
> > +				src_list[i].len |= HASH_SG_LAST_LIST;
> > +				length = 0;
> > +			}
> > +
> > +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> > +			src_list[i].len = cpu_to_le32(src_list[i].len);
> > +		}
> > +	}
> > +
> > +	if (length != 0) {
> > +		rc = -EINVAL;
> > +		goto free_rctx_buffer;
> > +	}
> > +
> > +	rctx->offset = rctx->total - remain;
> > +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> > +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> > +	hash_engine->digest_dma = rctx->digest_dma_addr;
> > +
> > +	goto end;
> Exiting via "goto xx" on the normal (success) path is not recommended, since
> it takes two jumps; returning 0 directly is more efficient.
> This pattern appears many times throughout the driver; please consider
> reworking it.

If we don't exit via "goto xx", how should the related resources be released safely?
Is there a proper way to do this?
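
For illustration, here is a minimal sketch of the pattern in question; it is
not taken from this driver, and the demo_* names and buffer sizes are
placeholders. Several DMA mappings are set up in sequence, and the goto
ladder keeps a single unwind path; the review comment only concerns the
success path, which can return 0 directly instead of jumping to an "end"
label:

	#include <linux/dma-mapping.h>

	struct demo_ctx {			/* placeholder, not a driver struct */
		u8 digest[64];
		u8 buf[256];
		dma_addr_t digest_dma;
		dma_addr_t buf_dma;
	};

	static int demo_prepare(struct device *dev, struct demo_ctx *c)
	{
		int rc;

		c->digest_dma = dma_map_single(dev, c->digest, sizeof(c->digest),
					       DMA_BIDIRECTIONAL);
		if (dma_mapping_error(dev, c->digest_dma))
			return -ENOMEM;

		c->buf_dma = dma_map_single(dev, c->buf, sizeof(c->buf),
					    DMA_TO_DEVICE);
		if (dma_mapping_error(dev, c->buf_dma)) {
			rc = -ENOMEM;
			goto unmap_digest;	/* unwind only what was mapped */
		}

		return 0;			/* success path returns directly */

	unmap_digest:
		dma_unmap_single(dev, c->digest_dma, sizeof(c->digest),
				 DMA_BIDIRECTIONAL);
		return rc;
	}

Keeping the error labels while returning directly on success avoids the extra
jump without duplicating the unmap calls at every failure point.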

> > +
> > +free_rctx_buffer:
> > +	if (rctx->bufcnt != 0)
> > +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> > +				 rctx->block_size * 2, DMA_TO_DEVICE);
> > +free_rctx_digest:
> > +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> > +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> > +free_src_sg:
> > +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> > +		     DMA_TO_DEVICE);
> > +end:
> > +	return rc;
> > +}
> > +
> > +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> > +{
> > +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> > +	struct ahash_request *req = hash_engine->req;
> > +
> > +	AHASH_DBG(hace_dev, "\n");
> > +
> > +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> > +
> > +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Copy digest to the corresponding request result.
> > + * This function will be called at final() stage.
> > + */
> > +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> > +{
> > +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> > +	struct ahash_request *req = hash_engine->req;
> > +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> > +
> > +	AHASH_DBG(hace_dev, "\n");
> > +
> > +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> > +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> > +
> > +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> > +			 rctx->block_size * 2, DMA_TO_DEVICE);
> > +
> > +	memcpy(req->result, rctx->digest, rctx->digsize);
> > +
> > +	return aspeed_ahash_complete(hace_dev);
> > +}
> > +
> > +/*
> > + * Trigger hardware engines to do the math.
> > + */
> > +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
> > +				     aspeed_hace_fn_t resume)
> > +{
> > +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> > +	struct ahash_request *req = hash_engine->req;
> > +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> > +
> > +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
> > +		  hash_engine->src_dma, hash_engine->digest_dma,
> > +		  hash_engine->src_length);
> > +
> > +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> > +	hash_engine->resume = resume;
> > +
> > +	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
> > +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> > +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> > +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> > +		       ASPEED_HACE_HASH_KEY_BUFF);
> > +	ast_hace_write(hace_dev, hash_engine->src_length,
> > +		       ASPEED_HACE_HASH_DATA_LEN);
> > +
> > +	/* Memory barrier to ensure all data setup before engine starts */
> > +	mb();
> > +
> > +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
> Completing one hardware request requires five register writes.
> In a multi-concurrency scenario, how is the ordering of these commands
> guaranteed? (If two processes submit hardware tasks at the same time, how
> does the hardware recognize which task the current command belongs to?)

The Linux crypto engine guarantees that only one request at a time is dequeued from its queue for processing.
There is a locking mechanism inside the crypto engine that prevents the scenario you mentioned,
so only one aspeed_hace_ahash_trigger() hardware operation runs at a time.
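
As a rough sketch of that flow, using the generic crypto engine API (the
demo_* names below are placeholders, not this driver's actual symbols):

	#include <crypto/engine.h>
	#include <crypto/internal/hash.h>

	struct demo_dev {				/* placeholder device state */
		struct crypto_engine *engine;
		struct ahash_request *req;
	};

	/* update()/final() entry points only queue the request */
	static int demo_queue(struct demo_dev *d, struct ahash_request *req)
	{
		return crypto_transfer_hash_request_to_engine(d->engine, req);
	}

	/*
	 * The engine's worker dequeues one request at a time and calls
	 * do_one_request(); the next request is not dispatched until this
	 * one is finalized, so the register writes of one trigger never
	 * interleave with another request's writes.
	 */
	static int demo_do_one_request(struct crypto_engine *engine, void *areq)
	{
		struct ahash_request *req = ahash_request_cast(areq);
		struct demo_dev *d = demo_from_engine(engine);	/* placeholder */

		d->req = req;
		return demo_trigger_hw(d);	/* placeholder: the 5 register writes */
	}

	/* interrupt / tasklet completion path */
	static void demo_done(struct demo_dev *d)
	{
		crypto_finalize_hash_request(d->engine, d->req, 0);
	}

The serialization therefore comes from the engine's single dequeue point, not
from anything the hardware has to do to tell two tasks apart.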

[...]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-08-08  9:30       ` Neal Liu
@ 2022-08-08 11:49         ` liulongfang
  -1 siblings, 0 replies; 32+ messages in thread
From: liulongfang @ 2022-08-08 11:49 UTC (permalink / raw)
  To: Neal Liu, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

On 2022/8/8 17:30, Neal Liu wrote:
>> -----Original Message-----
>> From: liulongfang <liulongfang@huawei.com>
>> Sent: Monday, August 8, 2022 10:53 AM
>> To: Neal Liu <neal_liu@aspeedtech.com>; Corentin Labbe
>> <clabbe.montjoie@gmail.com>; Christophe JAILLET
>> <christophe.jaillet@wanadoo.fr>; Randy Dunlap <rdunlap@infradead.org>;
>> Herbert Xu <herbert@gondor.apana.org.au>; David S . Miller
>> <davem@davemloft.net>; Rob Herring <robh+dt@kernel.org>; Krzysztof
>> Kozlowski <krzysztof.kozlowski+dt@linaro.org>; Joel Stanley <joel@jms.id.au>;
>> Andrew Jeffery <andrew@aj.id.au>; Dhananjay Phadke
>> <dhphadke@microsoft.com>; Johnny Huang
>> <johnny_huang@aspeedtech.com>
>> Cc: linux-aspeed@lists.ozlabs.org; linux-crypto@vger.kernel.org;
>> devicetree@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>> linux-kernel@vger.kernel.org; BMC-SW <BMC-SW@aspeedtech.com>
>> Subject: Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
>>
>>
>> On 2022/7/26 19:34, Neal Liu wrote:
>>> Hash and Crypto Engine (HACE) is designed to accelerate the
>>> throughput of hash data digest, encryption, and decryption.
>>>
>>> Basically, HACE can be divided into two independently engines
>>> - Hash Engine and Crypto Engine. This patch aims to add HACE
>>> hash engine driver for hash accelerator.
>>>
>>> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
>>> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
>>> ---
>>>  MAINTAINERS                              |    7 +
>>>  drivers/crypto/Kconfig                   |    1 +
>>>  drivers/crypto/Makefile                  |    1 +
>>>  drivers/crypto/aspeed/Kconfig            |   32 +
>>>  drivers/crypto/aspeed/Makefile           |    6 +
>>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
>>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
>>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
>>>  8 files changed, 1835 insertions(+)
>>>  create mode 100644 drivers/crypto/aspeed/Kconfig
>>>  create mode 100644 drivers/crypto/aspeed/Makefile
>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index f55aea311af5..23a0215b7e42 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -3140,6 +3140,13 @@ S:	Maintained
>>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
>>>  F:	drivers/media/platform/aspeed/
>>>
>>> +ASPEED CRYPTO DRIVER
>>> +M:	Neal Liu <neal_liu@aspeedtech.com>
>>> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
>>> +S:	Maintained
>>> +F:	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
>>> +F:	drivers/crypto/aspeed/
>>> +
>>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>>>  M:	Corentin Chary <corentin.chary@gmail.com>
>>>  L:	acpi4asus-user@lists.sourceforge.net
>>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
>>> index ee99c02c84e8..b9f5ee126881 100644
>>> --- a/drivers/crypto/Kconfig
>>> +++ b/drivers/crypto/Kconfig
>>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
>>>  	  acceleration for cryptographic algorithms on these devices.
>>>
>>>  source "drivers/crypto/keembay/Kconfig"
>>> +source "drivers/crypto/aspeed/Kconfig"
>>>
>>>  endif # CRYPTO_HW
>>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
>>> index f81703a86b98..116de173a66c 100644
>>> --- a/drivers/crypto/Makefile
>>> +++ b/drivers/crypto/Makefile
>>> @@ -1,5 +1,6 @@
>>>  # SPDX-License-Identifier: GPL-2.0
>>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
>>> diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
>>> new file mode 100644
>>> index 000000000000..059e627efef8
>>> --- /dev/null
>>> +++ b/drivers/crypto/aspeed/Kconfig
>>> @@ -0,0 +1,32 @@
>>> +config CRYPTO_DEV_ASPEED
>>> +	tristate "Support for Aspeed cryptographic engine driver"
>>> +	depends on ARCH_ASPEED
>>> +	help
>>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
>>> +	  throughput of hash data digest, encryption and decryption.
>>> +
>>> +	  Select y here to have support for the cryptographic driver
>>> +	  available on Aspeed SoC.
>>> +
>>> +config CRYPTO_DEV_ASPEED_HACE_HASH
>>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
>>> +	depends on CRYPTO_DEV_ASPEED
>>> +	select CRYPTO_ENGINE
>>> +	select CRYPTO_SHA1
>>> +	select CRYPTO_SHA256
>>> +	select CRYPTO_SHA512
>>> +	select CRYPTO_HMAC
>>> +	help
>>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
>>> +	  hash driver.
>>> +	  Supports multiple message digest standards, including
>>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
>>> +
>>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
>>> +	bool "Enable HACE hash debug messages"
>>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
>>> +	help
>>> +	  Print HACE hash debugging messages if you use this option
>>> +	  to ask for those messages.
>>> +	  Avoid enabling this option for production build to
>>> +	  minimize driver timing.
>>> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
>>> new file mode 100644
>>> index 000000000000..8bc8d4fed5a9
>>> --- /dev/null
>>> +++ b/drivers/crypto/aspeed/Makefile
>>> @@ -0,0 +1,6 @@
>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
>>> +aspeed_crypto-objs := aspeed-hace.o \
>>> +		      $(hace-hash-y)
>>> +
>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
>>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
>>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
>>> new file mode 100644
>>> index 000000000000..63a8ad694996
>>> --- /dev/null
>>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
>>> @@ -0,0 +1,1389 @@
>>> +// SPDX-License-Identifier: GPL-2.0+
>>> +/*
>>> + * Copyright (c) 2021 Aspeed Technology Inc.
>>> + */
>>> +
>>> +#include "aspeed-hace.h"
>>> +
>>> +#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
>>> +#define AHASH_DBG(h, fmt, ...)	\
>>> +	dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
>>> +#else
>>> +#define AHASH_DBG(h, fmt, ...)	\
>>> +	dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
>>> +#endif
>>> +
>>> +/* Initialization Vectors for SHA-family */
>>> +static const __be32 sha1_iv[8] = {
>>> +	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
>>> +	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
>>> +	cpu_to_be32(SHA1_H4), 0, 0, 0
>>> +};
>>> +
>>> +static const __be32 sha224_iv[8] = {
>>> +	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
>>> +	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
>>> +	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
>>> +	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
>>> +};
>>> +
>>> +static const __be32 sha256_iv[8] = {
>>> +	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
>>> +	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
>>> +	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
>>> +	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
>>> +};
>>> +
>>> +static const __be64 sha384_iv[8] = {
>>> +	cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
>>> +	cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
>>> +	cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
>>> +	cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
>>> +};
>>> +
>>> +static const __be64 sha512_iv[8] = {
>>> +	cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
>>> +	cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
>>> +	cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
>>> +	cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
>>> +};
>>> +
>>> +static const __be32 sha512_224_iv[16] = {
>>> +	cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
>>> +	cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
>>> +	cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
>>> +	cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
>>> +	cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
>>> +	cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
>>> +	cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
>>> +	cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
>>> +};
>>> +
>>> +static const __be32 sha512_256_iv[16] = {
>>> +	cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
>>> +	cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
>>> +	cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
>>> +	cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
>>> +	cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
>>> +	cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
>>> +	cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
>>> +	cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
>>> +};
>>> +
>>> +/* The purpose of this padding is to ensure that the padded message is a
>>> + * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
>>> + * The bit "1" is appended at the end of the message followed by
>>> + * "padlen-1" zero bits. Then a 64 bits block (SHA1/SHA224/SHA256) or
>>> + * 128 bits block (SHA384/SHA512) equals to the message length in bits
>>> + * is appended.
>>> + *
>>> + * For SHA1/SHA224/SHA256, padlen is calculated as followed:
>>> + *  - if message length < 56 bytes then padlen = 56 - message length
>>> + *  - else padlen = 64 + 56 - message length
>>> + *
>>> + * For SHA384/SHA512, padlen is calculated as followed:
>>> + *  - if message length < 112 bytes then padlen = 112 - message length
>>> + *  - else padlen = 128 + 112 - message length
>>> + */
>>> +static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
>>> +				      struct aspeed_sham_reqctx *rctx)
>>> +{
>>> +	unsigned int index, padlen;
>>> +	__be64 bits[2];
>>> +
>>> +	AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
>>> +
>>> +	switch (rctx->flags & SHA_FLAGS_MASK) {
>>> +	case SHA_FLAGS_SHA1:
>>> +	case SHA_FLAGS_SHA224:
>>> +	case SHA_FLAGS_SHA256:
>>> +		bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
>>> +		index = rctx->bufcnt & 0x3f;
>>> +		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
>>> +		*(rctx->buffer + rctx->bufcnt) = 0x80;
>>> +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
>>> +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
>>> +		rctx->bufcnt += padlen + 8;
>>> +		break;
>>> +	default:
>>> +		bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
>>> +		bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
>>> +				      rctx->digcnt[0] >> 61);
>>> +		index = rctx->bufcnt & 0x7f;
>>> +		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
>>> +		*(rctx->buffer + rctx->bufcnt) = 0x80;
>>> +		memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
>>> +		memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
>>> +		rctx->bufcnt += padlen + 16;
>>> +		break;
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Prepare DMA buffer before hardware engine
>>> + * processing.
>>> + */
>>> +static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> +	struct ahash_request *req = hash_engine->req;
>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> +	int length, remain;
>>> +
>>> +	length = rctx->total + rctx->bufcnt;
>>> +	remain = length % rctx->block_size;
>>> +
>>> +	AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
>>> +
>>> +	if (rctx->bufcnt)
>>> +		memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
>>> +
>>> +	if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
>>> +		scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
>>> +					 rctx->bufcnt, rctx->src_sg,
>>> +					 rctx->offset, rctx->total - remain, 0);
>>> +		rctx->offset += rctx->total - remain;
>>> +
>>> +	} else {
>>> +		dev_warn(hace_dev->dev, "Hash data length is too large\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
>>> +				 rctx->offset, remain, 0);
>>> +
>>> +	rctx->bufcnt = remain;
>>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
>>> +					       SHA512_DIGEST_SIZE,
>>> +					       DMA_BIDIRECTIONAL);
>>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
>>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	hash_engine->src_length = length - remain;
>>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
>>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/*
>>> + * Prepare DMA buffer as SG list buffer before
>>> + * hardware engine processing.
>>> + */
>>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev
>> *hace_dev)
>>> +{
>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> +	struct ahash_request *req = hash_engine->req;
>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> +	struct aspeed_sg_list *src_list;
>>> +	struct scatterlist *s;
>>> +	int length, remain, sg_len, i;
>>> +	int rc = 0;
>>> +
>>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
>>> +	length = rctx->total + rctx->bufcnt - remain;
>>> +
>>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
>>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
>>> +		  "length", length, "remain", remain);
>>> +
>>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
>>> +			    DMA_TO_DEVICE);
>>> +	if (!sg_len) {
>>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
>>> +		rc = -ENOMEM;
>>> +		goto end;
>>> +	}
>>> +
>>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
>>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
>>> +					       SHA512_DIGEST_SIZE,
>>> +					       DMA_BIDIRECTIONAL);
>>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
>>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
>>> +		rc = -ENOMEM;
>>> +		goto free_src_sg;
>>> +	}
>>> +
>>> +	if (rctx->bufcnt != 0) {
>>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
>>> +						       rctx->buffer,
>>> +						       rctx->block_size * 2,
>>> +						       DMA_TO_DEVICE);
>>> +		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
>>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
>>> +			rc = -ENOMEM;
>>> +			goto free_rctx_digest;
>>> +		}
>>> +
>>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
>>> +		src_list[0].len = rctx->bufcnt;
>>> +		length -= src_list[0].len;
>>> +
>>> +		/* Last sg list */
>>> +		if (length == 0)
>>> +			src_list[0].len |= HASH_SG_LAST_LIST;
>>> +
>>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
>>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
>>> +		src_list++;
>>> +	}
>>> +
>>> +	if (length != 0) {
>>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
>>> +			src_list[i].phy_addr = sg_dma_address(s);
>>> +
>>> +			if (length > sg_dma_len(s)) {
>>> +				src_list[i].len = sg_dma_len(s);
>>> +				length -= sg_dma_len(s);
>>> +
>>> +			} else {
>>> +				/* Last sg list */
>>> +				src_list[i].len = length;
>>> +				src_list[i].len |= HASH_SG_LAST_LIST;
>>> +				length = 0;
>>> +			}
>>> +
>>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
>>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
>>> +		}
>>> +	}
>>> +
>>> +	if (length != 0) {
>>> +		rc = -EINVAL;
>>> +		goto free_rctx_buffer;
>>> +	}
>>> +
>>> +	rctx->offset = rctx->total - remain;
>>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
>>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
>>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
>>> +
>>> +	goto end;
>> Exiting via "goto xx" is not recommended in normal code logic (this requires
>> two jumps),
>> exiting via "return 0" is more efficient.
>> This code method has many times in your entire driver, it is recommended to
>> modify it.
> 
> If not exiting via "goto xx", how can the related resources be released safely?
> Is there a proper way to do this?
Maybe I didn't describe it clearly enough: by "the normal code path" I mean the
rc = 0 case.
In that scenario (rc = 0), the "goto xx" is no longer required;
it can simply be replaced with "return 0".
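
To illustrate the suggested pattern, here is a small self-contained sketch in
plain C (the names are made up for illustration; this is not the driver code):
the success path ends with a direct "return 0", and only the error paths unwind
through the goto labels.

#include <stdlib.h>

struct fake_ctx {
	void *digest_map;	/* stands in for the digest DMA mapping */
	void *buffer_map;	/* stands in for the buffer DMA mapping */
};

static int fake_prepare(struct fake_ctx *ctx)
{
	int rc;

	ctx->digest_map = malloc(64);
	if (!ctx->digest_map)
		return -1;		/* nothing to unwind yet */

	ctx->buffer_map = malloc(128);
	if (!ctx->buffer_map) {
		rc = -1;
		goto free_digest;	/* error: release what was set up */
	}

	/* ... build descriptors; both mappings stay live on success ... */

	return 0;			/* success: no "goto end" needed */

free_digest:
	free(ctx->digest_map);
	return rc;
}

int main(void)
{
	struct fake_ctx ctx;
	int rc = fake_prepare(&ctx);

	if (!rc) {	/* the caller releases the resources later */
		free(ctx.buffer_map);
		free(ctx.digest_map);
	}
	return rc ? 1 : 0;
}

The unwind labels still release resources in the reverse order of their setup,
exactly as the driver's error path already does.
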
> 
>>> +
>>> +free_rctx_buffer:
>>> +	if (rctx->bufcnt != 0)
>>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
>>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
>>> +free_rctx_digest:
>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
>>> +free_src_sg:
>>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
>>> +		     DMA_TO_DEVICE);
>>> +end:
>>> +	return rc;
>>> +}
>>> +
>>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> +	struct ahash_request *req = hash_engine->req;
>>> +
>>> +	AHASH_DBG(hace_dev, "\n");
>>> +
>>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
>>> +
>>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/*
>>> + * Copy digest to the corresponding request result.
>>> + * This function will be called at final() stage.
>>> + */
>>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> +	struct ahash_request *req = hash_engine->req;
>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> +
>>> +	AHASH_DBG(hace_dev, "\n");
>>> +
>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
>>> +
>>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
>>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
>>> +
>>> +	memcpy(req->result, rctx->digest, rctx->digsize);
>>> +
>>> +	return aspeed_ahash_complete(hace_dev);
>>> +}
>>> +
>>> +/*
>>> + * Trigger hardware engines to do the math.
>>> + */
>>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
>>> +				     aspeed_hace_fn_t resume)
>>> +{
>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> +	struct ahash_request *req = hash_engine->req;
>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> +
>>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x,
>> length:0x%x\n",
>>> +		  hash_engine->src_dma, hash_engine->digest_dma,
>>> +		  hash_engine->src_length);
>>> +
>>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
>>> +	hash_engine->resume = resume;
>>> +
>>> +	ast_hace_write(hace_dev, hash_engine->src_dma,
>> ASPEED_HACE_HASH_SRC);
>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
>>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
>>> +		       ASPEED_HACE_HASH_KEY_BUFF);
>>> +	ast_hace_write(hace_dev, hash_engine->src_length,
>>> +		       ASPEED_HACE_HASH_DATA_LEN);
>>> +
>>> +	/* Memory barrier to ensure all data setup before engine starts */
>>> +	mb();
>>> +
>>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
>> Submitting one hardware request requires five hardware commands to complete.
>> In a multi-concurrency scenario, how do you ensure the ordering of the commands?
>> (If two processes submit hardware tasks at the same time, how do you ensure
>> that the hardware recognizes which task the current command belongs to?)
> 
> The Linux crypto engine guarantees that only one request at a time is dequeued from the engine queue for processing.
> There is also a locking mechanism inside the Linux crypto engine that prevents the scenario you mentioned.
> So only one aspeed_hace_ahash_trigger() hardware operation goes through at a time.
> 
> [...]
> .
> 
You may not have understood what I mean. The command flow in the normal scenario is:
request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
In a multi-process concurrent scenario, multiple crypto engines can be enabled,
and each crypto engine sends a request. If several requests enter
aspeed_hace_ahash_trigger() at the same time, the command flows will be
interleaved like this:
request_A, request_B: Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5

In this interleaved command flow, how does your hardware identify whether a given
command belongs to request_A or request_B?
Thanks.
Longfang.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-08-08 11:49         ` liulongfang
@ 2022-08-09  7:39           ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-08-09  7:39 UTC (permalink / raw)
  To: liulongfang, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> >> On 2022/7/26 19:34, Neal Liu wrote:
> >>> Hash and Crypto Engine (HACE) is designed to accelerate the
> >>> throughput of hash data digest, encryption, and decryption.
> >>>
> >>> Basically, HACE can be divided into two independent engines -
> >>> the Hash Engine and the Crypto Engine. This patch aims to add the
> >>> HACE hash engine driver for the hash accelerator.
> >>>
> >>> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> >>> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> >>> ---
> >>>  MAINTAINERS                              |    7 +
> >>>  drivers/crypto/Kconfig                   |    1 +
> >>>  drivers/crypto/Makefile                  |    1 +
> >>>  drivers/crypto/aspeed/Kconfig            |   32 +
> >>>  drivers/crypto/aspeed/Makefile           |    6 +
> >>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389
> >> ++++++++++++++++++++++
> >>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
> >>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
> >>>  8 files changed, 1835 insertions(+)  create mode 100644
> >>> drivers/crypto/aspeed/Kconfig  create mode 100644
> >>> drivers/crypto/aspeed/Makefile  create mode 100644
> >>> drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> >>>
> >>> diff --git a/MAINTAINERS b/MAINTAINERS index
> >>> f55aea311af5..23a0215b7e42 100644
> >>> --- a/MAINTAINERS
> >>> +++ b/MAINTAINERS
> >>> @@ -3140,6 +3140,13 @@ S:	Maintained
> >>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
> >>>  F:	drivers/media/platform/aspeed/
> >>>
> >>> +ASPEED CRYPTO DRIVER
> >>> +M:	Neal Liu <neal_liu@aspeedtech.com>
> >>> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> >>> +S:	Maintained
> >>> +F:
> >> 	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> >>> +F:	drivers/crypto/aspeed/
> >>> +
> >>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
> >>>  M:	Corentin Chary <corentin.chary@gmail.com>
> >>>  L:	acpi4asus-user@lists.sourceforge.net
> >>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index
> >>> ee99c02c84e8..b9f5ee126881 100644
> >>> --- a/drivers/crypto/Kconfig
> >>> +++ b/drivers/crypto/Kconfig
> >>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
> >>>  	  acceleration for cryptographic algorithms on these devices.
> >>>
> >>>  source "drivers/crypto/keembay/Kconfig"
> >>> +source "drivers/crypto/aspeed/Kconfig"
> >>>
> >>>  endif # CRYPTO_HW
> >>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index
> >>> f81703a86b98..116de173a66c 100644
> >>> --- a/drivers/crypto/Makefile
> >>> +++ b/drivers/crypto/Makefile
> >>> @@ -1,5 +1,6 @@
> >>>  # SPDX-License-Identifier: GPL-2.0
> >>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o diff --git
> >>> a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig new
> >>> file mode 100644 index 000000000000..059e627efef8
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/Kconfig
> >>> @@ -0,0 +1,32 @@
> >>> +config CRYPTO_DEV_ASPEED
> >>> +	tristate "Support for Aspeed cryptographic engine driver"
> >>> +	depends on ARCH_ASPEED
> >>> +	help
> >>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> >>> +	  throughput of hash data digest, encryption and decryption.
> >>> +
> >>> +	  Select y here to have support for the cryptographic driver
> >>> +	  available on Aspeed SoC.
> >>> +
> >>> +config CRYPTO_DEV_ASPEED_HACE_HASH
> >>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> >>> +	depends on CRYPTO_DEV_ASPEED
> >>> +	select CRYPTO_ENGINE
> >>> +	select CRYPTO_SHA1
> >>> +	select CRYPTO_SHA256
> >>> +	select CRYPTO_SHA512
> >>> +	select CRYPTO_HMAC
> >>> +	help
> >>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> >>> +	  hash driver.
> >>> +	  Supports multiple message digest standards, including
> >>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> >>> +
> >>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> >>> +	bool "Enable HACE hash debug messages"
> >>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> >>> +	help
> >>> +	  Print HACE hash debugging messages if you use this option
> >>> +	  to ask for those messages.
> >>> +	  Avoid enabling this option for production build to
> >>> +	  minimize driver timing.
> >>> diff --git a/drivers/crypto/aspeed/Makefile
> >> b/drivers/crypto/aspeed/Makefile
> >>> new file mode 100644
> >>> index 000000000000..8bc8d4fed5a9
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/Makefile
> >>> @@ -0,0 +1,6 @@
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> >>> +aspeed_crypto-objs := aspeed-hace.o \
> >>> +		      $(hace-hash-y)
> >>> +
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) +=
> aspeed-hace-hash.o
> >>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) :=
> >> aspeed-hace-hash.o
> >>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c
> >> b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>> new file mode 100644
> >>> index 000000000000..63a8ad694996
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>> @@ -0,0 +1,1389 @@

[...]

> >>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev
> >> *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +	struct aspeed_sg_list *src_list;
> >>> +	struct scatterlist *s;
> >>> +	int length, remain, sg_len, i;
> >>> +	int rc = 0;
> >>> +
> >>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> >>> +	length = rctx->total + rctx->bufcnt - remain;
> >>> +
> >>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> >>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> >>> +		  "length", length, "remain", remain);
> >>> +
> >>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>> +			    DMA_TO_DEVICE);
> >>> +	if (!sg_len) {
> >>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> >>> +		rc = -ENOMEM;
> >>> +		goto end;
> >>> +	}
> >>> +
> >>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> >>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
> rctx->digest,
> >>> +					       SHA512_DIGEST_SIZE,
> >>> +					       DMA_BIDIRECTIONAL);
> >>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> >>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> >>> +		rc = -ENOMEM;
> >>> +		goto free_src_sg;
> >>> +	}
> >>> +
> >>> +	if (rctx->bufcnt != 0) {
> >>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> >>> +						       rctx->buffer,
> >>> +						       rctx->block_size * 2,
> >>> +						       DMA_TO_DEVICE);
> >>> +		if (dma_mapping_error(hace_dev->dev,
> rctx->buffer_dma_addr)) {
> >>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer
> error\n");
> >>> +			rc = -ENOMEM;
> >>> +			goto free_rctx_digest;
> >>> +		}
> >>> +
> >>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> >>> +		src_list[0].len = rctx->bufcnt;
> >>> +		length -= src_list[0].len;
> >>> +
> >>> +		/* Last sg list */
> >>> +		if (length == 0)
> >>> +			src_list[0].len |= HASH_SG_LAST_LIST;
> >>> +
> >>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> >>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> >>> +		src_list++;
> >>> +	}
> >>> +
> >>> +	if (length != 0) {
> >>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> >>> +			src_list[i].phy_addr = sg_dma_address(s);
> >>> +
> >>> +			if (length > sg_dma_len(s)) {
> >>> +				src_list[i].len = sg_dma_len(s);
> >>> +				length -= sg_dma_len(s);
> >>> +
> >>> +			} else {
> >>> +				/* Last sg list */
> >>> +				src_list[i].len = length;
> >>> +				src_list[i].len |= HASH_SG_LAST_LIST;
> >>> +				length = 0;
> >>> +			}
> >>> +
> >>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> >>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> >>> +		}
> >>> +	}
> >>> +
> >>> +	if (length != 0) {
> >>> +		rc = -EINVAL;
> >>> +		goto free_rctx_buffer;
> >>> +	}
> >>> +
> >>> +	rctx->offset = rctx->total - remain;
> >>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> >>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> >>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> >>> +
> >>> +	goto end;
> >> Exiting via "goto xx" is not recommended in normal code logic (this
> >> requires two jumps), exiting via "return 0" is more efficient.
> >> This code method has many times in your entire driver, it is
> >> recommended to modify it.
> >
> > If not exiting via "goto xx", how can the related resources be released
> > safely?
> > Is there a proper way to do this?
> Maybe I didn't describe it clearly enough: by "the normal code path" I mean
> the rc = 0 case. In that scenario (rc = 0), the "goto xx" is no longer
> required; it can simply be replaced with "return 0".

Okay, I got your point. In this case the "goto end" is indeed no longer required.
I will send the next patch with this fix included.
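
For illustration only (this is not the actual follow-up patch), the tail of
aspeed_ahash_dma_prepare_sg() might then read:

	rctx->offset = rctx->total - remain;
	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
	hash_engine->digest_dma = rctx->digest_dma_addr;

	return 0;	/* success path: return directly, no "goto end" */

free_rctx_buffer:
	if (rctx->bufcnt != 0)
		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
				 rctx->block_size * 2, DMA_TO_DEVICE);
free_rctx_digest:
	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
free_src_sg:
	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
		     DMA_TO_DEVICE);

	return rc;	/* error paths fall through the unwind labels */

The early "rc = -ENOMEM; goto end;" after the dma_map_sg() failure would
likewise become a direct "return -ENOMEM;", so the "end" label disappears.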

> >
> >>> +
> >>> +free_rctx_buffer:
> >>> +	if (rctx->bufcnt != 0)
> >>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> >>> +free_rctx_digest:
> >>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>> +free_src_sg:
> >>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>> +		     DMA_TO_DEVICE);
> >>> +end:
> >>> +	return rc;
> >>> +}
> >>> +
> >>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +
> >>> +	AHASH_DBG(hace_dev, "\n");
> >>> +
> >>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> >>> +
> >>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +/*
> >>> + * Copy digest to the corresponding request result.
> >>> + * This function will be called at final() stage.
> >>> + */
> >>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +
> >>> +	AHASH_DBG(hace_dev, "\n");
> >>> +
> >>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>> +
> >>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> >>> +
> >>> +	memcpy(req->result, rctx->digest, rctx->digsize);
> >>> +
> >>> +	return aspeed_ahash_complete(hace_dev); }
> >>> +
> >>> +/*
> >>> + * Trigger hardware engines to do the math.
> >>> + */
> >>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev
> *hace_dev,
> >>> +				     aspeed_hace_fn_t resume)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +
> >>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x,
> >> length:0x%x\n",
> >>> +		  hash_engine->src_dma, hash_engine->digest_dma,
> >>> +		  hash_engine->src_length);
> >>> +
> >>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> >>> +	hash_engine->resume = resume;
> >>> +
> >>> +	ast_hace_write(hace_dev, hash_engine->src_dma,
> >> ASPEED_HACE_HASH_SRC);
> >>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> >>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>> +		       ASPEED_HACE_HASH_KEY_BUFF);
> >>> +	ast_hace_write(hace_dev, hash_engine->src_length,
> >>> +		       ASPEED_HACE_HASH_DATA_LEN);
> >>> +
> >>> +	/* Memory barrier to ensure all data setup before engine starts */
> >>> +	mb();
> >>> +
> >>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
> >> Submitting one hardware request requires five hardware commands to
> >> complete. In a multi-concurrency scenario, how do you ensure the ordering
> >> of the commands? (If two processes submit hardware tasks at the same time,
> >> how do you ensure that the hardware recognizes which task the current
> >> command belongs to?)
> >
> > The Linux crypto engine guarantees that only one request at a time is
> > dequeued from the engine queue for processing.
> > There is also a locking mechanism inside the Linux crypto engine that
> > prevents the scenario you mentioned.
> > So only one aspeed_hace_ahash_trigger() hardware operation goes through
> > at a time.
> >
> > [...]
> > .
> >
> You may not have understood what I mean. The command flow in the normal
> scenario is:
> request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
> request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
> In a multi-process concurrent scenario, multiple crypto engines can be enabled,
> and each crypto engine sends a request. If several requests enter
> aspeed_hace_ahash_trigger() at the same time, the command flows will be
> interleaved like this:
> request_A, request_B:
> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5
> 
> In this interleaved command flow, how does your hardware identify whether a
> given command belongs to request_A or request_B?
> Thanks.
> Longfang.

To my understanding, all requests are transferred into the engine queue through crypto_transfer_hash_request_to_engine().
In your example, request_A and request_B would both be enqueued into the engine queue, and one request at a time is pumped out (in FIFO order) to be handled.
crypto_pump_requests() dequeues only one request at a time and then calls prepare_request() and do_one_request() if they are registered.
aspeed_hace_ahash_trigger() is called from within do_one_request(), so no other request can come in during the whole aspeed_hace_ahash_trigger() sequence.
The interleaved command flow Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5 therefore cannot occur in any scenario.
Correct me if I'm misunderstanding. Thanks.
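
As a rough sketch of that flow, using the crypto_engine interfaces named above
roughly as they exist in the v5.19-era tree (the demo_* names are made up for
illustration; this is not the Aspeed driver code):

#include <crypto/engine.h>
#include <crypto/internal/hash.h>
#include <linux/device.h>
#include <linux/errno.h>

struct demo_dev {
	struct crypto_engine *engine;	/* one queue, one worker kthread */
	struct ahash_request *req;	/* the single request in flight */
};

struct demo_tfm_ctx {
	struct crypto_engine_ctx enginectx;	/* must be the first member */
	struct demo_dev *ddev;
};

/*
 * Runs from the engine's worker via crypto_pump_requests(); the engine does
 * not pump the next request until this one is finalized, so the register
 * writes of two requests can never be interleaved.
 */
static int demo_do_one_request(struct crypto_engine *engine, void *areq)
{
	struct ahash_request *req = ahash_request_cast(areq);
	struct demo_tfm_ctx *tctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));

	tctx->ddev->req = req;	/* remember which request owns the hardware */
	/* program source/digest addresses, length and the command word here */
	return 0;
}

/* Completion path, e.g. from the interrupt handler's bottom half. */
static void demo_request_done(struct demo_dev *ddev)
{
	/* lets the engine pump the next queued request */
	crypto_finalize_hash_request(ddev->engine, ddev->req, 0);
}

static int demo_probe(struct device *dev, struct demo_dev *ddev)
{
	ddev->engine = crypto_engine_alloc_init(dev, true);
	if (!ddev->engine)
		return -ENOMEM;

	return crypto_engine_start(ddev->engine);
}

/* Typically done in the tfm init hook. */
static void demo_init_ctx(struct demo_tfm_ctx *tctx, struct demo_dev *ddev)
{
	tctx->ddev = ddev;
	tctx->enginectx.op.do_one_request = demo_do_one_request;
	tctx->enginectx.op.prepare_request = NULL;
	tctx->enginectx.op.unprepare_request = NULL;
}

/* The ahash entry points only queue the request; they never touch hardware. */
static int demo_enqueue(struct demo_tfm_ctx *tctx, struct ahash_request *req)
{
	return crypto_transfer_hash_request_to_engine(tctx->ddev->engine, req);
}

The interleaving you describe would need two do_one_request() calls to be in
flight at the same time, which the engine's single worker rules out.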


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
@ 2022-08-09  7:39           ` Neal Liu
  0 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-08-09  7:39 UTC (permalink / raw)
  To: liulongfang, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> >> On 2022/7/26 19:34, Neal Liu wrote:
> >>> Hash and Crypto Engine (HACE) is designed to accelerate the
> >>> throughput of hash data digest, encryption, and decryption.
> >>>
> >>> Basically, HACE can be divided into two independently engines
> >>> - Hash Engine and Crypto Engine. This patch aims to add HACE hash
> >>> engine driver for hash accelerator.
> >>>
> >>> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> >>> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> >>> ---
> >>>  MAINTAINERS                              |    7 +
> >>>  drivers/crypto/Kconfig                   |    1 +
> >>>  drivers/crypto/Makefile                  |    1 +
> >>>  drivers/crypto/aspeed/Kconfig            |   32 +
> >>>  drivers/crypto/aspeed/Makefile           |    6 +
> >>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389
> >> ++++++++++++++++++++++
> >>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
> >>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
> >>>  8 files changed, 1835 insertions(+)  create mode 100644
> >>> drivers/crypto/aspeed/Kconfig  create mode 100644
> >>> drivers/crypto/aspeed/Makefile  create mode 100644
> >>> drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
> >>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> >>>
> >>> diff --git a/MAINTAINERS b/MAINTAINERS index
> >>> f55aea311af5..23a0215b7e42 100644
> >>> --- a/MAINTAINERS
> >>> +++ b/MAINTAINERS
> >>> @@ -3140,6 +3140,13 @@ S:	Maintained
> >>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
> >>>  F:	drivers/media/platform/aspeed/
> >>>
> >>> +ASPEED CRYPTO DRIVER
> >>> +M:	Neal Liu <neal_liu@aspeedtech.com>
> >>> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> >>> +S:	Maintained
> >>> +F:
> >> 	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> >>> +F:	drivers/crypto/aspeed/
> >>> +
> >>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
> >>>  M:	Corentin Chary <corentin.chary@gmail.com>
> >>>  L:	acpi4asus-user@lists.sourceforge.net
> >>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index
> >>> ee99c02c84e8..b9f5ee126881 100644
> >>> --- a/drivers/crypto/Kconfig
> >>> +++ b/drivers/crypto/Kconfig
> >>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
> >>>  	  acceleration for cryptographic algorithms on these devices.
> >>>
> >>>  source "drivers/crypto/keembay/Kconfig"
> >>> +source "drivers/crypto/aspeed/Kconfig"
> >>>
> >>>  endif # CRYPTO_HW
> >>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index
> >>> f81703a86b98..116de173a66c 100644
> >>> --- a/drivers/crypto/Makefile
> >>> +++ b/drivers/crypto/Makefile
> >>> @@ -1,5 +1,6 @@
> >>>  # SPDX-License-Identifier: GPL-2.0
> >>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
> >>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o diff --git
> >>> a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig new
> >>> file mode 100644 index 000000000000..059e627efef8
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/Kconfig
> >>> @@ -0,0 +1,32 @@
> >>> +config CRYPTO_DEV_ASPEED
> >>> +	tristate "Support for Aspeed cryptographic engine driver"
> >>> +	depends on ARCH_ASPEED
> >>> +	help
> >>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> >>> +	  throughput of hash data digest, encryption and decryption.
> >>> +
> >>> +	  Select y here to have support for the cryptographic driver
> >>> +	  available on Aspeed SoC.
> >>> +
> >>> +config CRYPTO_DEV_ASPEED_HACE_HASH
> >>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> >>> +	depends on CRYPTO_DEV_ASPEED
> >>> +	select CRYPTO_ENGINE
> >>> +	select CRYPTO_SHA1
> >>> +	select CRYPTO_SHA256
> >>> +	select CRYPTO_SHA512
> >>> +	select CRYPTO_HMAC
> >>> +	help
> >>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> >>> +	  hash driver.
> >>> +	  Supports multiple message digest standards, including
> >>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> >>> +
> >>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> >>> +	bool "Enable HACE hash debug messages"
> >>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> >>> +	help
> >>> +	  Print HACE hash debugging messages if you use this option
> >>> +	  to ask for those messages.
> >>> +	  Avoid enabling this option for production build to
> >>> +	  minimize driver timing.
> >>> diff --git a/drivers/crypto/aspeed/Makefile
> >> b/drivers/crypto/aspeed/Makefile
> >>> new file mode 100644
> >>> index 000000000000..8bc8d4fed5a9
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/Makefile
> >>> @@ -0,0 +1,6 @@
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> >>> +aspeed_crypto-objs := aspeed-hace.o \
> >>> +		      $(hace-hash-y)
> >>> +
> >>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) +=
> aspeed-hace-hash.o
> >>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) :=
> >> aspeed-hace-hash.o
> >>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c
> >> b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>> new file mode 100644
> >>> index 000000000000..63a8ad694996
> >>> --- /dev/null
> >>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>> @@ -0,0 +1,1389 @@

[...]

> >>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev
> >> *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +	struct aspeed_sg_list *src_list;
> >>> +	struct scatterlist *s;
> >>> +	int length, remain, sg_len, i;
> >>> +	int rc = 0;
> >>> +
> >>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> >>> +	length = rctx->total + rctx->bufcnt - remain;
> >>> +
> >>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> >>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> >>> +		  "length", length, "remain", remain);
> >>> +
> >>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>> +			    DMA_TO_DEVICE);
> >>> +	if (!sg_len) {
> >>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> >>> +		rc = -ENOMEM;
> >>> +		goto end;
> >>> +	}
> >>> +
> >>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> >>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
> rctx->digest,
> >>> +					       SHA512_DIGEST_SIZE,
> >>> +					       DMA_BIDIRECTIONAL);
> >>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> >>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> >>> +		rc = -ENOMEM;
> >>> +		goto free_src_sg;
> >>> +	}
> >>> +
> >>> +	if (rctx->bufcnt != 0) {
> >>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> >>> +						       rctx->buffer,
> >>> +						       rctx->block_size * 2,
> >>> +						       DMA_TO_DEVICE);
> >>> +		if (dma_mapping_error(hace_dev->dev,
> rctx->buffer_dma_addr)) {
> >>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer
> error\n");
> >>> +			rc = -ENOMEM;
> >>> +			goto free_rctx_digest;
> >>> +		}
> >>> +
> >>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> >>> +		src_list[0].len = rctx->bufcnt;
> >>> +		length -= src_list[0].len;
> >>> +
> >>> +		/* Last sg list */
> >>> +		if (length == 0)
> >>> +			src_list[0].len |= HASH_SG_LAST_LIST;
> >>> +
> >>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> >>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> >>> +		src_list++;
> >>> +	}
> >>> +
> >>> +	if (length != 0) {
> >>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> >>> +			src_list[i].phy_addr = sg_dma_address(s);
> >>> +
> >>> +			if (length > sg_dma_len(s)) {
> >>> +				src_list[i].len = sg_dma_len(s);
> >>> +				length -= sg_dma_len(s);
> >>> +
> >>> +			} else {
> >>> +				/* Last sg list */
> >>> +				src_list[i].len = length;
> >>> +				src_list[i].len |= HASH_SG_LAST_LIST;
> >>> +				length = 0;
> >>> +			}
> >>> +
> >>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> >>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> >>> +		}
> >>> +	}
> >>> +
> >>> +	if (length != 0) {
> >>> +		rc = -EINVAL;
> >>> +		goto free_rctx_buffer;
> >>> +	}
> >>> +
> >>> +	rctx->offset = rctx->total - remain;
> >>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> >>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> >>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> >>> +
> >>> +	goto end;
> >> Exiting via "goto xx" is not recommended in normal code logic (this
> >> requires two jumps), exiting via "return 0" is more efficient.
> >> This code method has many times in your entire driver, it is
> >> recommended to modify it.
> >
> > If not exiting via "goto xx", how to release related resources without any
> problem?
> > Is there any proper way to do this?
> maybe I didn't describe it clearly enough.
> "in normal code logic"  means rc=0
> In this scenario (rc=0), "goto xx" is no longer required, it can be replaced with
> "return 0"

Okay, I got your point. In this case, "goto end" is no longer required of course.
I would send next patch with this fixed included.

> >
> >>> +
> >>> +free_rctx_buffer:
> >>> +	if (rctx->bufcnt != 0)
> >>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> >>> +free_rctx_digest:
> >>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>> +free_src_sg:
> >>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>> +		     DMA_TO_DEVICE);
> >>> +end:
> >>> +	return rc;
> >>> +}
> >>> +
> >>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +
> >>> +	AHASH_DBG(hace_dev, "\n");
> >>> +
> >>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> >>> +
> >>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +/*
> >>> + * Copy digest to the corresponding request result.
> >>> + * This function will be called at final() stage.
> >>> + */
> >>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +
> >>> +	AHASH_DBG(hace_dev, "\n");
> >>> +
> >>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>> +
> >>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> >>> +
> >>> +	memcpy(req->result, rctx->digest, rctx->digsize);
> >>> +
> >>> +	return aspeed_ahash_complete(hace_dev); }
> >>> +
> >>> +/*
> >>> + * Trigger hardware engines to do the math.
> >>> + */
> >>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev
> *hace_dev,
> >>> +				     aspeed_hace_fn_t resume)
> >>> +{
> >>> +	struct aspeed_engine_hash *hash_engine =
> &hace_dev->hash_engine;
> >>> +	struct ahash_request *req = hash_engine->req;
> >>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>> +
> >>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x,
> >> length:0x%x\n",
> >>> +		  hash_engine->src_dma, hash_engine->digest_dma,
> >>> +		  hash_engine->src_length);
> >>> +
> >>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> >>> +	hash_engine->resume = resume;
> >>> +
> >>> +	ast_hace_write(hace_dev, hash_engine->src_dma,
> >> ASPEED_HACE_HASH_SRC);
> >>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> >>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>> +		       ASPEED_HACE_HASH_KEY_BUFF);
> >>> +	ast_hace_write(hace_dev, hash_engine->src_length,
> >>> +		       ASPEED_HACE_HASH_DATA_LEN);
> >>> +
> >>> +	/* Memory barrier to ensure all data setup before engine starts */
> >>> +	mb();
> >>> +
> >>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
> >> A hardware service sending requires 5 hardware commands to complete.
> >> In a multi-concurrency scenario, how to ensure the order of commands?
> >> (If two processes send hardware task at the same time, How to ensure
> >> that the hardware recognizes which task the current command belongs
> >> to?)
> >
> > Linux crypto engine would guarantee that only one request at each time to
> be dequeued from engine queue to process.
> > And there has lock mechanism inside Linux crypto engine to prevent the
> scenario you mentioned.
> > So only 1 aspeed_hace_ahash_trigger() hardware service would go through
> at a time.
> >
> > [...]
> > .
> >
> You may not understand what I mean, the command flow in a normal scenario:
> request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
> request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
> In a multi-process concurrent scenario, multiple crypto engines can be enabled,
> and each crypto engine sends a request. If multiple requests here enter
> aspeed_hace_ahash_trigger() at the same time, the command flow will be
> intertwined like this:
> request_A, request_B:
> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->A
> cmd5-->Bcmd5
> 
> In this command flow, how does your hardware identify whether these
> commands belong to request_A or request_B?
> Thanks.
> Longfang.

To my understanding, all requests are transferred into the engine queue through crypto_transfer_hash_request_to_engine().
In your example, request_A and request_B would both be enqueued into that engine queue, and the engine pumps out one request at a time (in FIFO order) to handle it.
crypto_pump_requests() dequeues only one request at a time and calls prepare_request() and do_one_request() if they are registered.
aspeed_hace_ahash_trigger() runs inside do_one_request(), so no other request can come in during the whole aspeed_hace_ahash_trigger() process.
The intertwined command flow
Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5 would therefore not happen in any scenario.
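
To make the serialization point concrete, here is a simplified sketch of the two sides of that flow (the context layout and the helper names aspeed_sham_enqueue() / aspeed_ahash_do_request() are assumptions for illustration, not necessarily the exact ones in this patch):

/* Enqueue side: every update/final call only queues the request. */
static int aspeed_sham_enqueue(struct ahash_request *req)
{
	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;

	/* Requests from any thread or process land in the same engine queue */
	return crypto_transfer_hash_request_to_engine(hace_dev->crypt_engine_hash,
						      req);
}

/*
 * Dequeue side: crypto_pump_requests() invokes this for exactly one
 * request at a time, so the five register writes in
 * aspeed_hace_ahash_trigger() can never interleave with another request's.
 */
static int aspeed_ahash_do_request(struct crypto_engine *engine, void *areq)
{
	struct ahash_request *req = container_of(areq, struct ahash_request, base);
	struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
	struct aspeed_hace_dev *hace_dev = tctx->hace_dev;

	hace_dev->hash_engine.flags |= CRYPTO_FLAGS_BUSY;
	hace_dev->hash_engine.req = req;

	/* DMA prepare omitted here; then a single trigger per request */
	return aspeed_hace_ahash_trigger(hace_dev, aspeed_ahash_transfer);
}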
Correct me if I'm misunderstanding, Thanks.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-08-09  7:39           ` Neal Liu
@ 2022-08-09 12:39             ` liulongfang
  -1 siblings, 0 replies; 32+ messages in thread
From: liulongfang @ 2022-08-09 12:39 UTC (permalink / raw)
  To: Neal Liu, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

On 2022/8/9 15:39, Neal Liu wrote:
>>>> On 2022/7/26 19:34, Neal Liu wrote:
>>>>> Hash and Crypto Engine (HACE) is designed to accelerate the
>>>>> throughput of hash data digest, encryption, and decryption.
>>>>>
>>>>> Basically, HACE can be divided into two independently engines
>>>>> - Hash Engine and Crypto Engine. This patch aims to add HACE hash
>>>>> engine driver for hash accelerator.
>>>>>
>>>>> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
>>>>> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
>>>>> ---
>>>>>  MAINTAINERS                              |    7 +
>>>>>  drivers/crypto/Kconfig                   |    1 +
>>>>>  drivers/crypto/Makefile                  |    1 +
>>>>>  drivers/crypto/aspeed/Kconfig            |   32 +
>>>>>  drivers/crypto/aspeed/Makefile           |    6 +
>>>>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389
>>>> ++++++++++++++++++++++
>>>>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
>>>>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
>>>>>  8 files changed, 1835 insertions(+)  create mode 100644
>>>>> drivers/crypto/aspeed/Kconfig  create mode 100644
>>>>> drivers/crypto/aspeed/Makefile  create mode 100644
>>>>> drivers/crypto/aspeed/aspeed-hace-hash.c
>>>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
>>>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
>>>>>
>>>>> diff --git a/MAINTAINERS b/MAINTAINERS index
>>>>> f55aea311af5..23a0215b7e42 100644
>>>>> --- a/MAINTAINERS
>>>>> +++ b/MAINTAINERS
>>>>> @@ -3140,6 +3140,13 @@ S:	Maintained
>>>>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
>>>>>  F:	drivers/media/platform/aspeed/
>>>>>
>>>>> +ASPEED CRYPTO DRIVER
>>>>> +M:	Neal Liu <neal_liu@aspeedtech.com>
>>>>> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
>>>>> +S:	Maintained
>>>>> +F:
>>>> 	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
>>>>> +F:	drivers/crypto/aspeed/
>>>>> +
>>>>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>>>>>  M:	Corentin Chary <corentin.chary@gmail.com>
>>>>>  L:	acpi4asus-user@lists.sourceforge.net
>>>>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index
>>>>> ee99c02c84e8..b9f5ee126881 100644
>>>>> --- a/drivers/crypto/Kconfig
>>>>> +++ b/drivers/crypto/Kconfig
>>>>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
>>>>>  	  acceleration for cryptographic algorithms on these devices.
>>>>>
>>>>>  source "drivers/crypto/keembay/Kconfig"
>>>>> +source "drivers/crypto/aspeed/Kconfig"
>>>>>
>>>>>  endif # CRYPTO_HW
>>>>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index
>>>>> f81703a86b98..116de173a66c 100644
>>>>> --- a/drivers/crypto/Makefile
>>>>> +++ b/drivers/crypto/Makefile
>>>>> @@ -1,5 +1,6 @@
>>>>>  # SPDX-License-Identifier: GPL-2.0
>>>>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
>>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
>>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
>>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
>>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o diff --git
>>>>> a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig new
>>>>> file mode 100644 index 000000000000..059e627efef8
>>>>> --- /dev/null
>>>>> +++ b/drivers/crypto/aspeed/Kconfig
>>>>> @@ -0,0 +1,32 @@
>>>>> +config CRYPTO_DEV_ASPEED
>>>>> +	tristate "Support for Aspeed cryptographic engine driver"
>>>>> +	depends on ARCH_ASPEED
>>>>> +	help
>>>>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
>>>>> +	  throughput of hash data digest, encryption and decryption.
>>>>> +
>>>>> +	  Select y here to have support for the cryptographic driver
>>>>> +	  available on Aspeed SoC.
>>>>> +
>>>>> +config CRYPTO_DEV_ASPEED_HACE_HASH
>>>>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
>>>>> +	depends on CRYPTO_DEV_ASPEED
>>>>> +	select CRYPTO_ENGINE
>>>>> +	select CRYPTO_SHA1
>>>>> +	select CRYPTO_SHA256
>>>>> +	select CRYPTO_SHA512
>>>>> +	select CRYPTO_HMAC
>>>>> +	help
>>>>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
>>>>> +	  hash driver.
>>>>> +	  Supports multiple message digest standards, including
>>>>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
>>>>> +
>>>>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
>>>>> +	bool "Enable HACE hash debug messages"
>>>>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
>>>>> +	help
>>>>> +	  Print HACE hash debugging messages if you use this option
>>>>> +	  to ask for those messages.
>>>>> +	  Avoid enabling this option for production build to
>>>>> +	  minimize driver timing.
>>>>> diff --git a/drivers/crypto/aspeed/Makefile
>>>> b/drivers/crypto/aspeed/Makefile
>>>>> new file mode 100644
>>>>> index 000000000000..8bc8d4fed5a9
>>>>> --- /dev/null
>>>>> +++ b/drivers/crypto/aspeed/Makefile
>>>>> @@ -0,0 +1,6 @@
>>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
>>>>> +aspeed_crypto-objs := aspeed-hace.o \
>>>>> +		      $(hace-hash-y)
>>>>> +
>>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) +=
>> aspeed-hace-hash.o
>>>>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) :=
>>>> aspeed-hace-hash.o
>>>>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c
>>>> b/drivers/crypto/aspeed/aspeed-hace-hash.c
>>>>> new file mode 100644
>>>>> index 000000000000..63a8ad694996
>>>>> --- /dev/null
>>>>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
>>>>> @@ -0,0 +1,1389 @@
> 
> [...]
> 
>>>>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev
>>>> *hace_dev)
>>>>> +{
>>>>> +	struct aspeed_engine_hash *hash_engine =
>> &hace_dev->hash_engine;
>>>>> +	struct ahash_request *req = hash_engine->req;
>>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>>>> +	struct aspeed_sg_list *src_list;
>>>>> +	struct scatterlist *s;
>>>>> +	int length, remain, sg_len, i;
>>>>> +	int rc = 0;
>>>>> +
>>>>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
>>>>> +	length = rctx->total + rctx->bufcnt - remain;
>>>>> +
>>>>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
>>>>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
>>>>> +		  "length", length, "remain", remain);
>>>>> +
>>>>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
>>>>> +			    DMA_TO_DEVICE);
>>>>> +	if (!sg_len) {
>>>>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
>>>>> +		rc = -ENOMEM;
>>>>> +		goto end;
>>>>> +	}
>>>>> +
>>>>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
>>>>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
>> rctx->digest,
>>>>> +					       SHA512_DIGEST_SIZE,
>>>>> +					       DMA_BIDIRECTIONAL);
>>>>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
>>>>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
>>>>> +		rc = -ENOMEM;
>>>>> +		goto free_src_sg;
>>>>> +	}
>>>>> +
>>>>> +	if (rctx->bufcnt != 0) {
>>>>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
>>>>> +						       rctx->buffer,
>>>>> +						       rctx->block_size * 2,
>>>>> +						       DMA_TO_DEVICE);
>>>>> +		if (dma_mapping_error(hace_dev->dev,
>> rctx->buffer_dma_addr)) {
>>>>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer
>> error\n");
>>>>> +			rc = -ENOMEM;
>>>>> +			goto free_rctx_digest;
>>>>> +		}
>>>>> +
>>>>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
>>>>> +		src_list[0].len = rctx->bufcnt;
>>>>> +		length -= src_list[0].len;
>>>>> +
>>>>> +		/* Last sg list */
>>>>> +		if (length == 0)
>>>>> +			src_list[0].len |= HASH_SG_LAST_LIST;
>>>>> +
>>>>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
>>>>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
>>>>> +		src_list++;
>>>>> +	}
>>>>> +
>>>>> +	if (length != 0) {
>>>>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
>>>>> +			src_list[i].phy_addr = sg_dma_address(s);
>>>>> +
>>>>> +			if (length > sg_dma_len(s)) {
>>>>> +				src_list[i].len = sg_dma_len(s);
>>>>> +				length -= sg_dma_len(s);
>>>>> +
>>>>> +			} else {
>>>>> +				/* Last sg list */
>>>>> +				src_list[i].len = length;
>>>>> +				src_list[i].len |= HASH_SG_LAST_LIST;
>>>>> +				length = 0;
>>>>> +			}
>>>>> +
>>>>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
>>>>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	if (length != 0) {
>>>>> +		rc = -EINVAL;
>>>>> +		goto free_rctx_buffer;
>>>>> +	}
>>>>> +
>>>>> +	rctx->offset = rctx->total - remain;
>>>>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
>>>>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
>>>>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
>>>>> +
>>>>> +	goto end;
>>>> Exiting via "goto xx" is not recommended in normal code logic (this
>>>> requires two jumps), exiting via "return 0" is more efficient.
>>>> This code method has many times in your entire driver, it is
>>>> recommended to modify it.
>>>
>>> If not exiting via "goto xx", how to release related resources without any
>> problem?
>>> Is there any proper way to do this?
>> maybe I didn't describe it clearly enough.
>> "in normal code logic"  means rc=0
>> In this scenario (rc=0), "goto xx" is no longer required, it can be replaced with
>> "return 0"
> 
> Okay, I got your point. In this case, "goto end" is no longer required of course.
> I would send next patch with this fixed included.
> 
>>>
>>>>> +
>>>>> +free_rctx_buffer:
>>>>> +	if (rctx->bufcnt != 0)
>>>>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
>>>>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
>>>>> +free_rctx_digest:
>>>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
>>>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
>>>>> +free_src_sg:
>>>>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
>>>>> +		     DMA_TO_DEVICE);
>>>>> +end:
>>>>> +	return rc;
>>>>> +}
>>>>> +
>>>>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
>>>>> +{
>>>>> +	struct aspeed_engine_hash *hash_engine =
>> &hace_dev->hash_engine;
>>>>> +	struct ahash_request *req = hash_engine->req;
>>>>> +
>>>>> +	AHASH_DBG(hace_dev, "\n");
>>>>> +
>>>>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
>>>>> +
>>>>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Copy digest to the corresponding request result.
>>>>> + * This function will be called at final() stage.
>>>>> + */
>>>>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
>>>>> +{
>>>>> +	struct aspeed_engine_hash *hash_engine =
>> &hace_dev->hash_engine;
>>>>> +	struct ahash_request *req = hash_engine->req;
>>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>>>> +
>>>>> +	AHASH_DBG(hace_dev, "\n");
>>>>> +
>>>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
>>>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
>>>>> +
>>>>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
>>>>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
>>>>> +
>>>>> +	memcpy(req->result, rctx->digest, rctx->digsize);
>>>>> +
>>>>> +	return aspeed_ahash_complete(hace_dev); }
>>>>> +
>>>>> +/*
>>>>> + * Trigger hardware engines to do the math.
>>>>> + */
>>>>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev
>> *hace_dev,
>>>>> +				     aspeed_hace_fn_t resume)
>>>>> +{
>>>>> +	struct aspeed_engine_hash *hash_engine =
>> &hace_dev->hash_engine;
>>>>> +	struct ahash_request *req = hash_engine->req;
>>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>>>> +
>>>>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x,
>>>> length:0x%x\n",
>>>>> +		  hash_engine->src_dma, hash_engine->digest_dma,
>>>>> +		  hash_engine->src_length);
>>>>> +
>>>>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
>>>>> +	hash_engine->resume = resume;
>>>>> +
>>>>> +	ast_hace_write(hace_dev, hash_engine->src_dma,
>>>> ASPEED_HACE_HASH_SRC);
>>>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
>>>>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
>>>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
>>>>> +		       ASPEED_HACE_HASH_KEY_BUFF);
>>>>> +	ast_hace_write(hace_dev, hash_engine->src_length,
>>>>> +		       ASPEED_HACE_HASH_DATA_LEN);
>>>>> +
>>>>> +	/* Memory barrier to ensure all data setup before engine starts */
>>>>> +	mb();
>>>>> +
>>>>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
>>>> A hardware service sending requires 5 hardware commands to complete.
>>>> In a multi-concurrency scenario, how to ensure the order of commands?
>>>> (If two processes send hardware task at the same time, How to ensure
>>>> that the hardware recognizes which task the current command belongs
>>>> to?)
>>>
>>> Linux crypto engine would guarantee that only one request at each time to
>> be dequeued from engine queue to process.
>>> And there has lock mechanism inside Linux crypto engine to prevent the
>> scenario you mentioned.
>>> So only 1 aspeed_hace_ahash_trigger() hardware service would go through
>> at a time.
>>>
>>> [...]
>>> .
>>>
>> You may not understand what I mean, the command flow in a normal scenario:
>> request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
>> request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
>> In a multi-process concurrent scenario, multiple crypto engines can be enabled,
>> and each crypto engine sends a request. If multiple requests here enter
>> aspeed_hace_ahash_trigger() at the same time, the command flow will be
>> intertwined like this:
>> request_A, request_B:
>> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->A
>> cmd5-->Bcmd5
>>
>> In this command flow, how does your hardware identify whether these
>> commands belong to request_A or request_B?
>> Thanks.
>> Longfang.
> 
> For my understanding, all requests will transfer into engine queue through crypto_transfer_hash_request_to_engine().
> In your example, request_A & request_B would also enqueue into the engine queue, and pump out 1 request which might be FIFO to handle it.
> crypto_pump_requests() will dequeue only 1 request at a time and to prepare_request() & do_one_request() if it's registered.
> And aspeed_hace_ahash_trigger() is inside do_one_request(), so that means no other requests would come in during aspeed_hace_ahash_trigger() whole process.
> The command flow intertwined
> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5 would not exist in any scenario.
> Correct me if I'm misunderstanding, Thanks.
> 
> .
> 
First, you need to understand the distinction I made between threads and processes.
In a multi-threaded scenario, all threads share the engine of one process, so there is only one engine queue
sending cmds at a time. In a multi-process scenario, however, each process would have its own engine queue,
and when they run at the same time there would be multiple queues sending cmds.

That said, I understand what you mean: your driver uses the crypto engine's software queue to guarantee the cmd order.
This method can ensure that the cmds of multiple threads within one process are sent in order.
But you still need to consider the multi-process case: when multiple user processes use your device for hash
operations, there would be multiple crypto engine software queues sending commands at the same time.
Thanks
Longfang.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
  2022-08-09 12:39             ` liulongfang
@ 2022-08-11  3:31               ` Neal Liu
  -1 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-08-11  3:31 UTC (permalink / raw)
  To: liulongfang, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> On 2022/8/9 15:39, Neal Liu wrote:
> >>>> On 2022/7/26 19:34, Neal Liu wrote:
> >>>>> Hash and Crypto Engine (HACE) is designed to accelerate the
> >>>>> throughput of hash data digest, encryption, and decryption.
> >>>>>
> >>>>> Basically, HACE can be divided into two independently engines
> >>>>> - Hash Engine and Crypto Engine. This patch aims to add HACE hash
> >>>>> engine driver for hash accelerator.
> >>>>>
> >>>>> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> >>>>> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> >>>>> ---
> >>>>>  MAINTAINERS                              |    7 +
> >>>>>  drivers/crypto/Kconfig                   |    1 +
> >>>>>  drivers/crypto/Makefile                  |    1 +
> >>>>>  drivers/crypto/aspeed/Kconfig            |   32 +
> >>>>>  drivers/crypto/aspeed/Makefile           |    6 +
> >>>>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389
> >>>> ++++++++++++++++++++++
> >>>>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
> >>>>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
> >>>>>  8 files changed, 1835 insertions(+)  create mode 100644
> >>>>> drivers/crypto/aspeed/Kconfig  create mode 100644
> >>>>> drivers/crypto/aspeed/Makefile  create mode 100644
> >>>>> drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
> >>>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> >>>>>
> >>>>> diff --git a/MAINTAINERS b/MAINTAINERS index
> >>>>> f55aea311af5..23a0215b7e42 100644
> >>>>> --- a/MAINTAINERS
> >>>>> +++ b/MAINTAINERS
> >>>>> @@ -3140,6 +3140,13 @@ S:	Maintained
> >>>>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
> >>>>>  F:	drivers/media/platform/aspeed/
> >>>>>
> >>>>> +ASPEED CRYPTO DRIVER
> >>>>> +M:	Neal Liu <neal_liu@aspeedtech.com>
> >>>>> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> >>>>> +S:	Maintained
> >>>>> +F:
> >>>>
> 	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> >>>>> +F:	drivers/crypto/aspeed/
> >>>>> +
> >>>>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
> >>>>>  M:	Corentin Chary <corentin.chary@gmail.com>
> >>>>>  L:	acpi4asus-user@lists.sourceforge.net
> >>>>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index
> >>>>> ee99c02c84e8..b9f5ee126881 100644
> >>>>> --- a/drivers/crypto/Kconfig
> >>>>> +++ b/drivers/crypto/Kconfig
> >>>>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
> >>>>>  	  acceleration for cryptographic algorithms on these devices.
> >>>>>
> >>>>>  source "drivers/crypto/keembay/Kconfig"
> >>>>> +source "drivers/crypto/aspeed/Kconfig"
> >>>>>
> >>>>>  endif # CRYPTO_HW
> >>>>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> >>>>> index f81703a86b98..116de173a66c 100644
> >>>>> --- a/drivers/crypto/Makefile
> >>>>> +++ b/drivers/crypto/Makefile
> >>>>> @@ -1,5 +1,6 @@
> >>>>>  # SPDX-License-Identifier: GPL-2.0
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> >>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o diff --git
> >>>>> a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> >>>>> new file mode 100644 index 000000000000..059e627efef8
> >>>>> --- /dev/null
> >>>>> +++ b/drivers/crypto/aspeed/Kconfig
> >>>>> @@ -0,0 +1,32 @@
> >>>>> +config CRYPTO_DEV_ASPEED
> >>>>> +	tristate "Support for Aspeed cryptographic engine driver"
> >>>>> +	depends on ARCH_ASPEED
> >>>>> +	help
> >>>>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> >>>>> +	  throughput of hash data digest, encryption and decryption.
> >>>>> +
> >>>>> +	  Select y here to have support for the cryptographic driver
> >>>>> +	  available on Aspeed SoC.
> >>>>> +
> >>>>> +config CRYPTO_DEV_ASPEED_HACE_HASH
> >>>>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> >>>>> +	depends on CRYPTO_DEV_ASPEED
> >>>>> +	select CRYPTO_ENGINE
> >>>>> +	select CRYPTO_SHA1
> >>>>> +	select CRYPTO_SHA256
> >>>>> +	select CRYPTO_SHA512
> >>>>> +	select CRYPTO_HMAC
> >>>>> +	help
> >>>>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> >>>>> +	  hash driver.
> >>>>> +	  Supports multiple message digest standards, including
> >>>>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> >>>>> +
> >>>>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> >>>>> +	bool "Enable HACE hash debug messages"
> >>>>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> >>>>> +	help
> >>>>> +	  Print HACE hash debugging messages if you use this option
> >>>>> +	  to ask for those messages.
> >>>>> +	  Avoid enabling this option for production build to
> >>>>> +	  minimize driver timing.
> >>>>> diff --git a/drivers/crypto/aspeed/Makefile
> >>>> b/drivers/crypto/aspeed/Makefile
> >>>>> new file mode 100644
> >>>>> index 000000000000..8bc8d4fed5a9
> >>>>> --- /dev/null
> >>>>> +++ b/drivers/crypto/aspeed/Makefile
> >>>>> @@ -0,0 +1,6 @@
> >>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> >>>>> +aspeed_crypto-objs := aspeed-hace.o \
> >>>>> +		      $(hace-hash-y)
> >>>>> +
> >>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) +=
> >> aspeed-hace-hash.o
> >>>>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) :=
> >>>> aspeed-hace-hash.o
> >>>>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>> b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>>> new file mode 100644
> >>>>> index 000000000000..63a8ad694996
> >>>>> --- /dev/null
> >>>>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>>> @@ -0,0 +1,1389 @@
> >
> > [...]
> >
> >>>>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev
> >>>> *hace_dev)
> >>>>> +{
> >>>>> +	struct aspeed_engine_hash *hash_engine =
> >> &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>>>> +	struct aspeed_sg_list *src_list;
> >>>>> +	struct scatterlist *s;
> >>>>> +	int length, remain, sg_len, i;
> >>>>> +	int rc = 0;
> >>>>> +
> >>>>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> >>>>> +	length = rctx->total + rctx->bufcnt - remain;
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> >>>>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> >>>>> +		  "length", length, "remain", remain);
> >>>>> +
> >>>>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>>>> +			    DMA_TO_DEVICE);
> >>>>> +	if (!sg_len) {
> >>>>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> >>>>> +		rc = -ENOMEM;
> >>>>> +		goto end;
> >>>>> +	}
> >>>>> +
> >>>>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> >>>>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev,
> >> rctx->digest,
> >>>>> +					       SHA512_DIGEST_SIZE,
> >>>>> +					       DMA_BIDIRECTIONAL);
> >>>>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> >>>>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> >>>>> +		rc = -ENOMEM;
> >>>>> +		goto free_src_sg;
> >>>>> +	}
> >>>>> +
> >>>>> +	if (rctx->bufcnt != 0) {
> >>>>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> >>>>> +						       rctx->buffer,
> >>>>> +						       rctx->block_size * 2,
> >>>>> +						       DMA_TO_DEVICE);
> >>>>> +		if (dma_mapping_error(hace_dev->dev,
> >> rctx->buffer_dma_addr)) {
> >>>>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer
> >> error\n");
> >>>>> +			rc = -ENOMEM;
> >>>>> +			goto free_rctx_digest;
> >>>>> +		}
> >>>>> +
> >>>>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> >>>>> +		src_list[0].len = rctx->bufcnt;
> >>>>> +		length -= src_list[0].len;
> >>>>> +
> >>>>> +		/* Last sg list */
> >>>>> +		if (length == 0)
> >>>>> +			src_list[0].len |= HASH_SG_LAST_LIST;
> >>>>> +
> >>>>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> >>>>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> >>>>> +		src_list++;
> >>>>> +	}
> >>>>> +
> >>>>> +	if (length != 0) {
> >>>>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> >>>>> +			src_list[i].phy_addr = sg_dma_address(s);
> >>>>> +
> >>>>> +			if (length > sg_dma_len(s)) {
> >>>>> +				src_list[i].len = sg_dma_len(s);
> >>>>> +				length -= sg_dma_len(s);
> >>>>> +
> >>>>> +			} else {
> >>>>> +				/* Last sg list */
> >>>>> +				src_list[i].len = length;
> >>>>> +				src_list[i].len |= HASH_SG_LAST_LIST;
> >>>>> +				length = 0;
> >>>>> +			}
> >>>>> +
> >>>>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> >>>>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> >>>>> +		}
> >>>>> +	}
> >>>>> +
> >>>>> +	if (length != 0) {
> >>>>> +		rc = -EINVAL;
> >>>>> +		goto free_rctx_buffer;
> >>>>> +	}
> >>>>> +
> >>>>> +	rctx->offset = rctx->total - remain;
> >>>>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> >>>>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> >>>>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> >>>>> +
> >>>>> +	goto end;
> >>>> Exiting via "goto xx" is not recommended in normal code logic (this
> >>>> requires two jumps), exiting via "return 0" is more efficient.
> >>>> This code method has many times in your entire driver, it is
> >>>> recommended to modify it.
> >>>
> >>> If not exiting via "goto xx", how to release related resources
> >>> without any
> >> problem?
> >>> Is there any proper way to do this?
> >> maybe I didn't describe it clearly enough.
> >> "in normal code logic"  means rc=0
> >> In this scenario (rc=0), "goto xx" is no longer required, it can be
> >> replaced with "return 0"
> >
> > Okay, I got your point. In this case, "goto end" is no longer required of course.
> > I would send next patch with this fixed included.
> >
> >>>
> >>>>> +
> >>>>> +free_rctx_buffer:
> >>>>> +	if (rctx->bufcnt != 0)
> >>>>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>>>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> >>>>> +free_rctx_digest:
> >>>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>>>> +free_src_sg:
> >>>>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>>>> +		     DMA_TO_DEVICE);
> >>>>> +end:
> >>>>> +	return rc;
> >>>>> +}
> >>>>> +
> >>>>> +static int aspeed_ahash_complete(struct aspeed_hace_dev
> >>>>> +*hace_dev) {
> >>>>> +	struct aspeed_engine_hash *hash_engine =
> >> &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "\n");
> >>>>> +
> >>>>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> >>>>> +
> >>>>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req,
> >>>>> +0);
> >>>>> +
> >>>>> +	return 0;
> >>>>> +}
> >>>>> +
> >>>>> +/*
> >>>>> + * Copy digest to the corresponding request result.
> >>>>> + * This function will be called at final() stage.
> >>>>> + */
> >>>>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev
> >>>>> +*hace_dev) {
> >>>>> +	struct aspeed_engine_hash *hash_engine =
> >> &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "\n");
> >>>>> +
> >>>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>>>> +
> >>>>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>>>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> >>>>> +
> >>>>> +	memcpy(req->result, rctx->digest, rctx->digsize);
> >>>>> +
> >>>>> +	return aspeed_ahash_complete(hace_dev); }
> >>>>> +
> >>>>> +/*
> >>>>> + * Trigger hardware engines to do the math.
> >>>>> + */
> >>>>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev
> >> *hace_dev,
> >>>>> +				     aspeed_hace_fn_t resume) {
> >>>>> +	struct aspeed_engine_hash *hash_engine =
> >> &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x,
> >>>> length:0x%x\n",
> >>>>> +		  hash_engine->src_dma, hash_engine->digest_dma,
> >>>>> +		  hash_engine->src_length);
> >>>>> +
> >>>>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> >>>>> +	hash_engine->resume = resume;
> >>>>> +
> >>>>> +	ast_hace_write(hace_dev, hash_engine->src_dma,
> >>>> ASPEED_HACE_HASH_SRC);
> >>>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>>>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> >>>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>>>> +		       ASPEED_HACE_HASH_KEY_BUFF);
> >>>>> +	ast_hace_write(hace_dev, hash_engine->src_length,
> >>>>> +		       ASPEED_HACE_HASH_DATA_LEN);
> >>>>> +
> >>>>> +	/* Memory barrier to ensure all data setup before engine starts */
> >>>>> +	mb();
> >>>>> +
> >>>>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
> >>>> A hardware service sending requires 5 hardware commands to complete.
> >>>> In a multi-concurrency scenario, how to ensure the order of commands?
> >>>> (If two processes send hardware task at the same time, How to
> >>>> ensure that the hardware recognizes which task the current command
> >>>> belongs
> >>>> to?)
> >>>
> >>> Linux crypto engine would guarantee that only one request at each
> >>> time to
> >> be dequeued from engine queue to process.
> >>> And there has lock mechanism inside Linux crypto engine to prevent
> >>> the
> >> scenario you mentioned.
> >>> So only 1 aspeed_hace_ahash_trigger() hardware service would go
> >>> through
> >> at a time.
> >>>
> >>> [...]
> >>> .
> >>>
> >> You may not understand what I mean, the command flow in a normal
> scenario:
> >> request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
> >> request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
> >> In a multi-process concurrent scenario, multiple crypto engines can
> >> be enabled, and each crypto engine sends a request. If multiple
> >> requests here enter
> >> aspeed_hace_ahash_trigger() at the same time, the command flow will
> >> be intertwined like this:
> >> request_A, request_B:
> >>
> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->A
> >> cmd5-->Bcmd5
> >>
> >> In this command flow, how does your hardware identify whether these
> >> commands belong to request_A or request_B?
> >> Thanks.
> >> Longfang.
> >
> > For my understanding, all requests will transfer into engine queue through
> crypto_transfer_hash_request_to_engine().
> > In your example, request_A & request_B would also enqueue into the engine
> queue, and pump out 1 request which might be FIFO to handle it.
> > crypto_pump_requests() will dequeue only 1 request at a time and to
> prepare_request() & do_one_request() if it's registered.
> > And aspeed_hace_ahash_trigger() is inside do_one_request(), so that means
> no other requests would come in during aspeed_hace_ahash_trigger() whole
> process.
> > The command flow intertwined
> >
> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->A
> cmd5-->Bcmd5 would not exist in any scenario.
> > Correct me if I'm misunderstanding, Thanks.
> >
> > .
> >
> At first,You need to understand the difference between threads and processes
> that I said.
> In a multi-threaded scenario, all threads will share the engine of a process,
> there will only be one engine queue to send cmds at the same time. However,
> in a multi-process scenario, each process will have its own engine queue, and
> when running at the same time, there will be multiple queues sending cmds.
> 
> Then, I understand what you mean. Your driver uses the software queue of the
> encryption engine to ensure the cmds order.
> This method can ensure that the cmds of multiple threads in one process are
> sent in order.
> But you still need to consider the problem of multiple processes, when using
> your device for hash operations in multiple user processes, there will be
> multiple crypto engine software queues sending commands at the same time.
> Thanks
> Longfang.

I got your point, but there is one important thing to note: this driver has a single driver data instance and a single crypto engine queue. All requests are enqueued into that same engine queue and dequeued in order, even in the multi-process scenario. So the confusion here is that each process does "not" get its own engine queue; there is only one engine queue inside this driver data.
Also, only one driver instance is allowed for the Aspeed HACE device.
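
As a rough sketch of what I mean (simplified and illustrative; error handling and algorithm registration are omitted):

static int aspeed_hace_probe(struct platform_device *pdev)
{
	struct aspeed_hace_dev *hace_dev;
	int rc;

	hace_dev = devm_kzalloc(&pdev->dev, sizeof(*hace_dev), GFP_KERNEL);
	if (!hace_dev)
		return -ENOMEM;

	/* One crypto engine queue for the whole device ... */
	hace_dev->crypt_engine_hash = crypto_engine_alloc_init(&pdev->dev, true);
	if (!hace_dev->crypt_engine_hash)
		return -ENOMEM;

	rc = crypto_engine_start(hace_dev->crypt_engine_hash);
	if (rc)
		return rc;

	/*
	 * ... and every registered ahash algorithm resolves back to this
	 * single hace_dev, so requests from all processes are funneled
	 * into the same queue.
	 */
	return 0;
}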
Thanks !


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
@ 2022-08-11  3:31               ` Neal Liu
  0 siblings, 0 replies; 32+ messages in thread
From: Neal Liu @ 2022-08-11  3:31 UTC (permalink / raw)
  To: liulongfang, Corentin Labbe, Christophe JAILLET, Randy Dunlap,
	Herbert Xu, David S . Miller, Rob Herring, Krzysztof Kozlowski,
	Joel Stanley, Andrew Jeffery, Dhananjay Phadke, Johnny Huang
  Cc: linux-aspeed@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, BMC-SW

> On 2022/8/9 15:39, Neal Liu wrote:
> >>>> On 2022/7/26 19:34, Neal Liu wrote:
> >>>>> Hash and Crypto Engine (HACE) is designed to accelerate the
> >>>>> throughput of hash data digest, encryption, and decryption.
> >>>>>
> >>>>> Basically, HACE can be divided into two independently engines
> >>>>> - Hash Engine and Crypto Engine. This patch aims to add HACE hash
> >>>>> engine driver for hash accelerator.
> >>>>>
> >>>>> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com>
> >>>>> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com>
> >>>>> ---
> >>>>>  MAINTAINERS                              |    7 +
> >>>>>  drivers/crypto/Kconfig                   |    1 +
> >>>>>  drivers/crypto/Makefile                  |    1 +
> >>>>>  drivers/crypto/aspeed/Kconfig            |   32 +
> >>>>>  drivers/crypto/aspeed/Makefile           |    6 +
> >>>>>  drivers/crypto/aspeed/aspeed-hace-hash.c | 1389
> >>>> ++++++++++++++++++++++
> >>>>>  drivers/crypto/aspeed/aspeed-hace.c      |  213 ++++
> >>>>>  drivers/crypto/aspeed/aspeed-hace.h      |  186 +++
> >>>>>  8 files changed, 1835 insertions(+)  create mode 100644
> >>>>> drivers/crypto/aspeed/Kconfig  create mode 100644
> >>>>> drivers/crypto/aspeed/Makefile  create mode 100644
> >>>>> drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
> >>>>>  create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
> >>>>>
> >>>>> diff --git a/MAINTAINERS b/MAINTAINERS index
> >>>>> f55aea311af5..23a0215b7e42 100644
> >>>>> --- a/MAINTAINERS
> >>>>> +++ b/MAINTAINERS
> >>>>> @@ -3140,6 +3140,13 @@ S:	Maintained
> >>>>>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
> >>>>>  F:	drivers/media/platform/aspeed/
> >>>>>
> >>>>> +ASPEED CRYPTO DRIVER
> >>>>> +M:	Neal Liu <neal_liu@aspeedtech.com>
> >>>>> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> >>>>> +S:	Maintained
> >>>>> +F:
> >>>>
> 	Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
> >>>>> +F:	drivers/crypto/aspeed/
> >>>>> +
> >>>>>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
> >>>>>  M:	Corentin Chary <corentin.chary@gmail.com>
> >>>>>  L:	acpi4asus-user@lists.sourceforge.net
> >>>>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index
> >>>>> ee99c02c84e8..b9f5ee126881 100644
> >>>>> --- a/drivers/crypto/Kconfig
> >>>>> +++ b/drivers/crypto/Kconfig
> >>>>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
> >>>>>  	  acceleration for cryptographic algorithms on these devices.
> >>>>>
> >>>>>  source "drivers/crypto/keembay/Kconfig"
> >>>>> +source "drivers/crypto/aspeed/Kconfig"
> >>>>>
> >>>>>  endif # CRYPTO_HW
> >>>>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> >>>>> index f81703a86b98..116de173a66c 100644
> >>>>> --- a/drivers/crypto/Makefile
> >>>>> +++ b/drivers/crypto/Makefile
> >>>>> @@ -1,5 +1,6 @@
> >>>>>  # SPDX-License-Identifier: GPL-2.0
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
> >>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
> >>>>>  obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o diff --git
> >>>>> a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
> >>>>> new file mode 100644 index 000000000000..059e627efef8
> >>>>> --- /dev/null
> >>>>> +++ b/drivers/crypto/aspeed/Kconfig
> >>>>> @@ -0,0 +1,32 @@
> >>>>> +config CRYPTO_DEV_ASPEED
> >>>>> +	tristate "Support for Aspeed cryptographic engine driver"
> >>>>> +	depends on ARCH_ASPEED
> >>>>> +	help
> >>>>> +	  Hash and Crypto Engine (HACE) is designed to accelerate the
> >>>>> +	  throughput of hash data digest, encryption and decryption.
> >>>>> +
> >>>>> +	  Select y here to have support for the cryptographic driver
> >>>>> +	  available on Aspeed SoC.
> >>>>> +
> >>>>> +config CRYPTO_DEV_ASPEED_HACE_HASH
> >>>>> +	bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
> >>>>> +	depends on CRYPTO_DEV_ASPEED
> >>>>> +	select CRYPTO_ENGINE
> >>>>> +	select CRYPTO_SHA1
> >>>>> +	select CRYPTO_SHA256
> >>>>> +	select CRYPTO_SHA512
> >>>>> +	select CRYPTO_HMAC
> >>>>> +	help
> >>>>> +	  Select here to enable Aspeed Hash & Crypto Engine (HACE)
> >>>>> +	  hash driver.
> >>>>> +	  Supports multiple message digest standards, including
> >>>>> +	  SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
> >>>>> +
> >>>>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
> >>>>> +	bool "Enable HACE hash debug messages"
> >>>>> +	depends on CRYPTO_DEV_ASPEED_HACE_HASH
> >>>>> +	help
> >>>>> +	  Print HACE hash debugging messages if you use this option
> >>>>> +	  to ask for those messages.
> >>>>> +	  Avoid enabling this option for production build to
> >>>>> +	  minimize driver timing.
> >>>>> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
> >>>>> new file mode 100644
> >>>>> index 000000000000..8bc8d4fed5a9
> >>>>> --- /dev/null
> >>>>> +++ b/drivers/crypto/aspeed/Makefile
> >>>>> @@ -0,0 +1,6 @@
> >>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
> >>>>> +aspeed_crypto-objs := aspeed-hace.o \
> >>>>> +		      $(hace-hash-y)
> >>>>> +
> >>>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
> >>>>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
> >>>>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>>> new file mode 100644
> >>>>> index 000000000000..63a8ad694996
> >>>>> --- /dev/null
> >>>>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
> >>>>> @@ -0,0 +1,1389 @@
> >
> > [...]
> >
> >>>>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
> >>>>> +{
> >>>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>>>> +	struct aspeed_sg_list *src_list;
> >>>>> +	struct scatterlist *s;
> >>>>> +	int length, remain, sg_len, i;
> >>>>> +	int rc = 0;
> >>>>> +
> >>>>> +	remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
> >>>>> +	length = rctx->total + rctx->bufcnt - remain;
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
> >>>>> +		  "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
> >>>>> +		  "length", length, "remain", remain);
> >>>>> +
> >>>>> +	sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>>>> +			    DMA_TO_DEVICE);
> >>>>> +	if (!sg_len) {
> >>>>> +		dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
> >>>>> +		rc = -ENOMEM;
> >>>>> +		goto end;
> >>>>> +	}
> >>>>> +
> >>>>> +	src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
> >>>>> +	rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
> >>>>> +					       SHA512_DIGEST_SIZE,
> >>>>> +					       DMA_BIDIRECTIONAL);
> >>>>> +	if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
> >>>>> +		dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
> >>>>> +		rc = -ENOMEM;
> >>>>> +		goto free_src_sg;
> >>>>> +	}
> >>>>> +
> >>>>> +	if (rctx->bufcnt != 0) {
> >>>>> +		rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
> >>>>> +						       rctx->buffer,
> >>>>> +						       rctx->block_size * 2,
> >>>>> +						       DMA_TO_DEVICE);
> >>>>> +		if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
> >>>>> +			dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
> >>>>> +			rc = -ENOMEM;
> >>>>> +			goto free_rctx_digest;
> >>>>> +		}
> >>>>> +
> >>>>> +		src_list[0].phy_addr = rctx->buffer_dma_addr;
> >>>>> +		src_list[0].len = rctx->bufcnt;
> >>>>> +		length -= src_list[0].len;
> >>>>> +
> >>>>> +		/* Last sg list */
> >>>>> +		if (length == 0)
> >>>>> +			src_list[0].len |= HASH_SG_LAST_LIST;
> >>>>> +
> >>>>> +		src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
> >>>>> +		src_list[0].len = cpu_to_le32(src_list[0].len);
> >>>>> +		src_list++;
> >>>>> +	}
> >>>>> +
> >>>>> +	if (length != 0) {
> >>>>> +		for_each_sg(rctx->src_sg, s, sg_len, i) {
> >>>>> +			src_list[i].phy_addr = sg_dma_address(s);
> >>>>> +
> >>>>> +			if (length > sg_dma_len(s)) {
> >>>>> +				src_list[i].len = sg_dma_len(s);
> >>>>> +				length -= sg_dma_len(s);
> >>>>> +
> >>>>> +			} else {
> >>>>> +				/* Last sg list */
> >>>>> +				src_list[i].len = length;
> >>>>> +				src_list[i].len |= HASH_SG_LAST_LIST;
> >>>>> +				length = 0;
> >>>>> +			}
> >>>>> +
> >>>>> +			src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
> >>>>> +			src_list[i].len = cpu_to_le32(src_list[i].len);
> >>>>> +		}
> >>>>> +	}
> >>>>> +
> >>>>> +	if (length != 0) {
> >>>>> +		rc = -EINVAL;
> >>>>> +		goto free_rctx_buffer;
> >>>>> +	}
> >>>>> +
> >>>>> +	rctx->offset = rctx->total - remain;
> >>>>> +	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
> >>>>> +	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
> >>>>> +	hash_engine->digest_dma = rctx->digest_dma_addr;
> >>>>> +
> >>>>> +	goto end;
> >>>> Exiting via "goto xx" is not recommended on the normal code path (it
> >>>> requires two jumps); exiting via "return 0" is more efficient.
> >>>> This pattern appears many times throughout your driver, so it is
> >>>> recommended to change it.
> >>>
> >>> If we do not exit via "goto xx", how should the related resources be
> >>> released without causing problems?
> >>> Is there a proper way to do this?
> >> Maybe I didn't describe it clearly enough.
> >> "In normal code logic" means rc = 0.
> >> In this scenario (rc = 0), "goto xx" is no longer required; it can be
> >> replaced with "return 0".
> >
> > Okay, I got your point. In this case, "goto end" is indeed no longer required.
> > I will send the next patch with this fix included.
> >
> >>>
> >>>>> +
> >>>>> +free_rctx_buffer:
> >>>>> +	if (rctx->bufcnt != 0)
> >>>>> +		dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>>>> +				 rctx->block_size * 2, DMA_TO_DEVICE);
> >>>>> +free_rctx_digest:
> >>>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>>>> +free_src_sg:
> >>>>> +	dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
> >>>>> +		     DMA_TO_DEVICE);
> >>>>> +end:
> >>>>> +	return rc;
> >>>>> +}
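
(A side note on the "goto" discussion above: below is a minimal, hypothetical
sketch of the pattern being suggested. The helper names map_a()/map_b()/map_c()
and their unmap counterparts are made up for illustration and are not part of
this driver; the point is simply that the success path returns 0 directly and
the goto labels are kept only for unwinding on errors.)

#include <linux/device.h>

/* Illustrative only: acquire three resources, unwind in reverse on failure. */
static int example_prepare(struct device *dev)
{
	int rc;

	rc = map_a(dev);		/* hypothetical helper */
	if (rc)
		return rc;		/* nothing to unwind yet */

	rc = map_b(dev);		/* hypothetical helper */
	if (rc)
		goto err_unmap_a;

	rc = map_c(dev);		/* hypothetical helper */
	if (rc)
		goto err_unmap_b;

	return 0;			/* success path: no "goto end" needed */

err_unmap_b:
	unmap_b(dev);
err_unmap_a:
	unmap_a(dev);
	return rc;
}
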
> >>>>> +
> >>>>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
> >>>>> +{
> >>>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "\n");
> >>>>> +
> >>>>> +	hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
> >>>>> +
> >>>>> +	crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
> >>>>> +
> >>>>> +	return 0;
> >>>>> +}
> >>>>> +
> >>>>> +/*
> >>>>> + * Copy digest to the corresponding request result.
> >>>>> + * This function will be called at final() stage.
> >>>>> + */
> >>>>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
> >>>>> +{
> >>>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "\n");
> >>>>> +
> >>>>> +	dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
> >>>>> +			 SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
> >>>>> +
> >>>>> +	dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
> >>>>> +			 rctx->block_size * 2, DMA_TO_DEVICE);
> >>>>> +
> >>>>> +	memcpy(req->result, rctx->digest, rctx->digsize);
> >>>>> +
> >>>>> +	return aspeed_ahash_complete(hace_dev);
> >>>>> +}
> >>>>> +
> >>>>> +/*
> >>>>> + * Trigger hardware engines to do the math.
> >>>>> + */
> >>>>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
> >>>>> +				     aspeed_hace_fn_t resume)
> >>>>> +{
> >>>>> +	struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
> >>>>> +	struct ahash_request *req = hash_engine->req;
> >>>>> +	struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
> >>>>> +
> >>>>> +	AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x, length:0x%x\n",
> >>>>> +		  hash_engine->src_dma, hash_engine->digest_dma,
> >>>>> +		  hash_engine->src_length);
> >>>>> +
> >>>>> +	rctx->cmd |= HASH_CMD_INT_ENABLE;
> >>>>> +	hash_engine->resume = resume;
> >>>>> +
> >>>>> +	ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
> >>>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>>>> +		       ASPEED_HACE_HASH_DIGEST_BUFF);
> >>>>> +	ast_hace_write(hace_dev, hash_engine->digest_dma,
> >>>>> +		       ASPEED_HACE_HASH_KEY_BUFF);
> >>>>> +	ast_hace_write(hace_dev, hash_engine->src_length,
> >>>>> +		       ASPEED_HACE_HASH_DATA_LEN);
> >>>>> +
> >>>>> +	/* Memory barrier to ensure all data setup before engine starts */
> >>>>> +	mb();
> >>>>> +
> >>>>> +	ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
> >>>> Sending one hardware service requires 5 hardware commands to complete.
> >>>> In a concurrent scenario, how do you ensure the order of the commands?
> >>>> (If two processes send hardware tasks at the same time, how does the
> >>>> hardware recognize which task the current command belongs to?)
> >>>
> >>> The Linux crypto engine guarantees that only one request is dequeued
> >>> from the engine queue and processed at a time.
> >>> There is also a locking mechanism inside the crypto engine that prevents
> >>> the scenario you mentioned.
> >>> So only one aspeed_hace_ahash_trigger() hardware service goes through
> >>> at a time.
> >>>
> >>> [...]
> >>> .
> >>>
> >> You may not understand what I mean. The command flow in a normal scenario is:
> >> request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
> >> request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
> >> In a multi-process concurrent scenario, multiple crypto engines can be
> >> enabled, and each crypto engine sends a request. If multiple requests
> >> enter aspeed_hace_ahash_trigger() at the same time, the command flow
> >> will be interleaved like this:
> >> request_A, request_B:
> >> Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5
> >>
> >> With this command flow, how does your hardware identify whether these
> >> commands belong to request_A or request_B?
> >> Thanks.
> >> Longfang.
> >
> > To my understanding, all requests are transferred into the engine queue
> > through crypto_transfer_hash_request_to_engine().
> > In your example, request_A and request_B would both be enqueued into that
> > engine queue, and one request is pumped out (in FIFO order) to be handled.
> > crypto_pump_requests() dequeues only one request at a time and passes it to
> > prepare_request() and do_one_request() if they are registered.
> > aspeed_hace_ahash_trigger() is called from inside do_one_request(), which
> > means no other request can come in during the whole
> > aspeed_hace_ahash_trigger() process.
> > An interleaved command flow such as
> > Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5
> > would not occur in any scenario.
> > Correct me if I'm misunderstanding. Thanks.
> >
> > .
> >
> First, you need to understand the difference between threads and processes
> that I mentioned.
> In a multi-threaded scenario, all threads share the engine of one process,
> so there is only one engine queue sending cmds at a time. However, in a
> multi-process scenario, each process has its own engine queue, and when they
> run at the same time there are multiple queues sending cmds.
> 
> Now I understand what you mean. Your driver uses the software queue of the
> crypto engine to guarantee the cmd order.
> This method can ensure that the cmds of multiple threads in one process are
> sent in order.
> But you still need to consider the multi-process case: when multiple user
> processes use your device for hash operations, there will be multiple crypto
> engine software queues sending commands at the same time.
> Thanks
> Longfang.

I got your point, but there is one important thing to note: this driver shares a single driver data structure, and a single crypto engine queue. All requests are enqueued into that same engine queue and dequeued in order, even in a multi-process scenario. So to clear up the confusion, each process does *not* have its own engine queue; there is only one engine queue inside this driver data.
And only one driver instance is allowed for the Aspeed HACE device.
Thanks !
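
To make the single-queue behaviour easier to see, here is a simplified,
illustrative sketch (not the actual driver code) of how the crypto_engine API
is used: one engine is allocated per HACE device and stored in the shared
driver data, every request from any process is transferred to that same engine,
and the engine invokes the do_one_request() callback for exactly one request at
a time. Function names prefixed with sketch_ are hypothetical; the
crypto_engine_* and crypto_*_hash_request* calls are the standard kernel API
used by this patch.

#include <crypto/engine.h>
#include <crypto/internal/hash.h>
#include "aspeed-hace.h"	/* struct aspeed_hace_dev, as in this patch */

/* probe(): one engine for the whole device, shared by all callers */
static int sketch_register_engine(struct aspeed_hace_dev *hace_dev)
{
	hace_dev->crypt_engine_hash =
		crypto_engine_alloc_init(hace_dev->dev, true);
	if (!hace_dev->crypt_engine_hash)
		return -ENOMEM;

	return crypto_engine_start(hace_dev->crypt_engine_hash);
}

/* .update/.final/.digest handlers only enqueue the request */
static int sketch_ahash_enqueue(struct aspeed_hace_dev *hace_dev,
				struct ahash_request *req)
{
	return crypto_transfer_hash_request_to_engine(
			hace_dev->crypt_engine_hash, req);
}

/* engine callback: runs for one request at a time, then finalizes it */
static int sketch_do_one_request(struct crypto_engine *engine, void *areq)
{
	struct ahash_request *req = ahash_request_cast(areq);

	/* program the HACE registers and kick off the hash operation here */

	/* on completion the driver finalizes the request (inlined here for
	 * brevity; the real driver does this from its interrupt path):
	 */
	crypto_finalize_hash_request(engine, req, 0);

	return 0;
}
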

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2022-08-11 11:06 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-26 11:34 [PATCH v8 0/5] Add Aspeed crypto driver for hardware acceleration Neal Liu
2022-07-26 11:34 ` Neal Liu
2022-07-26 11:34 ` [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver Neal Liu
2022-07-26 11:34   ` Neal Liu
2022-08-08  2:53   ` liulongfang
2022-08-08  2:53     ` liulongfang
2022-08-08  9:30     ` Neal Liu
2022-08-08  9:30       ` Neal Liu
2022-08-08 11:49       ` liulongfang
2022-08-08 11:49         ` liulongfang
2022-08-09  7:39         ` Neal Liu
2022-08-09  7:39           ` Neal Liu
2022-08-09 12:39           ` liulongfang
2022-08-09 12:39             ` liulongfang
2022-08-11  3:31             ` Neal Liu
2022-08-11  3:31               ` Neal Liu
2022-07-26 11:34 ` [PATCH v8 2/5] dt-bindings: clock: Add AST2500/AST2600 HACE reset definition Neal Liu
2022-07-26 11:34   ` Neal Liu
2022-07-26 11:34 ` [PATCH v8 3/5] ARM: dts: aspeed: Add HACE device controller node Neal Liu
2022-07-26 11:34   ` Neal Liu
2022-07-26 11:34 ` [PATCH v8 4/5] dt-bindings: crypto: add documentation for aspeed hace Neal Liu
2022-07-26 11:34   ` Neal Liu
2022-07-26 11:34 ` [PATCH v8 5/5] crypto: aspeed: add HACE crypto driver Neal Liu
2022-07-26 11:34   ` Neal Liu
2022-07-26 20:41   ` Dhananjay Phadke
2022-07-26 20:41     ` Dhananjay Phadke
2022-07-27  5:31     ` Neal Liu
2022-07-27  5:31       ` Neal Liu
2022-07-28  6:18       ` Dhananjay Phadke
2022-07-28  6:18         ` Dhananjay Phadke
2022-07-28  8:58         ` Neal Liu
2022-07-28  8:58           ` Neal Liu
