Regarding crypto and HW acceleration...
Here is my "original" configuration with decent crypto performance, WRT1900ACS:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              9389.98k    32306.22k    94967.38k   167082.33k   219785.90k
sha1             6629.30k    18147.99k    37536.71k    51512.66k    58307.11k
des cbc         24228.04k    25278.26k    25511.73k    25512.96k    25650.00k
des ede3         9253.49k     9468.46k     9539.55k     9567.04k     9530.03k
aes-128 cbc     34693.14k    39816.11k    41683.80k    42328.19k    42254.34k
aes-192 cbc     31545.96k    34799.21k    36341.35k    36898.37k    35665.81k
aes-256 cbc     29286.46k    30779.94k    32155.47k    32528.73k    32691.29k
sha256           7234.63k    16872.51k    30481.92k    38228.65k    41390.15k
sha512           1761.70k     7015.13k    10239.23k    13994.67k    15676.76k
                     sign     verify     sign/s   verify/s
rsa 2048 bits   0.034198s  0.000903s       29.2     1107.8
                     sign     verify     sign/s   verify/s
dsa 2048 bits   0.009249s  0.011301s      108.1       88.5
Here is the version appending -engine cryptodev to the openssl speed command:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              4134.82k    15713.71k    58738.53k   222495.03k  1735304.53k
sha1             4461.88k    16943.40k    76576.12k   317807.59k  1961984.00k
des cbc         24098.35k    25154.11k    25470.29k    25617.12k    25542.66k
des ede3         9271.67k     9479.66k     9536.73k     9578.15k     9609.22k
aes-128 cbc     35480.47k    39827.33k    41341.14k    42358.78k    42677.85k
aes-192 cbc     31795.15k    35301.67k    36718.08k    37066.41k    37014.19k
aes-256 cbc     28420.27k    31476.22k    32688.37k    32890.54k    32975.53k
sha256           7230.65k    16954.37k    30372.69k    38017.37k    40896.98k
sha512           1718.56k     6809.92k    10142.99k    14130.85k    15767.55k
                     sign     verify     sign/s   verify/s
rsa 2048 bits   0.029467s  0.000842s       33.9     1187.2
                     sign     verify     sign/s   verify/s
dsa 2048 bits   0.007646s  0.009442s      130.8      105.9
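For anyone wanting to repeat the comparison, the two runs above differ only in the -engine flag. Here is a minimal Python sketch that builds both command lines. The helper name speed_command is my own invention, and the optional -elapsed flag (which makes openssl speed divide by wall-clock time rather than CPU time) is a suggestion, not something used for the numbers in this post:

```python
import shlex


def speed_command(algorithms, engine=None, elapsed=False):
    """Build the argv list for an `openssl speed` run (hypothetical helper)."""
    argv = ["openssl", "speed"]
    if elapsed:
        # Report throughput per wall-clock second instead of per CPU second.
        argv.append("-elapsed")
    if engine:
        # Route supported algorithms through the named engine.
        argv += ["-engine", engine]
    argv += list(algorithms)
    return argv


if __name__ == "__main__":
    algos = ["md5", "sha1"]
    runs = (
        speed_command(algos),                                    # software only
        speed_command(algos, engine="cryptodev"),                # as in this post
        speed_command(algos, engine="cryptodev", elapsed=True),  # wall-clock variant
    )
    for argv in runs:
        print(shlex.join(argv))
```

Pass the resulting list to subprocess.run on the router (or just copy the printed line) to run the benchmark itself.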
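Interestingly, you'll notice that the md5 and sha1 figures change a lot (much lower at small block sizes, far higher at 1024 and 8192 bytes) while everything else is essentially unchanged -- I believe this is the hardware offload capability. Here are the initial lines of output from the openssl speed run: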
engine "cryptodev" set.
Doing md5 for 3s on 16 size blocks: 188651 md5's in 0.73s
Doing md5 for 3s on 64 size blocks: 184145 md5's in 0.75s
Doing md5 for 3s on 256 size blocks: 174380 md5's in 0.76s
Doing md5 for 3s on 1024 size blocks: 143405 md5's in 0.66s
Doing md5 for 3s on 8192 size blocks: 50839 md5's in 0.24s
Doing sha1 for 3s on 16 size blocks: 189630 sha1's in 0.68s
Doing sha1 for 3s on 64 size blocks: 182671 sha1's in 0.69s
Doing sha1 for 3s on 256 size blocks: 164519 sha1's in 0.55s
Doing sha1 for 3s on 1024 size blocks: 121040 sha1's in 0.39s
Doing sha1 for 3s on 8192 size blocks: 33530 sha1's in 0.14s
Doing sha256 for 3s on 16 size blocks: 1351227 sha256's in 2.99s
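I think what this is saying, taking the first line as an example, is that of the 3s run only 0.73s of CPU time is being used -- the HW engine is offloading the rest of the work. Note that the table figures are computed from that CPU time rather than from the 3s of wall-clock time (188651 x 16 bytes / 0.73s = 4134.82k), which is why the large-block md5 and sha1 numbers look so high.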
The CPU usage stays very low during the md5 and sha1 phases of the run, then spikes up to high values, including 100%, for the rest of it. So the HW offload keeps CPU usage low but, at least for small blocks, has lower throughput than the actual software routines.
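To make that concrete, here is a rough Python sketch that takes the "Doing ..." progress lines and works out throughput both ways: per CPU second (what the table reports) and per wall-clock second. It assumes the nominal 3s interval is the real elapsed time, which is only approximately true; the function and field names are mine.

```python
import re

# Matches progress lines such as:
#   Doing md5 for 3s on 16 size blocks: 188651 md5's in 0.73s
LINE_RE = re.compile(
    r"Doing (?P<algo>\S+) for (?P<wall>\d+)s on (?P<size>\d+) size blocks: "
    r"(?P<count>\d+) \S+ in (?P<cpu>\d+(?:\.\d+)?)s"
)


def parse_speed_line(line):
    """Return throughput figures for one progress line, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    total_bytes = int(m["count"]) * int(m["size"])
    wall = float(m["wall"])  # nominal run length, ASSUMED equal to elapsed time
    cpu = float(m["cpu"])    # CPU time as reported by openssl speed
    return {
        "algo": m["algo"],
        "size": int(m["size"]),
        # Bytes per CPU second: the figure shown in the summary table.
        "per_cpu_second": total_bytes / cpu if cpu > 0 else float("inf"),
        # Bytes per wall-clock second: what a caller would actually see.
        "per_wall_second": total_bytes / wall,
        # Fraction of the run during which the CPU was busy.
        "cpu_share": cpu / wall,
    }


if __name__ == "__main__":
    samples = [
        "Doing md5 for 3s on 16 size blocks: 188651 md5's in 0.73s",
        "Doing md5 for 3s on 8192 size blocks: 50839 md5's in 0.24s",
        "Doing sha256 for 3s on 16 size blocks: 1351227 sha256's in 2.99s",
    ]
    for line in samples:
        r = parse_speed_line(line)
        print(f"{r['algo']:>7} {r['size']:>5}B  "
              f"{r['per_cpu_second'] / 1000:12.2f}k per CPU-s  "
              f"{r['per_wall_second'] / 1000:12.2f}k per wall-s  "
              f"CPU {r['cpu_share']:.0%}")
```

Under that assumption, 16-byte md5 through the engine comes out at roughly 1006k per wall-clock second, against 9389.98k for the software run.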
I'm doing all these tests on live systems so there'll be some variability. However, I haven't really seen a case where my CPU is at max so I'm not sure I want to enable the crypto engine.
Similar results on the V1, by the way; I'm not pasting them here to keep an already long post from getting even longer. Throughput on 16-byte md5/sha1 was reduced to 785k and 718k respectively, so much worse than the original.
Also, there's a newer patch addressing all versions of hardware:
https://git.kernel.org/cgit/linux/kerne … 0d34454af7
Here's my version (4.4.6 kernel):
See kernel/git/next/linux-next.git commit 6e3695a741eca510f2517b7834bf790d34454af7
---
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -731,6 +731,7 @@ CONFIG_LOCKUP_DETECTOR=y
 CONFIG_CRYPTO_DEV_TEGRA_AES=y
 CONFIG_CPUFREQ_DT=y
 CONFIG_KEYSTONE_IRQ=y
+CONFIG_CRYPTO_DEV_MARVELL_CESA=m
 CONFIG_CRYPTO_DEV_SUN4I_SS=m
 CONFIG_CRYPTO_DEV_ROCKCHIP=m
 CONFIG_ARM_CRYPTO=y
--- a/arch/arm/configs/mvebu_v5_defconfig
+++ b/arch/arm/configs/mvebu_v5_defconfig
@@ -184,9 +184,9 @@ CONFIG_DEBUG_KERNEL=y
 # CONFIG_DEBUG_PREEMPT is not set
 # CONFIG_FTRACE is not set
 CONFIG_DEBUG_USER=y
+CONFIG_CRYPTO_DEV_MARVELL_CESA=y
 CONFIG_CRYPTO_CBC=m
 CONFIG_CRYPTO_PCBC=m
 # CONFIG_CRYPTO_ANSI_CPRNG is not set
-CONFIG_CRYPTO_DEV_MV_CESA=y
 CONFIG_CRC_CCITT=y
 CONFIG_LIBCRC32C=y
--- a/arch/arm/configs/mvebu_v7_defconfig
+++ b/arch/arm/configs/mvebu_v7_defconfig
@@ -154,3 +154,4 @@ CONFIG_MAGIC_SYSRQ=y
 CONFIG_TIMER_STATS=y
 # CONFIG_DEBUG_BUGVERBOSE is not set
 CONFIG_DEBUG_USER=y
+CONFIG_CRYPTO_DEV_MARVELL_CESA=y
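
To check which of these options a given build actually ended up with, something like the following Python sketch can read the running kernel's config. It assumes the kernel exposes /proc/config.gz (that needs CONFIG_IKCONFIG_PROC, which your build may not have); otherwise feed it the text of your .config instead. The option_state helper is a name I made up.

```python
import gzip
import re


def option_state(config_text, option):
    """Return the value assigned to `option` ('y', 'm', ...), or None if it
    is marked "is not set" or does not appear at all."""
    assign = re.compile(r"^" + re.escape(option) + r"=(.*)$")
    for line in config_text.splitlines():
        line = line.strip()
        m = assign.match(line)
        if m:
            return m.group(1)
        if line == f"# {option} is not set":
            return None
    return None


if __name__ == "__main__":
    options = ("CONFIG_CRYPTO_DEV_MARVELL_CESA", "CONFIG_CRYPTO_DEV_MV_CESA")
    try:
        # ASSUMPTION: the running kernel was built with CONFIG_IKCONFIG_PROC.
        with gzip.open("/proc/config.gz", "rt") as fh:
            text = fh.read()
    except OSError:
        print("/proc/config.gz not readable; pass your .config text instead")
    else:
        for opt in options:
            print(opt, "=", option_state(text, opt))
```

A result of "y" means built in, "m" means built as a module, and None means the option is off or absent in that config.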