| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, gf128mul_x_ble works with pointers to be128, even though it
actually interprets the words as little-endian. Consequently, it uses
cpu_to_le64/le64_to_cpu on fields of type __be64, which is incorrect.
This patch fixes that by changing the function to accept pointers to
le128 and updating all users accordingly.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Reviewd-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch centralizes the XTS key check logic into the service function
xts_check_key which is invoked from the different XTS implementations.
With this, the XTS implementations in ARM, ARM64, PPC and S390 have now
a sanity check for the XTS keys similar to the other arches.
In addition, this service function received a check to ensure that the
key != the tweak key which is mandated by FIPS 140-2 IG A.9. As the
check is not present in the standards defining XTS, it is only enforced
in FIPS mode of the kernel.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
| |
This prefixes all crypto module loading with "crypto-" so we never run
the risk of exposing module auto-loading to userspace via a crypto API,
as demonstrated by Mathias Krause:
https://lkml.org/lkml/2013/3/4/70
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
| |
'u128' currently used for CTR mode is on little-endian 'long long' swapped
and would require extra swap operations by SSE/AVX code. Use of le128
instead of u128 allows IV calculations to be done with vector registers
easier.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
| |
Initialization of cra_list is currently mixed, most ciphers initialize this
field and most shashes do not. Initialization however is not needed at all
since cra_list is initialized/overwritten in __crypto_register_alg() with
list_add(). Therefore perform cleanup to remove all unneeded initializations
of this field in 'arch/x86/crypto/'.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
| |
from glue_helper
Now that shared glue code is available, convert twofish-avx to use it.
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
| |
glue code from glue_helper
Now that shared glue code is available, convert twofish-x86_64-3way to use it.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a x86_64/avx assembler implementation of the Twofish block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the 3way-parallel functions from the twofish-x86_64-3way
module are called. A good performance increase is provided for blocksizes
greater or equal to 128B.
Patch has been tested with tcrypt and automated filesystem tests.
Tcrypt benchmark results:
Intel Core i5-2500 CPU (fam:6, model:42, step:7)
twofish-avx-x86_64 vs. twofish-x86_64-3way
128bit key: (lrw:256bit) (xts:256bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.96x 0.97x 1.00x 0.95x 0.97x 0.97x 0.96x 0.95x 0.95x 0.98x
64B 0.99x 0.99x 1.00x 0.99x 0.98x 0.98x 0.99x 0.98x 0.99x 0.98x
256B 1.20x 1.21x 1.00x 1.19x 1.15x 1.14x 1.19x 1.20x 1.18x 1.19x
1024B 1.29x 1.30x 1.00x 1.28x 1.23x 1.24x 1.26x 1.28x 1.26x 1.27x
8192B 1.31x 1.32x 1.00x 1.31x 1.25x 1.25x 1.28x 1.29x 1.28x 1.30x
256bit key: (lrw:384bit) (xts:512bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.96x 0.96x 1.00x 0.96x 0.97x 0.98x 0.95x 0.95x 0.95x 0.96x
64B 1.00x 0.99x 1.00x 0.98x 0.98x 1.01x 0.98x 0.98x 0.98x 0.98x
256B 1.20x 1.21x 1.00x 1.21x 1.15x 1.15x 1.19x 1.20x 1.18x 1.19x
1024B 1.29x 1.30x 1.00x 1.28x 1.23x 1.23x 1.26x 1.27x 1.26x 1.27x
8192B 1.31x 1.33x 1.00x 1.31x 1.26x 1.26x 1.29x 1.29x 1.28x 1.30x
twofish-avx-x86_64 vs aes-asm (8kB block):
128bit 256bit
ecb-enc 1.19x 1.63x
ecb-dec 1.18x 1.62x
cbc-enc 0.75x 1.03x
cbc-dec 1.23x 1.67x
ctr-enc 1.24x 1.65x
ctr-dec 1.24x 1.65x
lrw-enc 1.15x 1.53x
lrw-dec 1.14x 1.52x
xts-enc 1.16x 1.56x
xts-dec 1.16x 1.56x
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
| |
This caused conflict with camellia-x86_64 when compiled into kernel, same
function names and not static.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
Combine all crypto_alg to be registered and use new crypto_[un]register_algs
functions. Simplifies init/exit code and reduce object size.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
We can never reach the line just after the 'return 0'
statement. Remove it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
Performance of twofish-x86_64-3way on Intel Pentium 4 and Atom is lower than
of twofish-x86_64 module. So blacklist these CPUs.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
Since LRW & XTS are selected by twofish-x86_64-3way, we don't need these
#ifdefs anymore.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch adds XTS support for twofish-x86_64-3way by using xts_crypt(). Patch has
been tested with tcrypt and automated filesystem tests.
Tcrypt benchmarks results (twofish-3way/twofish-asm speed ratios):
Intel Celeron T1600 (fam:6, model:15, step:13):
size xts-enc xts-dec
16B 0.98x 1.00x
64B 1.14x 1.15x
256B 1.23x 1.25x
1024B 1.26x 1.29x
8192B 1.28x 1.30x
AMD Phenom II 1055T (fam:16, model:10):
size xts-enc xts-dec
16B 1.03x 1.03x
64B 1.13x 1.16x
256B 1.20x 1.20x
1024B 1.22x 1.22x
8192B 1.22x 1.21x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch adds LRW support for twofish-x86_64-3way by using lrw_crypt(). Patch has
been tested with tcrypt and automated filesystem tests.
Tcrypt benchmarks results (twofish-3way/twofish-asm speed ratios):
Intel Celeron T1600 (fam:6, model:15, step:13):
size lrw-enc lrw-dec
16B 0.99x 1.00x
64B 1.17x 1.17x
256B 1.26x 1.27x
1024B 1.30x 1.31x
8192B 1.31x 1.32x
AMD Phenom II 1055T (fam:16, model:10):
size lrw-enc lrw-dec
16B 1.06x 1.01x
64B 1.08x 1.14x
256B 1.19x 1.20x
1024B 1.21x 1.22x
8192B 1.23x 1.24x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
| |
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Patch adds 3-way parallel x86_64 assembly implementation of twofish as new
module. New assembler functions crypt data in three blocks chunks, improving
cipher performance on out-of-order CPUs.
Patch has been tested with tcrypt and automated filesystem tests.
Summary of the tcrypt benchmarks:
Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB)
encrypt: 1.3x speed
decrypt: 1.3x speed
Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC)
encrypt: 1.07x speed
decrypt: 1.4x speed
Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR)
encrypt: 1.4x speed
Twofish 3-way-asm vs AES asm (128bit 8kb block ECB)
encrypt: 1.0x speed
decrypt: 1.0x speed
Twofish 3-way-asm vs AES asm (128bit 8kb block CBC)
encrypt: 0.84x speed
decrypt: 1.09x speed
Twofish 3-way-asm vs AES asm (128bit 8kb block CTR)
encrypt: 1.15x speed
Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt
Tests were run on:
vendor_id : AuthenticAMD
cpu family : 16
model : 10
model name : AMD Phenom(tm) II X6 1055T Processor
Also userspace test were run on:
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU E7330 @ 2.40GHz
stepping : 11
Userspace test results:
Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II:
encrypt: 1.27x
decrypt: 1.25x
Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330:
encrypt: 1.36x
decrypt: 1.36x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|