path: root/compiler-rt/lib/scudo/scudo_crc32.cpp
Commit message (Author, Age, Files, Lines)
* Update the file headers across all of the LLVM projects in the monorepo (Chandler Carruth, 2019-01-19, 1 file, -4/+3)
to reflect the new license.

We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.

llvm-svn: 351636
* [scudo] CRC32 optimizations (Kostya Kortchinsky, 2017-05-09, 1 file, -18/+1)
Summary:
This change optimizes several aspects of the checksum used for chunk headers.

First, there is no point in checking the weak symbol `computeHardwareCRC32` every time: it will either be there or not when we start, so check it once during initialization and set the checksum type accordingly.

Then, the load of `HashAlgorithm` for SSE versions (and the ARM equivalent) was not optimized out, although it was unnecessary. So I reshuffled that part of the code, which duplicates a tiny bit of code but results in much cleaner assembly (and faster, as we avoid an extraneous load and some calls).

The following code is the checksum at the end of `scudoMalloc` for x86_64 with full SSE 4.2, before:
```
mov     rax, 0FFFFFFFFFFFFFFh
shl     r10, 38h
mov     edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and     r14, rax
lea     rsi, [r13-10h]
movzx   eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or      r14, r10
mov     rbx, r14
xor     bx, bx
call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov     rsi, rbx
mov     edi, eax
call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov     r14w, ax
mov     rax, r13
mov     [r13-10h], r14
```
After:
```
mov     rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea     rcx, [rbx-10h]
mov     rdx, 0FFFFFFFFFFFFFFh
and     r14, rdx
shl     r9, 38h
or      r14, r9
crc32   eax, rcx
mov     rdx, r14
xor     dx, dx
mov     eax, eax
crc32   eax, rdx
mov     r14w, ax
mov     rax, rbx
mov     [rbx-10h], r14
```

Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
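The "check once during initialization" idea from the commit above can be sketched roughly as follows. This is not the Scudo source: every name is hypothetical, and a nullable function pointer stands in for the weak symbol so that the sketch links on any platform. The software routine is a plain bitwise CRC32 chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical names throughout; a sketch, not the Scudo implementation.
enum ChecksumType : uint8_t { CRC32Software = 0, CRC32Hardware = 1 };

using CRC32Fn = uint32_t (*)(uint32_t Crc, uint64_t Data);

// Portable bitwise CRC32 (reflected polynomial 0xEDB88320) over the
// eight bytes of Data, lowest byte first.
uint32_t softwareCRC32(uint32_t Crc, uint64_t Data) {
  for (int I = 0; I < 8; ++I) {
    Crc ^= static_cast<uint32_t>((Data >> (8 * I)) & 0xff);
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Decided once at startup, then only read on the hot path.
static ChecksumType HashAlgorithm = CRC32Software;
// Assumption: stands in for the weak symbol, null when it is absent.
static CRC32Fn HardwareCRC32 = nullptr;

// Run once during initialization: record whether a hardware routine exists.
void initChecksum(CRC32Fn MaybeHardware) {
  HardwareCRC32 = MaybeHardware;
  HashAlgorithm = MaybeHardware ? CRC32Hardware : CRC32Software;
}

// Hot path: branches on the cached type instead of re-testing the symbol.
uint32_t computeChecksum(uint32_t Crc, uint64_t Data) {
  if (HashAlgorithm == CRC32Hardware)
    return HardwareCRC32(Crc, Data);
  return softwareCRC32(Crc, Data);
}
```

With a real weak symbol, `initChecksum` would test the symbol's address instead of taking a parameter; the rest of the structure would stay the same.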
* [scudo] Refactor of CRC32 and ARM runtime CRC32 detection (Kostya Kortchinsky, 2017-01-18, 1 file, -14/+3)
Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added via a check of the AT_HWCAP auxiliary vector entry.

Following Michal's suggestions in D28417, the CRC32 code has been further changed and looks better now. When compiled with full relro (which is strongly suggested to benefit from additional hardening), the weak symbol for computeHardwareCRC32 is read-only and the generated assembly is fairly clean and straightforward. As suggested, an additional optimization is to skip the runtime check if SSE 4.2 has been enabled globally, as opposed to only for scudo_crc32.cpp.

scudo_crc32.h has no purpose anymore and was removed.

Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek
Reviewed By: rengolin, mgorny
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28574
llvm-svn: 292409
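An auxiliary-vector check of the kind described in the commit above might look like the sketch below on AArch64 Linux. The function name is hypothetical, the sketch assumes the relevant entry is `AT_HWCAP` read through glibc's `getauxval`, and the fallback value for `HWCAP_CRC32` is the AArch64 Linux bit, supplied only when the system headers do not define it. 32-bit ARM reports the feature differently and is not covered here.

```cpp
#include <cstdint>
#if defined(__linux__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#endif

// Assumption: AArch64 Linux bit for the CRC32 extension, used only when
// the platform headers do not provide the macro.
#ifndef HWCAP_CRC32
#define HWCAP_CRC32 (1 << 7)
#endif

// Hypothetical helper: reports whether the kernel advertises hardware
// CRC32 support for this process. Returns false on other platforms.
bool hasHardwareCRC32() {
#if defined(__linux__) && defined(__aarch64__)
  return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
  return false;
#endif
}
```

The answer cannot change while the process runs, so a caller would normally query it once at startup and cache the result.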
* [scudo] Separate hardware CRC32 routines (Kostya Kortchinsky, 2017-01-10, 1 file, -0/+53)
Summary:
As raised in D28304, enabling SSE 4.2 for the whole Scudo tree leads to the emission of SSE 4.2 instructions everywhere, while the runtime checks only applied to the CRC32 computing function.

This patch separates the CRC32 function taking advantage of the hardware into its own file, and only enables -msse4.2 for that file, if it is detected to be supported by the compiler.

Another consequence of removing SSE 4.2 globally was realizing that memcpy calls were not being optimized, which turned out to be due to the -fno-builtin in SANITIZER_COMMON_CFLAGS. So we now explicitly enable builtins for Scudo.

The resulting assembly looks good, although some CALLs are introduced instead of the CRC32 code being inlined.

Reviewers: kcc, mgorny, alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28417
llvm-svn: 291570
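As a rough illustration of isolating the hardware routine, the sketch below keeps the SSE 4.2 intrinsic inside one function that would live in its own translation unit, the only one built with -msse4.2. The function names are hypothetical and this is not the Scudo code. The fallback is a bitwise CRC32-C, picked here only because it is the polynomial the x86 `crc32` instruction implements, so both paths agree.

```cpp
#include <cstdint>
#if defined(__SSE4_2__) && defined(__x86_64__)
#include <nmmintrin.h>  // _mm_crc32_u64
#endif

// Bitwise CRC32-C (reflected polynomial 0x82F63B78) over the eight bytes
// of Data, lowest byte first, with no initial or final inversion.
uint32_t softwareCRC32C(uint32_t Crc, uint64_t Data) {
  for (int I = 0; I < 8; ++I) {
    Crc ^= static_cast<uint32_t>((Data >> (8 * I)) & 0xff);
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0x82F63B78u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Hypothetical stand-in for the routine kept in its own file. The
// intrinsic is only used when this file is compiled with SSE 4.2
// enabled; otherwise the portable loop is used.
uint32_t computeCRC32(uint32_t Crc, uint64_t Data) {
#if defined(__SSE4_2__) && defined(__x86_64__)
  return static_cast<uint32_t>(_mm_crc32_u64(Crc, Data));
#else
  return softwareCRC32C(Crc, Data);
#endif
}
```

Keeping the intrinsic behind one function boundary means the rest of the tree can be compiled without -msse4.2, at the cost of a call where the instruction would otherwise be inlined, which matches the trade-off the commit message mentions.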