path: root/llvm/lib/Target/X86/X86SpeculativeLoadHardening.cpp
Commit log (each entry: subject, author, date, files changed, lines -removed/+added)
* [X86] Fix uninitialized variable warnings. NFCI. (Simon Pilgrim, 2019-11-04, 1 file, -6/+6)
* [X86] X86SpeculativeLoadHardeningPass::canHardenRegister - fix out of bounds warning. (Simon Pilgrim, 2019-09-05, 1 file, -2/+5)
  Fixes clang static-analyzer warning.
  llvm-svn: 371050
* Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM (Daniel Sanders, 2019-08-15, 1 file, -24/+24)
  Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible).
  Partial reverts in:
    X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
    X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
    X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
    HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
    MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
    PPCFastISel.cpp - No Register::operator-=()
    PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
    MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
  Manual fixups in:
    ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
    HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
    HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
    PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
  Depends on D65919
  Reviewers: arsenm, bogner, craig.topper, RKSimon
  Reviewed By: arsenm
  Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65962
  llvm-svn: 369041
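  A small sketch, assuming LLVM's CodeGen headers, of the kind of change this check makes; the function and operand index are illustrative, but MachineOperand::getReg() does return an llvm::Register, so the unsigned form relies on an implicit conversion:

      #include "llvm/CodeGen/MachineInstr.h"
      #include "llvm/CodeGen/Register.h"

      // Sketch only: the before/after shape of the clang-tidy fix.
      void exampleRegisterUse(llvm::MachineInstr &MI) {
        // Before: the initializer implicitly converts llvm::Register to unsigned.
        unsigned OldStyle = MI.getOperand(0).getReg();
        // After: keep the value in its natural Register type.
        llvm::Register NewStyle = MI.getOperand(0).getReg();
        (void)OldStyle;
        (void)NewStyle;
      }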
* Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC (Daniel Sanders, 2019-08-01, 1 file, -2/+2)
  llvm-svn: 367633
* X86: Clean up pass initialization (Tom Stellard, 2019-06-13, 1 file, -4/+1)
  Summary:
  - Remove redundant initializations from pass constructors that were already being initialized by LLVMInitializeX86Target().
  - Add initialization function for the FPS pass.
  Reviewers: craig.topper
  Reviewed By: craig.topper
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63218
  llvm-svn: 363221
* [X86] Use OR32mi8Locked instead of LOCK_OR32mi8 in emitLockedStackOp. (Craig Topper, 2019-05-15, 1 file, -3/+1)
  They encode the same way, but OR32mi8Locked sets hasUnmodeledSideEffects, which should be stronger than the mayLoad/mayStore on LOCK_OR32mi8. I think this makes sense since we are using it as a fence.
  This also seems to hide the operation from the speculative load hardening pass, so I've reverted r360511.
  llvm-svn: 360747
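  For context, the idempotent locked operation being used as a fence here is roughly "lock or dword ptr [rsp], 0". A rough illustration, assuming GCC/Clang inline asm on x86-64 (this is not the pass's emission code):

      // Sketch only: an idempotent locked read-modify-write of the stack top
      // serializes like a fence without mfence's cost; OR-ing 0 changes nothing.
      static inline void lockedStackFence() {
        __asm__ __volatile__("lock orl $0, (%%rsp)" ::: "memory", "cc");
      }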
* [X86] Add a test case for idempotent atomic operations with speculative load hardening. Fix an additional issue found by the test. (Craig Topper, 2019-05-11, 1 file, -1/+3)
  This test covers the fix from r360475 as well.
  llvm-svn: 360511
* Factor out redzone ABI checks [NFCI] (Philip Reames, 2019-05-10, 1 file, -1/+1)
  As requested in D58632, clean up our red zone detection logic in the X86 backend. The existing X86MachineFunctionInfo flag is used to track whether we *use* the redzone (via a particular optimization?), but there's no common way to check whether the function *has* a red zone.
  I'd appreciate careful review of the uses being updated. I think they are NFC, but a careful eye from someone else would be appreciated.
  Differential Revision: https://reviews.llvm.org/D61799
  llvm-svn: 360479
* [X86] Disable speculative load hardening for operations with an explicit RSP base. (Craig Topper, 2019-05-10, 1 file, -0/+8)
  After D58632, we can create idempotent atomic operations to the top of the stack. This confused speculative load hardening because it expects accesses to have a virtual register base except for the cases it already excluded. This commit adds a new exclusion for this case. I'll try to reduce a test case for this, but this fix was verified to work by the reporter. This should avoid needing to revert D58632.
  llvm-svn: 360475
* [X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand. (Craig Topper, 2019-04-05, 1 file, -2/+2)
  Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.
  Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
  Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon
  Reviewed By: RKSimon
  Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60228
  llvm-svn: 357802
* [X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate. (Craig Topper, 2019-04-05, 1 file, -6/+9)
  Summary: Reorder the condition code enum to match their encodings. Move it to the MC layer so it can be used by the scheduler models.
  This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes.
  Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
  This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now.
  I plan to make similar changes for SETcc and Jcc.
  Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet
  Reviewed By: RKSimon
  Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60041
  llvm-svn: 357800
* Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. (Chandler Carruth, 2019-01-19, 1 file, -4/+3)
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* [SLH] Fix a nasty bug in SLH. (Chandler Carruth, 2018-12-05, 1 file, -1/+3)
  Whenever we effectively take the address of a basic block we need to manually update that basic block to reflect that fact, or later passes such as tail duplication and tail merging can break the invariants of the code. =/ Sadly, there doesn't appear to be any good way of automating this or even writing a reasonable assert to catch it early.
  The change seems trivially and obviously correct, but sadly the only really good test case I have is 1000s of basic blocks. I've tried directly writing a test case that happens to make tail duplication do something that crashes later on, but this appears to require an *amazingly* complex set of conditions that I've not yet reproduced.
  The change is technically covered by the tests because we mark the blocks as having their address taken, but that doesn't really count as properly testing the functionality.
  llvm-svn: 348374
* X86: Consistently declare pass initializers in X86.h; NFC (Matthias Braun, 2018-11-01, 1 file, -6/+0)
  This avoids declaring them twice: in X86TargetMachine.cpp and in the file implementing the pass.
  llvm-svn: 345801
* Revert r345165 "[X86] Bring back the MOV64r0 pseudo instruction" (Craig Topper, 2018-10-31, 1 file, -2/+8)
  Google is reporting regressions on some benchmarks.
  llvm-svn: 345785
* [X86] Bring back the MOV64r0 pseudo instruction (Craig Topper, 2018-10-24, 1 file, -8/+2)
  This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register, similar to what we do for the various XMM/YMM/ZMM zeroing pseudos.
  My main motivation is to enable the spill optimization in foldMemoryOperandImpl, as we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code we're looking at.
  There's definitely some other machine CSE (and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling.
  Differential Revision: https://reviews.llvm.org/D52757
  llvm-svn: 345165
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) (Fangrui Song, 2018-09-27, 1 file, -1/+1)
  Summary: The convenience wrapper in STLExtras is available since rL342102.
  Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
  Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52573
  llvm-svn: 343163
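  A minimal sketch of the change itself, assuming LLVM's ADT headers; the container and comparator are illustrative:

      #include "llvm/ADT/STLExtras.h"
      #include "llvm/ADT/SmallVector.h"

      void sortValues(llvm::SmallVectorImpl<int> &Values) {
        // Old spelling: llvm::sort(Values.begin(), Values.end(), ...);
        // New spelling, using the range-based wrapper from rL342102:
        llvm::sort(Values, [](int A, int B) { return A < B; });
      }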
* [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative Load Hardening. (Chandler Carruth, 2018-09-04, 1 file, -0/+11)
  Wires up the existing pass to work with a proper IR attribute rather than just a hidden/internal flag. The internal flag continues to work for now, but I'll likely remove it soon.
  Most of the churn here is adding the IR attribute. I talked about this with Kristof Beyls and he seemed at least initially OK with this direction. The idea of using a full attribute here is that we *do* expect at least some forms of this for other architectures. There isn't anything *inherently* x86-specific about this technique, just that we only have an implementation for x86 at the moment.
  While we could potentially expose this as a Clang-level attribute as well, that seems like a good question to defer for the moment, as it isn't 100% clear whether that or some other programmer interface (or both?) would be best. We'll defer the programmer interface side of this for now, but at least get to the point where the feature can be enabled without relying on implementation details.
  This also allows us to do something that was really hard before: we can enable *just* the indirect call retpolines when using SLH. For x86, we don't have any other way to mitigate indirect calls. Other architectures may take a different approach of course, and none of this is surfaced to user-level flags.
  Differential Revision: https://reviews.llvm.org/D51157
  llvm-svn: 341363
* [x86/SLH] Teach SLH to harden against the "ret2spec" attack by implementing the proposed mitigation technique described in the original design document. (Chandler Carruth, 2018-09-04, 1 file, -32/+151)
  The idea is to check after calls that the return address used to arrive at that location is in fact the correct address. In the event of a mis-predicted return which reaches a *valid* return but not the *correct* return, this will detect the mismatch much like it would a mispredicted conditional branch.
  This is the last published attack vector that I am aware of in the Spectre v1 space which is not mitigated by SLH+retpolines. However, don't read *too* much into that: this is an area of ongoing research where we expect more issues to be discovered in the future, and it also makes no attempt to mitigate Spectre v4. Still, this is an important completeness bar for SLH.
  The change here is of course delightfully simple. It was predicated on cutting support for post-instruction symbols into LLVM which was not at all simple. Many thanks to Hal Finkel, Reid Kleckner, and Justin Bogner who helped me figure out how to do a bunch of the complex changes involved there.
  Differential Revision: https://reviews.llvm.org/D50837
  llvm-svn: 341358
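  A conceptual model of the post-call check described above, in plain C++ rather than the pass's MachineInstr-building code; the function and variable names are illustrative:

      #include <cstdint>

      // PredState is all-zeros on correctly speculated paths and all-ones when
      // misspeculated. After a call returns, compare the address actually used
      // to return against the call's correct return address; a mismatch (a
      // mispredicted return landing on a valid but wrong return site) poisons
      // the state, just as a mispredicted conditional branch would.
      uint64_t checkReturnAddress(uint64_t ActualRetAddr, uint64_t ExpectedRetAddr,
                                  uint64_t PredState) {
        if (ActualRetAddr != ExpectedRetAddr)
          PredState = ~0ULL; // emitted branchlessly on x86 (cmp + cmov)
        return PredState;
      }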
* [x86/SLH] Teach SLH to harden indirect branches and switches without retpolines. (Chandler Carruth, 2018-09-04, 1 file, -3/+268)
  This implements the core design of tracing the intended target into the target, checking it, and using that to update the predicate state. It takes advantage of a few interesting aspects of SLH to make it a bit easier to implement:
  - We already split critical edges with conditional branches, so we can assume those are gone.
  - We already unfolded any memory access in the indirect branch instruction itself.
  I've left hard errors in place to catch if any of these somewhat subtle invariants get violated.
  There is some code that I can factor out and share with D50837 when it lands, but I didn't want to couple landing the two patches, so I'll do that in a follow-up cleanup commit if alright.
  Factoring out the code to handle different scenarios of materializing an address remains frustratingly hard. In a bunch of cases you want to fold one of the cases into an immediate operand of some other instruction, and you also have both symbols and basic blocks being used, which require different methods on the MI builder (and different operand kinds). Still, I'll take a stab at sharing at least some of this code in a follow-up if I can figure out how.
  Differential Revision: https://reviews.llvm.org/D51083
  llvm-svn: 341356
* [x86] Actually initialize the SLH pass with the x86 backend and use a shorter name ('x86-slh') for the internal flags and pass name. (Chandler Carruth, 2018-08-16, 1 file, -3/+3)
  Without this, you can't use the -stop-after or -stop-before infrastructure. I seem to have just missed this when originally adding the pass.
  The shorter name solves two problems. First, the flag names were ... really long and hard to type/manage. Second, the pass name can't be the exact same as the flag name used to enable this, and there are already some users of that flag name so I'm avoiding changing it unnecessarily.
  llvm-svn: 339836
* [x86/SLH] Extract the logic to trace predicate state through calls to a helper function with a nice overview comment. NFC. (Chandler Carruth, 2018-07-26, 1 file, -19/+39)
  This is a preparatory refactoring toward implementing another component of the mitigation here that was described in the design document but hadn't been implemented yet.
  llvm-svn: 338016
* [x86/SLH] Sink the return hardening into the main block-walk + hardening code. (Chandler Carruth, 2018-07-25, 1 file, -26/+17)
  This consolidates all our hardening calls, and simplifies the code a bit. It seems much more clear to handle all of these together.
  No functionality changed here.
  llvm-svn: 337895
* [x86/SLH] Improve name and comments for the main hardening function. (Chandler Carruth, 2018-07-25, 1 file, -174/+190)
  This function actually does two things: it traces the predicate state through each of the basic blocks in the function (as that isn't directly handled by the SSA updater) *and* it hardens everything necessary in the block as it goes. These need to be done together so that we have the currently active predicate state to use at each point of the hardening.
  However, this also made obvious that the flag to disable actual hardening of loads was flawed -- it also disabled tracing the predicate state across function calls within the body of each block. So this patch sinks this debugging flag test to correctly guard just the hardening of loads.
  Unless load hardening was disabled, no functionality should change with this patch.
  llvm-svn: 337894
* [x86/SLH] Teach the x86 speculative load hardening pass to harden against v1.2 BCBS attacks directly. (Chandler Carruth, 2018-07-25, 1 file, -0/+200)
  Attacks using Spectre v1.2 (a subset of BCBS) are described in the paper here: https://people.csail.mit.edu/vlk/spectre11.pdf
  The core idea is to speculatively store over the address in a vtable, jumptable, or other target of indirect control flow that will be subsequently loaded. Speculative execution after such a store can forward the stored value to subsequent loads, and if called or jumped to, the speculative execution will be steered to this potentially attacker-controlled address.
  Up until now, this could be mitigated by enabling retpolines. However, that is a relatively expensive technique to mitigate this particular flavor, especially because in most cases SLH will have already mitigated this. To fully mitigate this with SLH, we need to do two core things:
  1) Unfold loads from calls and jumps, allowing the loads to be post-load hardened.
  2) Force hardening of incoming registers even if we didn't end up needing to harden the load itself.
  The reason we need to do these two things is that hardening calls and jumps from this particular variant is importantly different from hardening against leak of secret data. Because the "bad" data here isn't a secret, but is in fact speculatively stored by the attacker, it may be loaded from any address, regardless of whether it is read-only memory, mapped memory, or a "hardened" address. The only 100% effective way to harden these instructions is to harden their operand itself. But to the extent possible, we'd like to take advantage of all the other hardening going on; we just need a fallback in case none of that happened to cover the particular input to the control transfer instruction.
  For users of SLH, currently they are paying 2% to 6% performance overhead for retpolines, but this mechanism is expected to be substantially cheaper. However, it is worth reminding folks that this does not mitigate all of the things retpolines do -- most notably, variant #2 is not in *any way* mitigated by this technique. So users of SLH may still want to enable retpolines, and the implementation is carefully designed to gracefully leverage retpolines to avoid the need for further hardening here when they are enabled.
  Differential Revision: https://reviews.llvm.org/D49663
  llvm-svn: 337878
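  A hedged sketch of what "harden their operand itself" means for a control-transfer instruction, with plain C++ standing in for the emitted or instruction; the names are illustrative:

      #include <cstdint>

      // PredState is 0 on the architecturally correct path and all-ones under
      // misspeculation. Merging it into the register holding an indirect call
      // or jump target sends a misspeculated transfer to a poisoned (all-ones)
      // address rather than one the attacker speculatively stored.
      uint64_t hardenBranchTarget(uint64_t Target, uint64_t PredState) {
        return Target | PredState;
      }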
* [x86/SLH] Extract the core register hardening logic to a low-level helper and restructure the post-load hardening to use this. (Chandler Carruth, 2018-07-24, 1 file, -36/+73)
  This isn't as trivial as I would have liked because the post-load hardening used a trick that only works for it, where it swapped in a temporary register to the load rather than replacing anything. However, there is a simple way to do this without that trick that allows this to easily reuse a friendly API for hardening a value in a register. That API will in turn be usable in subsequent patches.
  This also technically changes the position at which we insert the subreg extraction for the predicate state, but that never resulted in an actual instruction and so tests don't change at all.
  llvm-svn: 337825
* [x86/SLH] Tidy up a comment, using doxygen structure and wording it to be more accurate and understandable. (Chandler Carruth, 2018-07-24, 1 file, -5/+7)
  llvm-svn: 337822
* [x86/SLH] Simplify the code for hardening a loaded value. NFC. (Chandler Carruth, 2018-07-24, 1 file, -20/+15)
  This is in preparation for extracting this into a re-usable utility in this code.
  llvm-svn: 337785
* [x86/SLH] Remove complex SHRX-based post-load hardening. (Chandler Carruth, 2018-07-24, 1 file, -73/+10)
  This code was really nasty, had several bugs in it originally, and wasn't carrying its weight. While on Zen we have all 4 ports available for SHRX, on all of the Intel parts with Agner's tables, SHRX can only execute on 2 ports, giving it 1/2 the throughput of OR. Worse, all too often this pattern required two SHRX instructions in a chain, hurting the critical path by a lot.
  Even if we end up needing to save/restore EFLAGS, that is no longer so bad. We pay for a uop to save the flag, but we very likely get fusion when it is used by forming a test/jCC pair or something similar. In practice, I don't expect the SHRX to be a significant savings here, so I'd like to avoid the complex code required. We can always resurrect this if/when someone has a specific performance issue addressed by it.
  llvm-svn: 337781
* [x86/SLH] Fix a bug where we would harden tail calls twice -- once as a call, and then again as a return. (Chandler Carruth, 2018-07-23, 1 file, -1/+5)
  Also added a comment to try and explain better why we would be doing what we're doing when hardening the (non-call) returns.
  llvm-svn: 337673
* [x86/SLH] Rename and comment the main hardening function. NFC. (Chandler Carruth, 2018-07-23, 1 file, -4/+21)
  This provides an overview of the algorithm used to harden specific loads. It also brings our terminology further in line with hardening rather than checking.
  Differential Revision: https://reviews.llvm.org/D49583
  llvm-svn: 337667
* [x86/SLH] Clean up helper naming for return instruction handling and remove dead declaration of a call instruction handling helper. (Chandler Carruth, 2018-07-19, 1 file, -4/+26)
  This moves to the 'harden' terminology that I've been trying to settle on for returns. It also adds a really detailed comment explaining what all we're trying to accomplish with return instructions and why. Hopefully this makes it much more clear what exactly is being "hardened".
  Differential Revision: https://reviews.llvm.org/D49571
  llvm-svn: 337510
* [x86/SLH] Major refactoring of the SLH implementation. (Chandler Carruth, 2018-07-19, 1 file, -174/+204)
  There are two big changes that are intertwined here:
  1) Extracting the tracing of predicate state through the CFG to its own function.
  2) Creating a struct to manage the predicate state used throughout the pass.
  Doing #1 necessitates and motivates the particular approach for #2, as now the predicate management is spread across different functions focused on different aspects of it. A number of simplifications then fell out as a direct consequence.
  I went with an Optional to make it more natural to construct the MachineSSAUpdater object.
  This is probably the single largest outstanding refactoring step I have. Things get a bit more surgical from here. My current goal, beyond generally making this maintainable long-term, is to implement several improvements to how we do interprocedural tracking of predicate state. But I don't want to do that until the predicate state management and tracing is in a reasonably clear state.
  Differential Revision: https://reviews.llvm.org/D49427
  llvm-svn: 337446
* [x86/SLH] Flesh out the data-invariant instruction table a bit based on feedback from Craig. (Chandler Carruth, 2018-07-17, 1 file, -7/+37)
  Summary: The only thing he suggested that I've skipped here is the double-wide multiply instructions. Multiply is an area I'm nervous about there being some hidden data-dependent behavior, and it doesn't seem important for any benchmarks I have, so skipping it and sticking with the minimal multiply support that matches what I know is widely used in existing crypto libraries. We can always add double-wide multiply when we have clarity from vendors about its behavior and guarantees.
  I've tried to at least cover the fundamentals here with tests, although I've not tried to cover every width or permutation. I can add more tests where folks think it would be helpful.
  Reviewers: craig.topper
  Subscribers: sanjoy, mcrosier, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D49413
  llvm-svn: 337308
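  To make "data invariant" concrete, here is a heavily abridged, hypothetical sketch of the shape of such a classification; the enum stands in for real X86:: opcodes, and the particular choices below are illustrative rather than the pass's exact list:

      // Hypothetical opcodes for illustration only.
      enum SketchOpcode { Add64rr, Lea64r, Inc64r, Div64r, Other };

      // True only for instructions whose latency and behavior do not depend on
      // the values of their data operands, so hardening can be sunk past them.
      static bool isDataInvariantSketch(SketchOpcode Op) {
        switch (Op) {
        case Add64rr: // simple ALU op: fixed latency regardless of operands
        case Lea64r:  // address arithmetic, no data-dependent behavior
        case Inc64r:
          return true;
        case Div64r:  // division latency varies with operand values: excluded
        default:
          return false;
        }
      }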
* [x86/SLH] Completely rework how we sink post-load hardening past data invariant instructions to be both more correct and much more powerful. (Chandler Carruth, 2018-07-16, 1 file, -24/+184)
  While testing, I continued to find issues with sinking post-load hardening. Unfortunately, it was amazingly hard to create any useful tests of this because we were mostly sinking across copies and other loading instructions. The fact that we couldn't sink past normal arithmetic was really a big oversight.
  So first, I've ported roughly the same set of instructions from the data invariant loads to also have their non-loading varieties understood to be data invariant. I've also added a few instructions that came up so often it again made testing complicated: inc, dec, and lea.
  With this, I was able to shake out a few nasty bugs in the validity checking. We need to restrict to hardening single-def instructions with defined registers that match a particular form: GPRs that don't have a NOREX constraint directly attached to their register class.
  The (tiny!) test case included catches all of the issues I was seeing (once we can sink the hardening at all) except for the NOREX issue. The only test I have there is horrible. It is large, inexplicable, and doesn't even produce an error unless you try to emit encodings. I can keep looking for a way to test it, but I'm out of ideas really.
  Thanks to Ben for giving me at least a sanity-check review. I'll follow up with Craig to go over this more thoroughly post-commit, but without it SLH crashes everywhere so landing it for now.
  Differential Revision: https://reviews.llvm.org/D49378
  llvm-svn: 337177
* [x86/SLH] Fix a bug where we would try to post-load harden non-GPRs. (Chandler Carruth, 2018-07-16, 1 file, -13/+25)
  Found cases that hit the assert I added. This patch factors the validity checking into a nice helper routine and calls it when deciding to harden post-load, and asserts it when doing so later.
  I've added tests for the various ways of loading a floating point type, as well as loading all vector permutations. Even though many of these go to identical instructions, it seems good to somewhat comprehensively test them.
  I'm confident there will be more fixes needed here; I'll try to add tests each time as I get this predicate adjusted.
  llvm-svn: 337160
* [x86/SLH] Extract another small helper function, add better comments and use better terminology. NFC. (Chandler Carruth, 2018-07-16, 1 file, -23/+34)
  llvm-svn: 337157
* [x86/SLH] Fix an unused variable warning in release builds after r337144. (Chandler Carruth, 2018-07-16, 1 file, -0/+1)
  llvm-svn: 337145
* [x86/SLH] Teach speculative load hardening to correctly harden the indices used by AVX2 and AVX-512 gather instructions. (Chandler Carruth, 2018-07-16, 1 file, -16/+91)
  The index vector is hardened by broadcasting the predicate state into a vector register and then or-ing. We don't even have to worry about EFLAGS here.
  I've added a test for all of the gather intrinsics to make sure that we don't miss one. A particularly interesting creation is the gather prefetch, which needs to be marked as potentially "loading" to get the correct behavior. It's a memory access in many ways, and is actually relevant for SLH. Based on discussion with Craig in review, I've moved it to be `mayLoad` and `mayStore` rather than generic side effects. This matches how we model other prefetch instructions.
  Many thanks to Craig for the review here.
  Differential Revision: https://reviews.llvm.org/D49336
  llvm-svn: 337144
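  A sketch of the same idea using plain AVX2 intrinsics rather than the pass's MachineInstr building; the function and parameter names are illustrative:

      #include <immintrin.h>

      // Broadcast the scalar predicate state (0 or all-ones) into a vector and
      // OR it into the gather's index vector, so a misspeculated gather uses
      // poisoned indices. No EFLAGS are involved, matching the note above.
      __m256i hardenGatherIndices(__m256i Indices, long long PredState) {
        __m256i StateVec = _mm256_set1_epi64x(PredState);
        return _mm256_or_si256(Indices, StateVec);
      }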
* [x86/SLH] Extract one of the bits of logic to its own function. NFC. (Chandler Carruth, 2018-07-15, 1 file, -43/+48)
  This is just a refactoring to start cleaning up the code here and make it more readable and approachable.
  llvm-svn: 337138
* [x86/SLH] Fix an issue where we wouldn't harden any loads if we found no conditions. (Chandler Carruth, 2018-07-14, 1 file, -3/+3)
  This is only valid to do if we're hardening calls and rets with LFENCE, which results in an LFENCE guarding the entire entry block for us.
  llvm-svn: 337089
* [x86/SLH] Add an assert to catch if we ever end up trying to harden post-load a register that isn't valid for use with OR or SHRX. (Chandler Carruth, 2018-07-14, 1 file, -0/+8)
  llvm-svn: 337078
* [X86][SLH] Remove PDEP and PEXT from isDataInvariantLoad (Craig Topper, 2018-07-13, 1 file, -4/+0)
  Ryzen has something like an 18 cycle latency on these based on Agner's data. AMD's own xls is blank. So it seems like there might be something tricky here.
  Agner's data for Intel CPUs indicates these are a single uop there. Probably safest to remove them. We never generate them without an intrinsic so this should be ok.
  Differential Revision: https://reviews.llvm.org/D49315
  llvm-svn: 337067
* [X86][SLH] Add VEX and EVEX conversion instructions to isDataInvariantLoad (Craig Topper, 2018-07-13, 1 file, -13/+19)
  - Drop the intrinsic versions of conversion instructions. These should be handled when we do vectors. They shouldn't show up in scalar code.
  - Add the float<->double conversions which were missing.
  - Add the AVX512 and AVX versions of the conversion instructions, including the unsigned integer conversions unique to AVX512.
  Differential Revision: https://reviews.llvm.org/D49313
  llvm-svn: 337066
* [X86][SLH] Regroup the instructions in isDataInvariantLoad a little. NFC (Craig Topper, 2018-07-13, 1 file, -36/+43)
  - Move BSF/BSR to the same group as TZCNT/LZCNT/POPCNT.
  - Split some of the bit manipulation instructions away from TZCNT/LZCNT/POPCNT. These are things like 'x & (x - 1)' which are composed of a few simple arithmetic operations. These aren't nearly as complicated/surprising as counting bits.
  - Move BEXTR/BZHI into their own group. They aren't like a simple arithmetic op or the bit manipulation instructions. They're more like a shift+and.
  Differential Revision: https://reviews.llvm.org/D49312
  llvm-svn: 337065
* Add parens to silence Wparentheses warning, introduced by 336990 (Erich Keane, 2018-07-13, 1 file, -5/+3)
  llvm-svn: 337002
* [SLH] Introduce a new pass to do Speculative Load Hardening to mitigate Spectre variant #1 for x86. (Chandler Carruth, 2018-07-13, 1 file, -0/+1667)
  There is a lengthy, detailed RFC thread on llvm-dev which discusses the high level issues. High level discussion is probably best there.
  I've split the design document out of this patch and will land it separately once I update it to reflect the latest edits and updates to the Google doc used in the RFC thread.
  This patch is really just an initial step. It isn't quite ready for prime time and is only exposed via debugging flags. It has a few major limitations currently:
  1) It only supports x86-64, and only certain ABIs. Many assumptions are currently hard-coded and need to be factored out of the code here.
  2) It doesn't include any options for more fine-grained control, either of which control flow edges are significant or which loads are important to be hardened.
  3) The code is still quite rough and the testing lighter than I'd like.
  However, this is enough for people to begin using. I have had numerous requests from people to be able to experiment with this patch to understand the trade-offs it presents and how to use it. We would also like to encourage work to similar effect in other toolchains. The ARM folks are actively developing a system based on this for AArch64. We hope to merge this with their efforts when both are far enough along. But we also don't want to block making this available on that effort.
  Many thanks to the *numerous* people who helped along the way here. For this patch in particular, both Eric and Craig did a ton of review to even have confidence in it as an early, rough cut at this functionality.
  Differential Revision: https://reviews.llvm.org/D44824
  llvm-svn: 336990
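  For readers new to the pass, a minimal conceptual model of the core SLH mechanism from the RFC, written as plain C++ (the real pass works on MachineInstrs and keeps the predicate state in a register):

      #include <cstdint>

      struct SLHStateModel {
        // All-zeros on correctly predicted paths, all-ones under misspeculation.
        uint64_t PredState = 0;

        // At a conditional branch, each successor merges in whether the branch
        // condition actually held on the path reaching it; on a mispredicted
        // edge this turns the state into all-ones. Emitted branchlessly (cmov).
        void updateAtBranchEdge(bool ConditionHeldOnThisEdge) {
          if (!ConditionHeldOnThisEdge)
            PredState = ~0ULL;
        }

        // Hardening a load: OR the state into the loaded value (or its address)
        // so data observed under misspeculation is useless to an attacker.
        uint64_t hardenedLoad(const uint64_t *Addr) const {
          return *Addr | PredState;
        }
      };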