| Commit message | Author | Age | Files | Lines |
Fixes clang static-analyzer warning.
llvm-svn: 371050
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
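
A minimal illustration of the rewrite this check performs (the surrounding variable and instruction are hypothetical, though MachineOperand::getReg() does return llvm::Register):

    // Before: the Register returned by getReg() is implicitly cast to unsigned.
    unsigned DstReg = MI.getOperand(0).getReg();

    // After: the strong register type is kept (and llvm:: dropped where possible).
    Register DstReg = MI.getOperand(0).getReg();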
llvm::Register as started by r367614. NFC
llvm-svn: 367633
Summary:
- Remove redundant initialization calls from pass constructors; these passes
are already initialized by LLVMInitializeX86Target() (see the sketch after
this commit message).
- Add initialization function for the FPS pass.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63218
llvm-svn: 363221
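
A sketch of the pattern being consolidated; the pass and flag names here are illustrative, not the exact FPS code:

    // In the pass implementation: register the initializer once.
    INITIALIZE_PASS(X86ExamplePass, "x86-example", "X86 Example Pass",
                    false, false)

    // In the constructor: no initializeX86ExamplePassPass() call is needed,
    // because LLVMInitializeX86Target() already runs the initializers when
    // the target is registered.
    X86ExamplePass::X86ExamplePass() : MachineFunctionPass(ID) {}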
They encode the same way, but OR32mi8Locked has hasUnmodeledSideEffects set,
which should be stronger than the mayLoad/mayStore on LOCK_OR32mi8. I think
this makes sense since we are using it as a fence.
This also seems to hide the operation from the speculative load hardening pass
so I've reverted r360511.
llvm-svn: 360747
hardening. Fix an additional issue found by the test.
This test covers the fix from r360475 as well.
llvm-svn: 360511
As requested in D58632, clean up our red zone detection logic in the X86 backend. The existing X86MachineFunctionInfo flag is used to track whether we *use* the red zone (via a particular optimization?), but there's no common way to check whether the function *has* a red zone.
I'd appreciate careful review of the uses being updated. I think they are NFC, but a careful eye from someone else would be appreciated.
Differential Revision: https://reviews.llvm.org/D61799
llvm-svn: 360479
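
A hedged sketch of the kind of common query this cleanup centralizes (the helper and member names are assumptions based on D61799, not necessarily the exact code that landed):

    // The SysV x86-64 ABI guarantees leaf functions a 128-byte area below
    // %rsp that signal and interrupt handlers will not clobber; Win64 has
    // no such area.
    bool X86FrameLowering::has128ByteRedZone(const MachineFunction &MF) const {
      const Function &F = MF.getFunction();
      bool IsWin64CC = STI.isCallingConvWin64(F.getCallingConv());
      return Is64Bit && !IsWin64CC && !F.hasFnAttribute(Attribute::NoRedZone);
    }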
base.
After D58632, we can create idempotent atomic operations to the top of stack.
This confused speculative load hardening because it expects accesses to have
a virtual register base except for the cases it already excluded.
This commit adds a new exclusion for this case. I'll try to reduce a test case
for this, but this fix was verified to work by the reporter. This should avoid
needing to revert D58632.
llvm-svn: 360475
single instructions that store the condition code as an operand.
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.
Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon
Reviewed By: RKSimon
Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60228
llvm-svn: 357802
single instructions that store the condition code as an immediate.
Summary:
Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models.
This avoids needing an isel pattern for each condition code. And it removes
translation switches for converting between CMOV instructions and condition
codes.
Now the printer, encoder and disassembler take care of converting the immediate.
We use InstAliases to handle the assembly matching. But we print using the
asm string in the instruction definition. The instruction itself is marked
IsCodeGenOnly=1 to hide it from the assembly parser.
This does complicate the scheduler models a little since we can't assign the
A and BE instructions to a separate class now.
I plan to make similar changes for SETcc and Jcc.
Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet
Reviewed By: RKSimon
Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60041
llvm-svn: 357800
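
A small sketch of what consumers look like after this change (the operand index is an assumption; X86::CondCode now lives in the MC layer per the summary):

    // The condition is read off the instruction as an immediate instead of
    // being recovered from a per-condition opcode via a translation switch.
    unsigned CondOpIdx = MI.getNumOperands() - 1; // typically the last operand
    X86::CondCode CC =
        static_cast<X86::CondCode>(MI.getOperand(CondOpIdx).getImm());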
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Whenever we effectively take the address of a basic block we need to
manually update that basic block to reflect that fact or later passes
such as tail duplication and tail merging can break the invariants of
the code. =/ Sadly, there doesn't appear to be any good way of
automating this or even writing a reasonable assert to catch it early.
The change seems trivially and obviously correct, but sadly the only
really good test case I have is 1000s of basic blocks. I've tried
directly writing a test case that happens to make tail duplication do
something that crashes later on, but this appears to require an
*amazingly* complex set of conditions that I've not yet reproduced.
The change is technically covered by the tests because we mark the
blocks as having their address taken, but that doesn't really count as
properly testing the functionality.
llvm-svn: 348374
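
The manual update the text describes is a one-line call on the affected block (API name as it existed at the time):

    // Mark the block so tail duplication and tail merging preserve its
    // identity once its address has been materialized.
    TargetMBB->setHasAddressTaken();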
This avoids declaring them twice: in X86TargetMachine.cpp and the file
implementing the pass.
llvm-svn: 345801
Google is reporting regressions on some benchmarks.
llvm-svn: 345785
This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos.
My main motivation is to enable the spill optimization in foldMemoryOperandImpl. We were seeing code that repeatedly did "xor eax, eax; store eax;" to spill several registers, with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code we're looking at.
There are definitely some other MachineCSE (and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling.
Differential Revision: https://reviews.llvm.org/D52757
llvm-svn: 345165
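
A hedged sketch of the post-RA rewrite described above (BuildMI and the opcodes are real; the expansion site and operand states are simplified):

    // Rewrite %reg64 = MOV64r0 into a 32-bit zero idiom whose implicit def
    // covers the full 64-bit register, mirroring the XMM/YMM/ZMM zero pseudos.
    Register Reg32 = TRI->getSubReg(DstReg64, X86::sub_32bit);
    BuildMI(MBB, MI, MI.getDebugLoc(), TII->get(X86::XOR32rr), Reg32)
        .addReg(Reg32, RegState::Undef)
        .addReg(Reg32, RegState::Undef)
        .addReg(DstReg64, RegState::ImplicitDefine);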
Summary: The convenience wrapper in STLExtras is available since rL342102.
Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D52573
llvm-svn: 343163
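
Typical use of the wrapper (the range chosen here is just an example):

    // Before: explicit iterator comparison.
    bool NoPreds = MBB.pred_begin() == MBB.pred_end();

    // After: the STLExtras convenience wrapper from rL342102.
    bool NoPreds2 = llvm::empty(MBB.predecessors());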
Load Hardening.
Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.
Most of the churn here is adding the IR attribute. I talked about this with
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.
While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.
This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.
Differential Revision: https://reviews.llvm.org/D51157
llvm-svn: 341363
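
The gating this enables looks roughly like the following (the attribute enum value is the one this change introduces; the exact call site and flag name are assumptions):

    // Run SLH when the function's IR attribute requests it, rather than
    // only under the hidden internal flag.
    const Function &F = MF.getFunction();
    bool EnableSLH = F.hasFnAttribute(Attribute::SpeculativeLoadHardening) ||
                     EnableSLHFlag; // hypothetical name for the cl::opt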
implementing the proposed mitigation technique described in the original
design document.
The idea is to check after calls that the return address used to arrive
at that location is in fact the correct address. In the event of
a mis-predicted return which reaches a *valid* return but not the
*correct* return, this will detect the mismatch much like it would
a mispredicted conditional branch.
This is the last published attack vector that I am aware of in the
Spectre v1 space which is not mitigated by SLH+retpolines. However,
don't read *too* much into that: this is an area of ongoing research
where we expect more issues to be discovered in the future, and it also
makes no attempt to mitigate Spectre v4. Still, this is an important
completeness bar for SLH.
The change here is of course delightfully simple. It was predicated on
cutting support for post-instruction symbols into LLVM which was not at
all simple. Many thanks to Hal Finkel, Reid Kleckner, and Justin Bogner
who helped me figure out how to do a bunch of the complex changes
involved there.
Differential Revision: https://reviews.llvm.org/D50837
llvm-svn: 341358
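
A conceptual model of the check in plain C++ (illustrative only; the real implementation rewrites MachineInstrs using post-instruction symbols):

    #include <cstdint>

    // After each call, fold "did we return to the correct address?" into
    // the predicate state, exactly as a mis-predicted conditional branch
    // would be handled.
    uint64_t checkReturnAddress(uint64_t ActualRetAddr,
                                uint64_t ExpectedRetAddr,
                                uint64_t PredState) {
      return ActualRetAddr != ExpectedRetAddr ? ~0ULL : PredState;
    }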
retpolines.
This implements the core design of tracing the intended target into the
target, checking it, and using that to update the predicate state. It
takes advantage of a few interesting aspects of SLH to make it a bit
easier to implement:
- We already split critical edges with conditional branches, so we can
assume those are gone.
- We already unfolded any memory access in the indirect branch
instruction itself.
I've left hard errors in place to catch if any of these somewhat subtle
invariants get violated.
There is some code that I can factor out and share with D50837 when it
lands, but I didn't want to couple landing the two patches, so I'll do
that in a follow-up cleanup commit if alright.
Factoring out the code to handle different scenarios of materializing an
address remains frustratingly hard. In a bunch of cases you want to fold
one of the cases into an immediate operand of some other instruction,
and you also have both symbols and basic blocks being used which require
different methods on the MI builder (and different operand kinds).
Still, I'll take a stab at sharing at least some of this code in
a follow-up if I can figure out how.
Differential Revision: https://reviews.llvm.org/D51083
llvm-svn: 341356
a shorter name ('x86-slh') for the internal flags and pass name.
Without this, you can't use the -stop-after or -stop-before
infrastructure. I seem to have just missed this when originally adding
the pass.
The shorter name solves two problems. First, the flag names were ...
really long and hard to type/manage. Second, the pass name can't be the
exact same as the flag name used to enable this, and there are already
some users of that flag name so I'm avoiding changing it unnecessarily.
llvm-svn: 339836
a helper function with a nice overview comment. NFC.
This is a preparatory refactoring toward implementing another component of
the mitigation here that was described in the design document but hadn't been
implemented yet.
llvm-svn: 338016
code.
This consolidates all our hardening calls, and simplifies the code
a bit. It seems much more clear to handle all of these together.
No functionality changed here.
llvm-svn: 337895
This function actually does two things: it traces the predicate state
through each of the basic blocks in the function (as that isn't directly
handled by the SSA updater) *and* it hardens everything necessary in the
block as it goes. These need to be done together so that we have the
currently active predicate state to use at each point of the hardening.
However, this also made obvious that the flag to disable actual
hardening of loads was flawed -- it also disabled tracing the predicate
state across function calls within the body of each block. So this patch
sinks this debugging flag test to correctly guard just the hardening of
loads.
Unless load hardening was disabled, no functionality should change with
this patch.
llvm-svn: 337894
against v1.2 BCBS attacks directly.
Attacks using spectre v1.2 (a subset of BCBS) are described in the paper
here:
https://people.csail.mit.edu/vlk/spectre11.pdf
The core idea is to speculatively store over the address in a vtable,
jumptable, or other target of indirect control flow that will be
subsequently loaded. Speculative execution after such a store can
forward the stored value to subsequent loads, and if called or jumped
to, the speculative execution will be steered to this potentially
attacker controlled address.
Up until now, this could be mitigated by enabling retpolines. However, that
is a relatively expensive technique to mitigate this particular flavor,
especially because in most cases SLH will have already mitigated it. To fully
mitigate this with SLH, we need to do two core things:
1) Unfold loads from calls and jumps, allowing the loads to be post-load
hardened.
2) Force hardening of incoming registers even if we didn't end up
needing to harden the load itself.
The reason we need to do these two things is because hardening calls and
jumps from this particular variant is importantly different from
hardening against leak of secret data. Because the "bad" data here isn't
a secret, but in fact speculatively stored by the attacker, it may be
loaded from any address, regardless of whether it is read-only memory,
mapped memory, or a "hardened" address. The only 100% effective way to
harden these instructions is to harden their operand itself. But to
the extent possible, we'd like to take advantage of all the other
hardening going on; we just need a fallback in case none of that
happened to cover the particular input to the control transfer
instruction.
Users of SLH currently pay 2% to 6% performance overhead
for retpolines, but this mechanism is expected to be substantially
cheaper. However, it is worth reminding folks that this does not
mitigate all of the things retpolines do -- most notably, variant #2 is
not in *any way* mitigated by this technique. So users of SLH may still
want to enable retpolines, and the implementation is carefully designed to
gracefully leverage retpolines to avoid the need for further hardening
here when they are enabled.
Differential Revision: https://reviews.llvm.org/D49663
llvm-svn: 337878
helper and restructure the post-load hardening to use this.
This isn't as trivial as I would have liked because the post-load
hardening used a trick that only works for it where it swapped in
a temporary register to the load rather than replacing anything.
However, there is a simple way to do this without that trick that allows
this to easily reuse a friendly API for hardening a value in a register.
That API will in turn be usable in subsequent patches.
This also technically changes the position at which we insert the subreg
extraction for the predicate state, but that never resulted in an actual
instruction and so tests don't change at all.
llvm-svn: 337825
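
A hedged sketch of the extracted helper's shape (the pass does grow a helper along these lines; the body here is simplified and assumes the pass's MRI/TII members):

    // OR the current predicate state into the value so that it is poisoned
    // whenever we are mis-speculating.
    Register hardenValueInRegister(Register Reg, MachineBasicBlock &MBB,
                                   MachineBasicBlock::iterator InsertPt,
                                   Register StateReg) {
      Register NewReg = MRI->createVirtualRegister(&X86::GR64RegClass);
      BuildMI(MBB, InsertPt, DebugLoc(), TII->get(X86::OR64rr), NewReg)
          .addReg(Reg)
          .addReg(StateReg);
      return NewReg;
    }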
be more accurate and understandable.
llvm-svn: 337822
This is in preparation for extracting this into a re-usable utility in
this code.
llvm-svn: 337785
This code was really nasty, had several bugs in it originally, and
wasn't carrying its weight. While on Zen we have all 4 ports available
for SHRX, on all of the Intel parts with Agner's tables, SHRX can only
execute on 2 ports, giving it 1/2 the throughput of OR.
Worse, all too often this pattern required two SHRX instructions in
a chain, hurting the critical path by a lot.
Even if we end up needing to save/restore EFLAGS, that is no longer so
bad. We pay for a uop to save the flag, but we very likely get fusion
when it is used by forming a test/jCC pair or something similar. In
practice, I don't expect the SHRX to be a significant savings here, so
I'd like to avoid the complex code required. We can always resurrect
this if/when someone has a specific performance issue addressed by it.
llvm-svn: 337781
a call, and then again as a return.
Also added a comment to try and explain better why we would be doing
what we're doing when hardening the (non-call) returns.
llvm-svn: 337673
This provides an overview of the algorithm used to harden specific
loads. It also brings our terminology further in line with
"hardening" rather than "checking".
Differential Revision: https://reviews.llvm.org/D49583
llvm-svn: 337667
remove dead declaration of a call instruction handling helper.
This moves to the 'harden' terminology that I've been trying to settle
on for returns. It also adds a really detailed comment explaining what
all we're trying to accomplish with return instructions and why.
Hopefully this makes it much more clear what exactly is being
"hardened".
Differential Revision: https://reviews.llvm.org/D49571
llvm-svn: 337510
changes that are intertwined here:
1) Extracting the tracing of predicate state through the CFG to its own
function.
2) Creating a struct to manage the predicate state used throughout the
pass.
Doing #1 necessitates and motivates the particular approach for #2 as
now the predicate management is spread across different functions
focused on different aspects of it. A number of simplifications then
fell out as a direct consequence.
I went with an Optional to make it more natural to construct the
MachineSSAUpdater object.
This is probably the single largest outstanding refactoring step I have.
Things get a bit more surgical from here. My current goal, beyond
generally making this maintainable long-term, is to implement several
improvements to how we do interprocedural tracking of predicate state.
But I don't want to do that until the predicate state management and
tracing is in reasonably clear state.
Differential Revision: https://reviews.llvm.org/D49427
llvm-svn: 337446
feedback from Craig.
Summary:
The only thing he suggested that I've skipped here is the double-wide
multiply instructions. Multiply is an area I'm nervous about there being
some hidden data-dependent behavior, and it doesn't seem important for
any benchmarks I have, so I'm skipping it and sticking with the minimal
multiply support that matches what I know is widely used in existing
crypto libraries. We can always add double-wide multiply when we have
clarity from vendors about its behavior and guarantees.
I've tried to at least cover the fundamentals here with tests, although
I've not tried to cover every width or permutation. I can add more tests
where folks think it would be helpful.
Reviewers: craig.topper
Subscribers: sanjoy, mcrosier, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D49413
llvm-svn: 337308
invariant instructions to be both more correct and much more powerful.
While testing, I continued to find issues with sinking post-load
hardening. Unfortunately, it was amazingly hard to create any useful
tests of this because we were mostly sinking across copies and other
loading instructions. The fact that we couldn't sink past normal
arithmetic was really a big oversight.
So first, I've ported roughly the same set of instructions from the data
invariant loads to also have their non-loading varieties understood to
be data invariant. I've also added a few instructions that came up so
often it again made testing complicated: inc, dec, and lea.
With this, I was able to shake out a few nasty bugs in the validity
checking. We need to restrict to hardening single-def instructions with
defined registers that match a particular form: GPRs that don't have
a NOREX constraint directly attached to their register class.
The (tiny!) test case included catches all of the issues I was seeing
(once we can sink the hardening at all) except for the NOREX issue. The
only test I have there is horrible. It is large, inexplicable, and
doesn't even produce an error unless you try to emit encodings. I can
keep looking for a way to test it, but I'm out of ideas really.
Thanks to Ben for giving me at least a sanity-check review. I'll follow
up with Craig to go over this more thoroughly post-commit, but without
it SLH crashes everywhere so landing it for now.
Differential Revision: https://reviews.llvm.org/D49378
llvm-svn: 337177
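
A hedged sketch of the validity predicate described above (shape only; the NOREX helper is hypothetical and the real check compares register classes directly):

    bool canHardenRegister(Register Reg) {
      const TargetRegisterClass *RC = MRI->getRegClass(Reg);
      unsigned RegBytes = TRI->getRegSizeInBits(*RC) / 8;
      // Only GPRs up to 64 bits can be hardened directly with OR/SHRX.
      if (RegBytes > 8)
        return false;
      // Classes with a NOREX constraint must be rejected because the
      // hardening instructions may require a REX prefix.
      return !isNoRexClass(RC); // hypothetical helper
    }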
Found cases that hit the assert I added. This patch factors the validity
checking into a nice helper routine and calls it when deciding to harden
post-load, and asserts it when doing so later.
I've added tests for the various ways of loading a floating point type,
as well as loading all vector permutations. Even though many of these go
to identical instructions, it seems good to somewhat comprehensively
test them.
I'm confident there will be more fixes needed here, I'll try to add
tests each time as I get this predicate adjusted.
llvm-svn: 337160
use better terminology. NFC.
llvm-svn: 337157
r337144.
llvm-svn: 337145
indices used by AVX2 and AVX-512 gather instructions.
The index vector is hardened by broadcasting the predicate state
into a vector register and then or-ing. We don't even have to worry
about EFLAGS here.
I've added a test for all of the gather intrinsics to make sure that we
don't miss one. A particularly interesting creation is the gather
prefetch, which needs to be marked as potentially "loading" to get the
correct behavior. It's a memory access in many ways, and is actually
relevant for SLH. Based on discussion with Craig in review, I've moved
it to be `mayLoad` and `mayStore` rather than generic side effects. This
matches how we model other prefetch instructions.
Many thanks to Craig for the review here.
Differential Revision: https://reviews.llvm.org/D49336
llvm-svn: 337144
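
A hedged sketch of the index hardening (assuming surrounding pass context for MBB, MI, Loc, StateReg, and IdxReg; the opcodes shown are the AVX2 256-bit flavor, and the pass picks them per vector width):

    // Move the 64-bit predicate state into a vector register, broadcast
    // it, and OR it into the gather's index vector. No EFLAGS involved.
    Register TmpVec = MRI->createVirtualRegister(&X86::VR128RegClass);
    BuildMI(MBB, MI, Loc, TII->get(X86::MOV64toPQIrr), TmpVec)
        .addReg(StateReg);
    Register StateVec = MRI->createVirtualRegister(&X86::VR256RegClass);
    BuildMI(MBB, MI, Loc, TII->get(X86::VPBROADCASTQYrr), StateVec)
        .addReg(TmpVec);
    Register HardenedIdx = MRI->createVirtualRegister(&X86::VR256RegClass);
    BuildMI(MBB, MI, Loc, TII->get(X86::VPORYrr), HardenedIdx)
        .addReg(StateVec)
        .addReg(IdxReg);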
This is just a refactoring to start cleaning up the code here and make
it more readable and approachable.
llvm-svn: 337138
no conditions.
This is only valid to do if we're hardening calls and rets with LFENCE,
which results in an LFENCE guarding the entire entry block for us.
llvm-svn: 337089
post-load a register that isn't valid for use with OR or SHRX.
llvm-svn: 337078
Ryzen has something like an 18 cycle latency on these based on Agner's data. AMD's own xls is blank. So it seems like there might be something tricky here.
Agner's data for Intel CPUs indicates these are a single uop there.
Probably safest to remove them. We never generate them without an intrinsic so this should be ok.
Differential Revision: https://reviews.llvm.org/D49315
llvm-svn: 337067
-Drop the intrinsic versions of conversion instructions. These should be handled when we do vectors. They shouldn't show up in scalar code.
-Add the float<->double conversions which were missing.
-Add the AVX512 and AVX version of the conversion instructions including the unsigned integer conversions unique to AVX512
Differential Revision: https://reviews.llvm.org/D49313
llvm-svn: 337066
-Move BSF/BSR to the same group as TZCNT/LZCNT/POPCNT.
-Split some of the bit manipulation instructions away from TZCNT/LZCNT/POPCNT. These are things like 'x & (x - 1)' (see the sketch after this list) which are composed of a few simple arithmetic operations. These aren't nearly as complicated/surprising as counting bits.
-Move BEXTR/BZHI into their own group. They aren't like a simple arithmetic op or the bit manipulation instructions. They're more like a shift+and.
Differential Revision: https://reviews.llvm.org/D49312
llvm-svn: 337065
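
For concreteness, the 'x & (x - 1)' idiom mentioned in the list above:

    // Clears the lowest set bit; lowers to BLSR when BMI is available.
    unsigned clearLowestSetBit(unsigned X) { return X & (X - 1); }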
llvm-svn: 337002
Spectre variant #1 for x86.
There is a lengthy, detailed RFC thread on llvm-dev which discusses the
high level issues. High level discussion is probably best there.
I've split the design document out of this patch and will land it
separately once I update it to reflect the latest edits and updates to
the Google doc used in the RFC thread.
This patch is really just an initial step. It isn't quite ready for
prime time and is only exposed via debugging flags. It has three major
limitations currently:
1) It only supports x86-64, and only certain ABIs. Many assumptions are
currently hard-coded and need to be factored out of the code here.
2) It doesn't include any options for more fine-grained control, either
of which control flow edges are significant or which loads are
important to be hardened.
3) The code is still quite rough and the testing lighter than I'd like.
However, this is enough for people to begin using. I have had numerous
requests from people to be able to experiment with this patch to
understand the trade-offs it presents and how to use it. We would also
like to encourage work to similar effect in other toolchains.
The ARM folks are actively developing a system based on this for
AArch64. We hope to merge this with their efforts when both are far
enough along. But we also don't want to block making this available on
that effort.
Many thanks to the *numerous* people who helped along the way here. For
this patch in particular, both Eric and Craig did a ton of review to
even have confidence in it as an early, rough cut at this functionality.
Differential Revision: https://reviews.llvm.org/D44824
llvm-svn: 336990
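
A conceptual model of the core hardening transform in plain C++ (illustrative only; the pass performs these rewrites on MachineInstrs):

    #include <cstdint>

    // PredState is all-ones once mis-speculation has been detected and
    // all-zeros on the architecturally correct path; OR-ing it into the
    // pointer makes a mis-speculated load read from a non-address.
    uint64_t hardenedLoad(const uint64_t *Ptr, uint64_t PredState) {
      auto Poisoned = reinterpret_cast<const uint64_t *>(
          reinterpret_cast<uintptr_t>(Ptr) | PredState);
      return *Poisoned;
    }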
|