summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86
Commit message (Collapse)AuthorAgeFilesLines
* Merging r347431:Tom Stellard2018-11-291-8/+25
| | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r347431 | rnk | 2018-11-21 14:01:10 -0800 (Wed, 21 Nov 2018) | 12 lines [mingw] Use unmangled name after the $ in the section name GCC does it this way, and we have to be consistent. This includes stdcall and fastcall functions with suffixes. I confirmed that a fastcall function named "foo" ends up in ".text$foo", not ".text$@foo@8". Based on a patch by Andrew Yohn! Fixes PR39218. Differential Revision: https://reviews.llvm.org/D54762 ------------------------------------------------------------------------ llvm-svn: 347931
* Merging r343373:Tom Stellard2018-10-191-11/+18
| | | | | | | | | | | | ------------------------------------------------------------------------ r343373 | rksimon | 2018-09-29 06:25:22 -0700 (Sat, 29 Sep 2018) | 3 lines [X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets The shift amount might have peeked through a extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats. ------------------------------------------------------------------------ llvm-svn: 344810
* Merging r343443:Tom Stellard2018-10-191-0/+48
| | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r343443 | ctopper | 2018-10-01 00:08:41 -0700 (Mon, 01 Oct 2018) | 9 lines [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers. We can only copy between a k-register and a GR32/GR64 register. This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure. This probably isn't the best fix, and we should probably figure out how to handle this correctly. Fixes PR38803. ------------------------------------------------------------------------ llvm-svn: 344804
* Merging r341512:Hans Wennborg2018-09-061-2/+2
| | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r341512 | ctopper | 2018-09-06 04:03:14 +0200 (Thu, 06 Sep 2018) | 7 lines [X86][Assembler] Allow %eip as a register in 32-bit mode for .cfi directives. This basically reverts a change made in r336217, but improves the text of the error message for not allowing IP-relative addressing in 32-bit mode. Fixes PR38826. Patch by Iain Sandoe. ------------------------------------------------------------------------ llvm-svn: 341530
* Merging r340641:Hans Wennborg2018-08-272-0/+44
| | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r340641 | stefanp | 2018-08-24 21:38:29 +0200 (Fri, 24 Aug 2018) | 9 lines [Exception Handling] Unwind tables are required for all functions that have an EH personality. This patch is for defect: https://bugs.llvm.org/show_bug.cgi?id=32611 Functions may require unwind tables even if they are marked with the attribute nounwind. Any function with an EH personality may require an unwind table. Differential Revision: https://reviews.llvm.org/D50987 ------------------------------------------------------------------------ llvm-svn: 340731
* Merging r340303:Hans Wennborg2018-08-211-0/+110
| | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r340303 | ctopper | 2018-08-21 19:15:33 +0200 (Tue, 21 Aug 2018) | 9 lines [BypassSlowDivision] Teach bypass slow division not to interfere with div by constant where constants have been constant hoisted, but not moved from their basic block DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block. it will be hoisted, but not leave the block. Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all. This fixes the case from PR38649. Differential Revision: https://reviews.llvm.org/D51000 ------------------------------------------------------------------------ llvm-svn: 340359
* Merging r339945:Hans Wennborg2018-08-172-3/+317
| | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r339945 | ctopper | 2018-08-16 23:54:02 +0200 (Thu, 16 Aug 2018) | 9 lines [X86] In EFLAGS copy pass, don't emit EXTRACT_SUBREG instructions since we're after peephole Normally the peephole pass converts EXTRACT_SUBREG to COPY instructions. But we're after peephole so we can't rely on it to clean these up. To fix this, the eflags pass now emits a COPY with a subreg input. I also noticed that in 32-bit mode we need to constrain the input to the copy to ensure the subreg is valid. Otherwise we'll fail verify-machineinstrs Differential Revision: https://reviews.llvm.org/D50656 ------------------------------------------------------------------------ llvm-svn: 339999
* Merging r339536:Hans Wennborg2018-08-161-12/+47
| | | | | | | | | | | | ------------------------------------------------------------------------ r339536 | ctopper | 2018-08-13 08:53:49 +0200 (Mon, 13 Aug 2018) | 3 lines [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the fp_to_fp16 in case the result type isn't a scalar integer. This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary. ------------------------------------------------------------------------ llvm-svn: 339857
* Merging r339535:Hans Wennborg2018-08-161-0/+19
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r339535 | ctopper | 2018-08-13 08:53:47 +0200 (Mon, 13 Aug 2018) | 5 lines [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, make sure the output type is scalar. For vectors, use a store and load of temporary. Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid. This is basically the opposite case of the root cause of PR38533. ------------------------------------------------------------------------ llvm-svn: 339856
* Merging r339533:Hans Wennborg2018-08-161-0/+11
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r339533 | ctopper | 2018-08-13 07:26:49 +0200 (Mon, 13 Aug 2018) | 5 lines [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the fp16_to_fp in case the input type isn't an i16. The bitcast can be further legalized as needed. Fixes PR38533. ------------------------------------------------------------------------ llvm-svn: 339855
* Merging r338915:Hans Wennborg2018-08-071-0/+59
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r338915 | ctopper | 2018-08-03 22:14:18 +0200 (Fri, 03 Aug 2018) | 5 lines [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. ------------------------------------------------------------------------ llvm-svn: 339106
* Merging r338599:Hans Wennborg2018-08-031-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r338599 | vlad.tsyrklevich | 2018-08-01 19:44:37 +0200 (Wed, 01 Aug 2018) | 16 lines [X86] FastISel fall back on !absolute_symbol GVs Summary: D25878, which added support for !absolute_symbol for normal X86 ISel, did not add support for materializing references to absolute symbols for X86 FastISel. This causes build failures because FastISel generates PC-relative relocations for absolute symbols. Fall back to normal ISel for references to !absolute_symbol GVs. Fix for PR38200. Reviewers: pcc, craig.topper Reviewed By: pcc Subscribers: hiraditya, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D50116 ------------------------------------------------------------------------ llvm-svn: 338847
* [X86] Adding more test patterns for lea-opt (PR37939)Jatin Bhateja2018-08-011-0/+151
| | | | | | | | Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50128 llvm-svn: 338483
* [x86] Fix a really subtle miscompile due to a somewhat glaring bug inChandler Carruth2018-08-011-0/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EFLAGS copy lowering. If you have a branch of LLVM, you may want to cherrypick this. It is extremely unlikely to hit this case empirically, but it will likely manifest as an "impossible" branch being taken somewhere, and will be ... very hard to debug. Hitting this requires complex conditions living across complex control flow combined with some interesting memory (non-stack) initialized with the results of a comparison. Also, because you have to arrange for an EFLAGS copy to be in *just* the right place, almost anything you do to the code will hide the bug. I was unable to reduce anything remotely resembling a "good" test case from the place where I hit it, and so instead I have constructed synthetic MIR testing that directly exercises the bug in question (as well as the good behavior for completeness). The issue is that we would mistakenly assume any SETcc with a valid condition and an initial operand that was a register and a virtual register at that to be a register *defining* SETcc... It isn't though.... This would in turn cause us to test some other bizarre register, typically the base pointer of some memory. Now, testing this register and using that to branch on doesn't make any sense. It even fails the machine verifier (if you are running it) due to the wrong register class. But it will make it through LLVM, assemble, and it *looks* fine... But wow do you get a very unsual and surprising branch taken in your actual code. The fix is to actually check what kind of SETcc instruction we're dealing with. Because there are a bunch of them, I just test the may-store bit in the instruction. I've also added an assert for sanity that ensure we are, in fact, *defining* the register operand. =D llvm-svn: 338481
* [x86/slh] Add unwind info to several tests to make it more obvious thatChandler Carruth2018-08-011-12/+48
| | | | | | | | | | | we aren't incorrectly generating any of it when doing SLH. There was a bug that only occured with SLH that very much looked like it could be caused by bad unwind info, and so this was a prime suspect. Turns out that everything is fine, but this way we'll *see* if we end up, for example, putting things we shouldn't inside the prolog. llvm-svn: 338480
* [X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151)Simon Pilgrim2018-07-314-504/+235
| | | | | | | | | | As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering. Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW. Differential Revision: https://reviews.llvm.org/D49562 llvm-svn: 338407
* [X86] Add pattern matching for PMADDUBSWCraig Topper2018-07-311-1788/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Similar to D49636, but for PMADDUBSW. This instruction has the additional complexity that the addition of the two products saturates to 16-bits rather than wrapping around. And one operand is treated as signed and the other as unsigned. A C example that triggers this pattern ``` static const int N = 128; int8_t A[2*N]; uint8_t B[2*N]; int16_t C[N]; void foo() { for (int i = 0; i != N; ++i) C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767); } ``` Reviewers: RKSimon, spatel, zvi Reviewed By: RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49829 llvm-svn: 338402
* [X86] Add test cases that could use PMADDUBSW.Craig Topper2018-07-311-0/+2233
| | | | llvm-svn: 338401
* [X86] Preserve more liveness information in emitStackProbeInlineFrancis Visoiu Mistrih2018-07-312-7/+24
| | | | | | | | | | | | | | | | This commit fixes two issues with the liveness information after the call: 1) The code always spills RCX and RDX if InProlog == true, which results in an use of undefined phys reg. 2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to the basic blocks that use them, therefore they are seen undefined. https://llvm.org/PR38376 Differential Revision: https://reviews.llvm.org/D50020 llvm-svn: 338400
* [X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont.Craig Topper2018-07-311-1/+0
| | | | | | In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should we only check isSLM. This resulted in Goldmont going down the Bonnell path. llvm-svn: 338342
* [DAGCombiner] transform sub-of-shifted-signbit to addSanjay Patel2018-07-301-19/+17
| | | | | | | | | | | | | | | | This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317
* [DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a ↵David Bolvansky2018-07-302-111/+63
| | | | | | | | | | | | | | | | | | | rotate can be formed Summary: Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op. Patch by: sameconrad (Sam Conrad) Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar Reviewed By: lebedev.ri Subscribers: efriedma, kparzysz, llvm-commits Differential Revision: https://reviews.llvm.org/D47681 llvm-svn: 338270
* [X86] Regenerate NOBMI/BMI combine-select tests.Simon Pilgrim2018-07-301-34/+38
| | | | | | Test cleanup for D38128 llvm-svn: 338265
* [X86] Regenerate PKU test to merge 32/64-bit rdpkru checksSimon Pilgrim2018-07-301-11/+5
| | | | | | Test cleanup for D38128 llvm-svn: 338264
* [X86] Regenerate fast-isel tests.Simon Pilgrim2018-07-303-48/+20
| | | | | | Test cleanup for D38128 llvm-svn: 338262
* [MachineOutliner][X86] Use TAILJMPd64 instead of JMP_1 for TailCall constructionFrancis Visoiu Mistrih2018-07-301-1/+1
| | | | | | | | | | | | | | | | | | The machine verifier asserts with: Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542. It calls analyzeBranch which tries to call getMBB if the opcode is JMP_1, but in this case we do: JMP_1 @OUTLINED_FUNCTION I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used with brtarget8. Differential Revision: https://reviews.llvm.org/D49299 llvm-svn: 338237
* [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)Craig Topper2018-07-287-136/+136
| | | | | | | | This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
* [X86] Add support expanding multiplies by constant where the constant is ↵Craig Topper2018-07-273-0/+294
| | | | | | | | -3/-5/-9 multplied by a power of 2. These can be replaced with an LEA, a shift, and a negate. This seems to match what gcc and icc would do. llvm-svn: 338174
* [AArch64, PowerPC, x86] add more signbit math tests; NFCSanjay Patel2018-07-271-5/+29
| | | | | | | | The tests with a constant sub operand were added with rL338143, but the potential transform doesn't have that requirement, so adding more tests with variable operands. llvm-svn: 338150
* [AArch64, PowerPC, x86] add more signbit math tests; NFCSanjay Patel2018-07-271-0/+25
| | | | llvm-svn: 338143
* [DAGCombiner] fold 'not' with signbit mathSanjay Patel2018-07-271-27/+16
| | | | | | | | | | | | | | | | | | | This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132
* [x86] add more tests for signbit math; NFCSanjay Patel2018-07-271-0/+84
| | | | llvm-svn: 338131
* [X86] Remove an unnecessary 'if' that prevented treating INT64_MAX and ↵Craig Topper2018-07-271-2/+27
| | | | | | | | -INT64_MAX as power of 2 minus 1 in the multiply expansion code. Not sure why they were being explicitly excluded, but I believe all the math inside the if works. I changed the absolute value to be uint64_t instead of int64_t so INT64_MIN+1 wouldn't be signed wrap. llvm-svn: 338101
* [X86] Add matching for another pattern of PMADDWD.Craig Topper2018-07-271-0/+370
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the pattern you get from the loop vectorizer for something like this int16_t A[1024]; int16_t B[1024]; int32_t C[512]; void pmaddwd() { for (int i = 0; i != 512; ++i) C[i] = (A[2*i]*B[2*i]) + (A[2*i+1]*B[2*i+1]); } In this case we will have (add (mul (build_vector), (build_vector)), (mul (build_vector), (build_vector))). This is different than the pattern we currently match which has the build_vectors between an add and a single multiply. I'm not sure what C code would get you that pattern. Reviewers: RKSimon, spatel, zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49636 llvm-svn: 338097
* [X86] When removing sign extends from gather/scatter indices, make sure we ↵Craig Topper2018-07-271-0/+51
| | | | | | | | handle UpdateNodeOperands finding an existing node to CSE with. If this happens the operands aren't updated and the existing node is returned. Make sure we pass this existing node up to the DAG combiner so that a proper replacement happens. Otherwise we get stuck in an infinite loop with an unoptimized node. llvm-svn: 338090
* [SelectionDAGBuilder] Add masked loads to PendingLoads rather than calling ↵Craig Topper2018-07-263-26/+23
| | | | | | | | | | DAG.setRoot. Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load. This patch instead adds the masked load to PendingLoads so that the root doesn't get update until a store or scatter or something happens.. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization. llvm-svn: 338085
* [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle ule,ugt CondCodes.Roman Lebedev2018-07-261-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A follow-up for D49266 / rL337166. At least one of these cases is more canonical, so we really do have to handle it. https://godbolt.org/g/pkzP3X https://rise4fun.com/Alive/pQyhZZ We won't get to these cases with I1 being -1, as that will be constant-folded to true or false. I'm also not sure we actually hit the 'ule' case, but i think the worst think that could happen is that being dead code. Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49497 llvm-svn: 338044
* Revert "[COFF] Use comdat shared constants for MinGW as well"Martin Storsjo2018-07-261-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r337951. While that kind of shared constant generally works fine in a MinGW setting, it broke some cases of inline assembly that worked before: $ cat const-asm.c int MULH(int a, int b) { int rt, dummy; __asm__ ( "imull %3" :"=d"(rt), "=a"(dummy) :"a"(a), "rm"(b) ); return rt; } int func(int a) { return MULH(a, 1); } $ clang -target x86_64-win32-gnu -c const-asm.c -O2 const-asm.c:4:9: error: invalid variant '00000001' "imull %3" ^ <inline asm>:1:15: note: instantiated into assembly here imull __real@00000001(%rip) ^ A similar error is produced for i686 as well. The same test with a target of x86_64-win32-msvc or i686-win32-msvc works fine. llvm-svn: 338018
* [X86] Don't use CombineTo to skip adding new nodes to the DAGCombiner ↵Craig Topper2018-07-261-9/+9
| | | | | | | | | | | | worklist in combineMul. I'm not sure if this was trying to avoid optimizing the new nodes further or what. Or maybe to prevent a cycle if something tried to reform the multiply? But I don't think its a reliable way to do that. If the user of the expanded multiply is visited by the DAGCombiner after this conversion happens, the DAGCombiner will check its operands, see that they haven't been visited by the DAGCombiner before and it will then add the first node to the worklist. This process will repeat until all the new nodes are visited. So this seems like an unreliable prevention at best. So this patch just returns the new nodes like any other combine. If this starts causing problems we can try to add target specific nodes or something to more directly prevent optimizations. Now that we handle the combine normally, we can combine any negates the mul expansion creates into their users since those will be visited now. llvm-svn: 338007
* [SelectionDAG] try to convert funnel shift directly to rotate if legalSanjay Patel2018-07-251-4/+4
| | | | | | | | | | | If the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. This sidesteps the issue of custom lowering for rotates raised in PR38243: https://bugs.llvm.org/show_bug.cgi?id=38243 ...by only dealing with legal operations. llvm-svn: 337966
* [COFF] Use comdat shared constants for MinGW as wellMartin Storsjo2018-07-251-9/+1
| | | | | | | | | | | | | | | | | GNU binutils tools have no problems with this kind of shared constants, provided that we actually hook it up completely in AsmPrinter and produce a global symbol. This effectively reverts SVN r335918 by hooking the rest of it up properly. This feature was implemented originally in SVN r213006, with no reason for why it can't be used for MinGW other than the fact that GCC doesn't do it while MSVC does. Differential Revision: https://reviews.llvm.org/D49646 llvm-svn: 337951
* Fix corruption of result number in LegalizeVectorOps.cppUlrich Weigand2018-07-251-14/+14
| | | | | | | | | | | | When VectorLegalizer::LegalizeOp creates a new SDValue after iterating over its arguments, we need to refer to the same result number of the new node that the original value used. Reviewed by: cameron.mcinally Differential Revision: https://reviews.llvm.org/D49805 llvm-svn: 337939
* [X86] Autogenerate complete checks and fix a failure introduced in r337875.Craig Topper2018-07-251-57/+178
| | | | llvm-svn: 337889
* [x86/SLH] Teach the x86 speculative load hardening pass to hardenChandler Carruth2018-07-251-16/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | against v1.2 BCBS attacks directly. Attacks using spectre v1.2 (a subset of BCBS) are described in the paper here: https://people.csail.mit.edu/vlk/spectre11.pdf The core idea is to speculatively store over the address in a vtable, jumptable, or other target of indirect control flow that will be subsequently loaded. Speculative execution after such a store can forward the stored value to subsequent loads, and if called or jumped to, the speculative execution will be steered to this potentially attacker controlled address. Up until now, this could be mitigated by enableing retpolines. However, that is a relatively expensive technique to mitigate this particular flavor. Especially because in most cases SLH will have already mitigated this. To fully mitigate this with SLH, we need to do two core things: 1) Unfold loads from calls and jumps, allowing the loads to be post-load hardened. 2) Force hardening of incoming registers even if we didn't end up needing to harden the load itself. The reason we need to do these two things is because hardening calls and jumps from this particular variant is importantly different from hardening against leak of secret data. Because the "bad" data here isn't a secret, but in fact speculatively stored by the attacker, it may be loaded from any address, regardless of whether it is read-only memory, mapped memory, or a "hardened" address. The only 100% effective way to harden these instructions is to harden the their operand itself. But to the extent possible, we'd like to take advantage of all the other hardening going on, we just need a fallback in case none of that happened to cover the particular input to the control transfer instruction. For users of SLH, currently they are paing 2% to 6% performance overhead for retpolines, but this mechanism is expected to be substantially cheaper. However, it is worth reminding folks that this does not mitigate all of the things retpolines do -- most notably, variant #2 is not in *any way* mitigated by this technique. So users of SLH may still want to enable retpolines, and the implementation is carefuly designed to gracefully leverage retpolines to avoid the need for further hardening here when they are enabled. Differential Revision: https://reviews.llvm.org/D49663 llvm-svn: 337878
* [X86] Use a shift plus an lea for multiplying by a constant that is a power ↵Craig Topper2018-07-253-22/+154
| | | | | | | | of 2 plus 2/4/8. The LEA allows us to combine an add and the multiply by 2/4/8 together so we just need a shift for the larger power of 2. llvm-svn: 337875
* [X86] Expand mul by pow2 + 2 using a shift and two adds similar to what we ↵Craig Topper2018-07-253-0/+147
| | | | | | do for pow2 - 2. llvm-svn: 337874
* [X86] Use a two lea sequence for multiply by 37, 41, and 73.Craig Topper2018-07-244-59/+105
| | | | | | These fit a pattern used by 11, 21, and 19. llvm-svn: 337871
* [X86] Add test cases for multiply by 37, 41, and 73.Craig Topper2018-07-243-0/+330
| | | | | | These can all be handled with 2 LEAs similar to what we do for 11, 19, 21. llvm-svn: 337870
* [X86] Change multiply by 26 to use two multiplies by 5 and an add instead of ↵Craig Topper2018-07-244-30/+34
| | | | | | | | multiply by 3 and 9 and a subtract. Same number of operations, but ending in an add is friendlier due to it being commutable. llvm-svn: 337869
* [X86] When expanding a multiply by a negative of one less than a power of 2, ↵Craig Topper2018-07-241-12/+11
| | | | | | | | | | like 31, don't generate a negate of a subtract that we'll never optimize. We generated a subtract for the power of 2 minus one then negated the result. The negate can be optimized away by swapping the subtract operands, but DAG combine doesn't know how to do that and we don't add any of the new nodes to the worklist anyway. This patch makes use explicitly emit the swapped subtract. llvm-svn: 337858
OpenPOWER on IntegriCloud