bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Merging r343373:	Tom Stellard	2018-10-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r343373 \| rksimon \| 2018-09-29 06:25:22 -0700 (Sat, 29 Sep 2018) \| 3 lines [X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets The shift amount might have peeked through an extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats. ------------------------------------------------------------------------ llvm-svn: 344810
*	Merging r343428:	Tom Stellard	2018-10-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r343428 \| ctopper \| 2018-09-30 16:43:30 -0700 (Sun, 30 Sep 2018) \| 3 lines [X86] Change an llvm_unreachable to a report_fatal_error so the optimizer will stop making us reach the other report_fatal_error in this function. There's a conditional report_fatal_error just above this llvm_unreachable. The optimizer when seeing the unreachable removes the conditional and just makes any other error trigger the existing report_fatal_error. ------------------------------------------------------------------------ llvm-svn: 344805
*	Merging r343443:	Tom Stellard	2018-10-19	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r343443 \| ctopper \| 2018-10-01 00:08:41 -0700 (Mon, 01 Oct 2018) \| 9 lines [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers. We can only copy between a k-register and a GR32/GR64 register. This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure. This probably isn't the best fix, and we should probably figure out how to handle this correctly. Fixes PR38803. ------------------------------------------------------------------------ llvm-svn: 344804
*	Merging r341512:	Hans Wennborg	2018-09-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r341512 \| ctopper \| 2018-09-06 04:03:14 +0200 (Thu, 06 Sep 2018) \| 7 lines [X86][Assembler] Allow %eip as a register in 32-bit mode for .cfi directives. This basically reverts a change made in r336217, but improves the text of the error message for not allowing IP-relative addressing in 32-bit mode. Fixes PR38826. Patch by Iain Sandoe. ------------------------------------------------------------------------ llvm-svn: 341530
*	Merging r339895 and r339896:	Hans Wennborg	2018-08-21	2	-7/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r339895 \| niravd \| 2018-08-16 18:31:14 +0200 (Thu, 16 Aug 2018) \| 13 lines [MC][X86] Enhance X86 Register expression handling to more closely match GCC. Allow the comparison of x86 registers in the evaluation of assembler directives. This generalizes and simplifies the extension from r334022 to catch another case found in the Linux kernel. Reviewers: rnk, void Reviewed By: rnk Subscribers: hiraditya, nickdesaulniers, llvm-commits Differential Revision: https://reviews.llvm.org/D50795 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r339896 \| d0k \| 2018-08-16 18:50:23 +0200 (Thu, 16 Aug 2018) \| 1 line [MC] Remove unused variable ------------------------------------------------------------------------ llvm-svn: 340329
*	Merging r339945:	Hans Wennborg	2018-08-17	1	-6/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r339945 \| ctopper \| 2018-08-16 23:54:02 +0200 (Thu, 16 Aug 2018) \| 9 lines [X86] In EFLAGS copy pass, don't emit EXTRACT_SUBREG instructions since we're after peephole Normally the peephole pass converts EXTRACT_SUBREG to COPY instructions. But we're after peephole so we can't rely on it to clean these up. To fix this, the eflags pass now emits a COPY with a subreg input. I also noticed that in 32-bit mode we need to constrain the input to the copy to ensure the subreg is valid. Otherwise we'll fail verify-machineinstrs Differential Revision: https://reviews.llvm.org/D50656 ------------------------------------------------------------------------ llvm-svn: 339999
*	Merging r338599:	Hans Wennborg	2018-08-03	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r338599 \| vlad.tsyrklevich \| 2018-08-01 19:44:37 +0200 (Wed, 01 Aug 2018) \| 16 lines [X86] FastISel fall back on !absolute_symbol GVs Summary: D25878, which added support for !absolute_symbol for normal X86 ISel, did not add support for materializing references to absolute symbols for X86 FastISel. This causes build failures because FastISel generates PC-relative relocations for absolute symbols. Fall back to normal ISel for references to !absolute_symbol GVs. Fix for PR38200. Reviewers: pcc, craig.topper Reviewed By: pcc Subscribers: hiraditya, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D50116 ------------------------------------------------------------------------ llvm-svn: 338847
*	[X86] Use isNullConstant helper. NFCI.	Simon Pilgrim	2018-08-01	1	-2/+1
\| \| \| \|	llvm-svn: 338530
*	[X86] Use isNullConstant helper. NFCI.	Simon Pilgrim	2018-08-01	1	-2/+1
\| \| \| \|	llvm-svn: 338516
*	[X86] Improved sched models for X86 BT*rr instructions.	Andrew V. Tischenko	2018-08-01	11	-48/+18
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D49243 llvm-svn: 338507
*	[X86] When looking for (CMOV C-1, (ADD (CTTZ X), C), (X != 0)) -> (ADD (CMOV ↵	Craig Topper	2018-08-01	1	-27/+26
\| \| \| \| \| \| \| \| \| \|	(CTTZ X), -1, (X != 0)), C), make sure we really have a compare with 0. It's not strictly required by the transform of the cmov and the add, but it makes sure we restrict it to the cases we know we want to match. While there canonicalize the operand order of the cmov to simplify the matching and emitting code. llvm-svn: 338492
*	[x86] Fix a really subtle miscompile due to a somewhat glaring bug in	Chandler Carruth	2018-08-01	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EFLAGS copy lowering. If you have a branch of LLVM, you may want to cherrypick this. It is extremely unlikely to hit this case empirically, but it will likely manifest as an "impossible" branch being taken somewhere, and will be ... very hard to debug. Hitting this requires complex conditions living across complex control flow combined with some interesting memory (non-stack) initialized with the results of a comparison. Also, because you have to arrange for an EFLAGS copy to be in just the right place, almost anything you do to the code will hide the bug. I was unable to reduce anything remotely resembling a "good" test case from the place where I hit it, and so instead I have constructed synthetic MIR testing that directly exercises the bug in question (as well as the good behavior for completeness). The issue is that we would mistakenly assume any SETcc with a valid condition and an initial operand that was a register and a virtual register at that to be a register defining SETcc... It isn't though.... This would in turn cause us to test some other bizarre register, typically the base pointer of some memory. Now, testing this register and using that to branch on doesn't make any sense. It even fails the machine verifier (if you are running it) due to the wrong register class. But it will make it through LLVM, assemble, and it looks fine... But wow do you get a very unusual and surprising branch taken in your actual code. The fix is to actually check what kind of SETcc instruction we're dealing with. Because there are a bunch of them, I just test the may-store bit in the instruction. I've also added an assert for sanity that ensures we are, in fact, defining the register operand. =D llvm-svn: 338481
*	[X86] WriteBSWAP sched classes are reg-reg only.	Simon Pilgrim	2018-07-31	10	-20/+20
\| \| \| \| \| \| \|	Don't declare them as X86SchedWritePair when the folded class will never be used. Note: MOVBE (load/store endian conversion) instructions tend to have a very different behaviour to BSWAP. llvm-svn: 338412
*	[X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151)	Simon Pilgrim	2018-07-31	1	-0/+18
\| \| \| \| \| \| \| \| \| \|	As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering. Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW. Differential Revision: https://reviews.llvm.org/D49562 llvm-svn: 338407
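The MULHU trick above can be sketched in scalar C for a single 16-bit lane. This is only an illustration of the arithmetic identity, not the LLVM lowering code; the function names `mulhu16` and `srl16_via_mulhu` are made up for this sketch.

```c
#include <stdint.h>

/* Unsigned multiply-high on one 16-bit lane: the upper 16 bits of the
   32-bit product. */
static uint16_t mulhu16(uint16_t a, uint16_t b) {
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* Logical right shift by c, for 1 <= c <= 15, written as a multiply-high.
   A shift by zero would need the multiplier 1 << 16, which does not fit in
   16 bits; that is the problematic case the message mentions. */
static uint16_t srl16_via_mulhu(uint16_t x, unsigned c) {
    return mulhu16(x, (uint16_t)(1u << (16 - c)));
}
```

The multiplier 2^(16-c) moves the bits of x up by 16-c positions, so taking the high half of the product leaves x shifted right by c.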
*	[X86] Add pattern matching for PMADDUBSW	Craig Topper	2018-07-31	1	-0/+143
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Similar to D49636, but for PMADDUBSW. This instruction has the additional complexity that the addition of the two products saturates to 16-bits rather than wrapping around. And one operand is treated as signed and the other as unsigned. A C example that triggers this pattern ``` static const int N = 128; int8_t A[2*N]; uint8_t B[2*N]; int16_t C[N]; void foo() { for (int i = 0; i != N; ++i) C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767); } ``` Reviewers: RKSimon, spatel, zvi Reviewed By: RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49829 llvm-svn: 338402
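As a rough scalar model of what one output lane of the pattern above computes, assuming the signed/unsigned roles from the message's C example (A signed, B unsigned); the helper names `sat16` and `maddubs_lane` are hypothetical:

```c
#include <stdint.h>

/* Clamp a 32-bit value to the int16_t range (signed saturation). */
static int16_t sat16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One 16-bit result: two signed-by-unsigned byte products, added and then
   saturated instead of wrapping. */
static int16_t maddubs_lane(int8_t a0, uint8_t b0, int8_t a1, uint8_t b1) {
    int32_t sum = (int32_t)a0 * (int32_t)b0 + (int32_t)a1 * (int32_t)b1;
    return sat16(sum);
}
```

The sum is formed in 32 bits so that the saturation step sees the exact value; the largest magnitude, 2 * 128 * 255, fits comfortably.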
*	[X86] Preserve more liveness information in emitStackProbeInline	Francis Visoiu Mistrih	2018-07-31	1	-18/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit fixes two issues with the liveness information after the call: 1) The code always spills RCX and RDX if InProlog == true, which results in a use of an undefined phys reg. 2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to the basic blocks that use them, therefore they are seen as undefined. https://llvm.org/PR38376 Differential Revision: https://reviews.llvm.org/D50020 llvm-svn: 338400
*	[llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.	Andrea Di Biagio	2018-07-31	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch teaches llvm-mca how to identify dependency breaking instructions on btver2. An example of dependency breaking instructions is the zero-idiom XOR (example: `XOR %eax, %eax`), which always generates zero regardless of the actual value of the input register operands. Dependency breaking instructions don't have to wait on their input register operands before executing. This is because the computation is not dependent on the inputs. Not all dependency breaking idioms are also zero-latency instructions. For example, `CMPEQ %xmm1, %xmm1` is independent of the value of XMM1, and it generates a vector of all-ones. That instruction is not eliminated at register renaming stage, and its opcode is issued to a pipeline for execution. So, the latency is not zero. This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis interface. That method takes as input an instruction (i.e. MCInst) and a MCSubtargetInfo. The default implementation of isDependencyBreaking() conservatively returns false for all instructions. Targets may override the default behavior for specific CPUs, and return a value which better matches the subtarget behavior. In future, we should teach TableGen how to automatically generate the body of isDependencyBreaking from scheduling predicate definitions. This would allow us to expose the knowledge about dependency breaking instructions to the machine schedulers (and, potentially, other codegen passes). Differential Revision: https://reviews.llvm.org/D49310 llvm-svn: 338372
*	Revert r338365: [X86] Improved sched models for X86 BT*rr instructions.	Simon Pilgrim	2018-07-31	11	-18/+48
\| \| \| \| \| \| \| \|	https://reviews.llvm.org/D49243 Contains WIP code that should not have been included. llvm-svn: 338369
*	[X86] Improved sched models for X86 BT*rr instructions.	Andrew V. Tischenko	2018-07-31	11	-48/+18
\| \| \| \| \| \|	https://reviews.llvm.org/D49243 llvm-svn: 338365
*	[X86] Improved sched models for X86 SHLD/SHRD* instructions.	Andrew V. Tischenko	2018-07-31	11	-196/+70
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D9611 llvm-svn: 338359
*	[X86][SSE] isFNEG - Use getTargetConstantBitsFromNode to handle all constant ↵	Simon Pilgrim	2018-07-31	1	-31/+7
\| \| \| \| \| \| \| \| \| \|	cases isFNEG was duplicating much of what was done by getTargetConstantBitsFromNode in its own calls to getTargetConstantFromNode. Noticed while reviewing D48467. llvm-svn: 338358
*	[X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont.	Craig Topper	2018-07-31	1	-1/+1
\| \| \| \| \| \|	In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should do, we only check isSLM. This resulted in Goldmont going down the Bonnell path. llvm-svn: 338342
*	Remove trailing space	Fangrui Song	2018-07-30	11	-21/+21
\| \| \| \| \| \|	sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
*	[X86] Fix typo in comment. NFC	Craig Topper	2018-07-30	1	-1/+1
\| \| \| \|	llvm-svn: 338274
*	Recommit r338204 "[X86] Correct the immediate cost for 'add/sub i64 %x, ↵	Craig Topper	2018-07-30	1	-1/+7
\| \| \| \| \| \| \| \|	0x80000000'." This checks in a more direct way without triggering a UBSAN error. llvm-svn: 338273
*	[MachineOutliner][X86] Use TAILJMPd64 instead of JMP_1 for TailCall construction	Francis Visoiu Mistrih	2018-07-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The machine verifier asserts with: Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542. It calls analyzeBranch which tries to call getMBB if the opcode is JMP_1, but in this case we do: JMP_1 @OUTLINED_FUNCTION I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used with brtarget8. Differential Revision: https://reviews.llvm.org/D49299 llvm-svn: 338237
*	Revert "[X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'."	Dean Michael Berris	2018-07-30	1	-7/+1
\| \| \| \| \| \|	This reverts commit r338204. llvm-svn: 338236
*	[X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'.	Craig Topper	2018-07-28	1	-1/+7
\| \| \| \| \| \|	X86 normally requires immediates to be a signed 32-bit value which would exclude i64 0x80000000. But for add/sub we can negate the constant and use the opposite instruction. llvm-svn: 338204
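The negate-and-swap idea can be checked with a small C sketch. It only models the arithmetic and the immediate-range test; `fits_simm32` and `add_via_sub` are illustrative names, not LLVM functions.

```c
#include <stdint.h>

/* True if v can be encoded as a sign-extended 32-bit immediate. */
static int fits_simm32(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* x + 0x80000000 rewritten as x - (-0x80000000). The negated constant equals
   INT32_MIN, so it fits a signed 32-bit immediate even though 0x80000000
   itself does not. Unsigned arithmetic keeps the wraparound well defined. */
static uint64_t add_via_sub(uint64_t x) {
    const int64_t neg = -INT64_C(0x80000000);
    return x - (uint64_t)neg;
}
```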
*	[X86] Use alignTo and divideCeil to make some code more readable. NFC	Craig Topper	2018-07-28	1	-3/+3
\| \| \| \|	llvm-svn: 338203
*	DAG: Add calling convention argument to calling convention funcs	Matt Arsenault	2018-07-28	3	-4/+7
\| \| \| \| \| \| \| \|	This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194
*	[X86] Add support expanding multiplies by constant where the constant is ↵	Craig Topper	2018-07-27	1	-18/+31
\| \| \| \| \| \| \| \|	-3/-5/-9 multiplied by a power of 2. These can be replaced with an LEA, a shift, and a negate. This seems to match what gcc and icc would do. llvm-svn: 338174
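A scalar sketch of the LEA, shift, negate sequence described above, assuming the LEA computes x + x*scale with scale in {2, 4, 8}. The function name is hypothetical, and unsigned arithmetic stands in for two's-complement wraparound.

```c
#include <stdint.h>

/* Multiply x by -(scale + 1) * 2^shift, where scale is 2, 4 or 8. */
static uint64_t mul_neg_lea_pow2(uint64_t x, unsigned scale, unsigned shift) {
    uint64_t base = x + x * scale;   /* LEA: 3x, 5x or 9x */
    uint64_t shifted = base << shift; /* SHL */
    return 0 - shifted;               /* NEG */
}
```

For example, a multiply by -24 uses scale 2 and shift 3, since 24 = 3 * 8.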
*	[X86] Remove an unnecessary 'if' that prevented treating INT64_MAX and ↵	Craig Topper	2018-07-27	1	-38/+36
\| \| \| \| \| \| \| \|	-INT64_MAX as power of 2 minus 1 in the multiply expansion code. Not sure why they were being explicitly excluded, but I believe all the math inside the if works. I changed the absolute value to be uint64_t instead of int64_t so INT64_MIN+1 wouldn't be signed wrap. llvm-svn: 338101
*	[X86] Add matching for another pattern of PMADDWD.	Craig Topper	2018-07-27	1	-0/+123
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the pattern you get from the loop vectorizer for something like this int16_t A[1024]; int16_t B[1024]; int32_t C[512]; void pmaddwd() { for (int i = 0; i != 512; ++i) C[i] = (A[2*i]*B[2*i]) + (A[2*i+1]*B[2*i+1]); } In this case we will have (add (mul (build_vector), (build_vector)), (mul (build_vector), (build_vector))). This is different than the pattern we currently match which has the build_vectors between an add and a single multiply. I'm not sure what C code would get you that pattern. Reviewers: RKSimon, spatel, zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49636 llvm-svn: 338097
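For reference, one 32-bit output lane of the loop above can be modelled as follows. This is a sketch with a hypothetical name, `maddwd_lane`; the add is done in unsigned arithmetic so that the single overflowing input combination (all four inputs equal to -32768) wraps instead of being undefined behavior in C.

```c
#include <stdint.h>

/* One 32-bit result: two signed 16-bit products, added with wraparound. */
static int32_t maddwd_lane(int16_t a0, int16_t b0, int16_t a1, int16_t b1) {
    uint32_t p0 = (uint32_t)((int32_t)a0 * (int32_t)b0);
    uint32_t p1 = (uint32_t)((int32_t)a1 * (int32_t)b1);
    return (int32_t)(p0 + p1);
}
```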
*	[X86] When removing sign extends from gather/scatter indices, make sure we ↵	Craig Topper	2018-07-27	1	-15/+20
\| \| \| \| \| \| \| \|	handle UpdateNodeOperands finding an existing node to CSE with. If this happens the operands aren't updated and the existing node is returned. Make sure we pass this existing node up to the DAG combiner so that a proper replacement happens. Otherwise we get stuck in an infinite loop with an unoptimized node. llvm-svn: 338090
*	[x86/SLH] Extract the logic to trace predicate state through calls to	Chandler Carruth	2018-07-26	1	-19/+39
\| \| \| \| \| \| \| \| \| \|	a helper function with a nice overview comment. NFC. This is a preparatory refactoring to implementing another component of mitigation here that was described in the design document but hadn't been implemented yet. llvm-svn: 338016
*	[X86] Don't use CombineTo to skip adding new nodes to the DAGCombiner ↵	Craig Topper	2018-07-26	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \|	worklist in combineMul. I'm not sure if this was trying to avoid optimizing the new nodes further or what. Or maybe to prevent a cycle if something tried to reform the multiply? But I don't think it's a reliable way to do that. If the user of the expanded multiply is visited by the DAGCombiner after this conversion happens, the DAGCombiner will check its operands, see that they haven't been visited by the DAGCombiner before and it will then add the first node to the worklist. This process will repeat until all the new nodes are visited. So this seems like an unreliable prevention at best. So this patch just returns the new nodes like any other combine. If this starts causing problems we can try to add target specific nodes or something to more directly prevent optimizations. Now that we handle the combine normally, we can combine any negates the mul expansion creates into their users since those will be visited now. llvm-svn: 338007
*	[X86] Remove some unnecessary explicit calls to DCI.AddToWorkList.	Craig Topper	2018-07-26	1	-10/+0
\| \| \| \| \| \|	These calls were making sure some newly created nodes were added to worklist, but the DAGCombiner has internal support for ensuring it has visited all nodes. Any time it visits a node it ensures the operands have been queued to be visited as well. This means we only need to return the last new node. The DAGCombiner will take care of adding its inputs thus walking backwards through all the new nodes. llvm-svn: 337996
*	CodeGen: Cleanup regmask construction; NFC	Matthias Braun	2018-07-26	1	-3/+3
\| \| \| \| \| \| \| \| \|	- Avoid duplication of regmask size calculation. - Simplify allocateRegisterMask() call. - Rename allocateRegisterMask() to allocateRegMask() to be consistent with naming in MachineOperand. llvm-svn: 337986
*	[COFF] Hoist constant pool handling from X86AsmPrinter into AsmPrinter	Martin Storsjo	2018-07-25	2	-26/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In SVN r334523, the first half of comdat constant pool handling was hoisted from X86WindowsTargetObjectFile (which despite the name only was used for msvc targets) into the arch independent TargetLoweringObjectFileCOFF, but the other half of the handling was left behind in X86AsmPrinter::GetCPISymbol. With only half of the handling in place, inconsistent comdat sections/symbols are created, causing issues with both GNU binutils (avoided for X86 in SVN r335918) and with the MS linker, which would complain like this: fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4 Differential Revision: https://reviews.llvm.org/D49644 llvm-svn: 337950
*	[x86/SLH] Sink the return hardening into the main block-walk + hardening	Chandler Carruth	2018-07-25	1	-26/+17
\| \| \| \| \| \| \| \| \| \| \|	code. This consolidates all our hardening calls, and simplifies the code a bit. It seems much more clear to handle all of these together. No functionality changed here. llvm-svn: 337895
*	[x86/SLH] Improve name and comments for the main hardening function.	Chandler Carruth	2018-07-25	1	-174/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function actually does two things: it traces the predicate state through each of the basic blocks in the function (as that isn't directly handled by the SSA updater) and it hardens everything necessary in the block as it goes. These need to be done together so that we have the currently active predicate state to use at each point of the hardening. However, this also made obvious that the flag to disable actual hardening of loads was flawed -- it also disabled tracing the predicate state across function calls within the body of each block. So this patch sinks this debugging flag test to correctly guard just the hardening of loads. Unless load hardening was disabled, no functionality should change with this patch. llvm-svn: 337894
*	[X86] Use X86ISD::MUL_IMM instead of ISD::MUL for multiply we intend to be ↵	Craig Topper	2018-07-25	1	-1/+2
\| \| \| \| \| \| \| \|	selected to LEA. This prevents other combines from possibly disturbing it. llvm-svn: 337890
*	[x86/SLH] Teach the x86 speculative load hardening pass to harden	Chandler Carruth	2018-07-25	1	-0/+200
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	against v1.2 BCBS attacks directly. Attacks using spectre v1.2 (a subset of BCBS) are described in the paper here: https://people.csail.mit.edu/vlk/spectre11.pdf The core idea is to speculatively store over the address in a vtable, jumptable, or other target of indirect control flow that will be subsequently loaded. Speculative execution after such a store can forward the stored value to subsequent loads, and if called or jumped to, the speculative execution will be steered to this potentially attacker controlled address. Up until now, this could be mitigated by enabling retpolines. However, that is a relatively expensive technique to mitigate this particular flavor. Especially because in most cases SLH will have already mitigated this. To fully mitigate this with SLH, we need to do two core things: 1) Unfold loads from calls and jumps, allowing the loads to be post-load hardened. 2) Force hardening of incoming registers even if we didn't end up needing to harden the load itself. The reason we need to do these two things is because hardening calls and jumps from this particular variant is importantly different from hardening against leak of secret data. Because the "bad" data here isn't a secret, but in fact speculatively stored by the attacker, it may be loaded from any address, regardless of whether it is read-only memory, mapped memory, or a "hardened" address. The only 100% effective way to harden these instructions is to harden their operand itself. But to the extent possible, we'd like to take advantage of all the other hardening going on, we just need a fallback in case none of that happened to cover the particular input to the control transfer instruction. For users of SLH, currently they are paying 2% to 6% performance overhead for retpolines, but this mechanism is expected to be substantially cheaper. 
However, it is worth reminding folks that this does not mitigate all of the things retpolines do -- most notably, variant #2 is not in any way mitigated by this technique. So users of SLH may still want to enable retpolines, and the implementation is carefully designed to gracefully leverage retpolines to avoid the need for further hardening here when they are enabled. Differential Revision: https://reviews.llvm.org/D49663 llvm-svn: 337878
*	[X86] Use a shift plus an lea for multiplying by a constant that is a power ↵	Craig Topper	2018-07-25	1	-0/+18
\| \| \| \| \| \| \| \|	of 2 plus 2/4/8. The LEA allows us to combine an add and the multiply by 2/4/8 together so we just need a shift for the larger power of 2. llvm-svn: 337875
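In scalar terms, the shift plus LEA combination might look like the sketch below, assuming the LEA adds the shifted value to x scaled by 2, 4 or 8. The function name is made up for illustration.

```c
#include <stdint.h>

/* Multiply x by (2^shift + scale), where scale is 2, 4 or 8. */
static uint64_t mul_pow2_plus_scale(uint64_t x, unsigned shift, unsigned scale) {
    uint64_t shifted = x << shift; /* SHL handles the larger power of 2 */
    return shifted + x * scale;    /* one LEA: base + index*scale */
}
```

A multiply by 36, for instance, becomes a shift by 5 and an LEA with scale 4.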
*	[X86] Expand mul by pow2 + 2 using a shift and two adds similar to what we ↵	Craig Topper	2018-07-25	1	-11/+15
\| \| \| \| \| \|	do for pow2 - 2. llvm-svn: 337874
*	[X86] Use a two lea sequence for multiply by 37, 41, and 73.	Craig Topper	2018-07-24	1	-0/+9
\| \| \| \| \| \|	These fit a pattern used by 11, 21, and 19. llvm-svn: 337871
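One way to read the two-LEA pattern is that each constant has the form m*s + 1 with m in {3, 5, 9} and s in {2, 4, 8}: 37 = 9*4 + 1, 41 = 5*8 + 1, 73 = 9*8 + 1, just as 11 = 5*2 + 1, 21 = 5*4 + 1 and 19 = 9*2 + 1. The sketch below models an LEA as base + index*scale; the helper names are hypothetical and the exact operand choice in LLVM may differ.

```c
#include <stdint.h>

/* Scalar model of LEA: base + index*scale, with scale in {1, 2, 4, 8}. */
static uint64_t lea(uint64_t base, uint64_t index, unsigned scale) {
    return base + index * scale;
}

static uint64_t mul_by_37(uint64_t x) {
    uint64_t t = lea(x, x, 8); /* 9x */
    return lea(x, t, 4);       /* x + 36x */
}

static uint64_t mul_by_41(uint64_t x) {
    uint64_t t = lea(x, x, 4); /* 5x */
    return lea(x, t, 8);       /* x + 40x */
}

static uint64_t mul_by_73(uint64_t x) {
    uint64_t t = lea(x, x, 8); /* 9x */
    return lea(x, t, 8);       /* x + 72x */
}
```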
*	[X86] Change multiply by 26 to use two multiplies by 5 and an add instead of ↵	Craig Topper	2018-07-24	1	-7/+7
\| \| \| \| \| \| \| \|	multiply by 3 and 9 and a subtract. Same number of operations, but ending in an add is friendlier due to it being commutable. llvm-svn: 337869
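The two sequences for 26 can be compared side by side in a small sketch. The `lea` helper is a hypothetical scalar model of the instruction, and the function names are illustrative only.

```c
#include <stdint.h>

/* Scalar model of LEA: base + index*scale, with scale in {1, 2, 4, 8}. */
static uint64_t lea(uint64_t base, uint64_t index, unsigned scale) {
    return base + index * scale;
}

/* New form: (x*5)*5 + x. The final operation is a commutable add. */
static uint64_t mul_by_26_add(uint64_t x) {
    uint64_t t = lea(x, x, 4); /* 5x */
    uint64_t u = lea(t, t, 4); /* 25x */
    return u + x;
}

/* Old form: (x*3)*9 - x. The final subtract is not commutable. */
static uint64_t mul_by_26_sub(uint64_t x) {
    uint64_t t = lea(x, x, 2); /* 3x */
    uint64_t u = lea(t, t, 8); /* 27x */
    return u - x;
}
```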
*	[X86] When expanding a multiply by a negative of one less than a power of 2, ↵	Craig Topper	2018-07-24	1	-10/+12
\| \| \| \| \| \| \| \| \| \|	like 31, don't generate a negate of a subtract that we'll never optimize. We generated a subtract for the power of 2 minus one then negated the result. The negate can be optimized away by swapping the subtract operands, but DAG combine doesn't know how to do that and we don't add any of the new nodes to the worklist anyway. This patch makes us explicitly emit the swapped subtract. llvm-svn: 337858
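For a multiply by -31, the difference between the negated subtract and the swapped subtract can be sketched as follows; the function names are hypothetical, and unsigned arithmetic models two's-complement wraparound.

```c
#include <stdint.h>

/* Before: compute 31x as (x << 5) - x, then negate. Three operations. */
static uint64_t mul_by_neg31_negated(uint64_t x) {
    return 0 - ((x << 5) - x);
}

/* After: swap the subtract operands so no negate is needed. */
static uint64_t mul_by_neg31_swapped(uint64_t x) {
    return x - (x << 5);
}
```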
*	[X86] Generalize the multiply by 30 lowering to generic multiply by power 2 ↵	Craig Topper	2018-07-24	1	-15/+10
\| \| \| \| \| \| \| \| \| \|	minus 2. Use a left shift and 2 subtracts like we do for 30. Move this out from behind the slow lea check since it doesn't even use an LEA. Use this for multiply by 14 as well. llvm-svn: 337856
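A minimal scalar sketch of the shift and two subtracts, with a made-up function name:

```c
#include <stdint.h>

/* Multiply x by (2^shift - 2): one left shift and two subtracts, no LEA. */
static uint64_t mul_pow2_minus_2(uint64_t x, unsigned shift) {
    return (x << shift) - x - x;
}
```

With shift 5 this gives the multiply by 30, and with shift 4 the multiply by 14.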
*	[X86] Change multiply by 19 to use (9 * X) * 2 + X instead of (5 * X) * 4 - X.	Craig Topper	2018-07-24	1	-2/+2
\| \| \| \| \| \|	The new lowering can be done in 2 LEAs. The old code took 1 LEA, 1 shift, and 1 sub. llvm-svn: 337851
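Both lowerings for 19 can be written out in a short sketch. The `lea` helper is a hypothetical scalar model of the instruction, and the old form assumes the subtracted term is X, matching the "1 LEA, 1 shift, and 1 sub" description.

```c
#include <stdint.h>

/* Scalar model of LEA: base + index*scale, with scale in {1, 2, 4, 8}. */
static uint64_t lea(uint64_t base, uint64_t index, unsigned scale) {
    return base + index * scale;
}

/* New form, two LEAs: (9x)*2 + x. */
static uint64_t mul_by_19_new(uint64_t x) {
    uint64_t t = lea(x, x, 8); /* 9x */
    return lea(x, t, 2);       /* x + 18x */
}

/* Old form, LEA + shift + sub: (5x)*4 - x. */
static uint64_t mul_by_19_old(uint64_t x) {
    uint64_t t = lea(x, x, 4); /* 5x */
    return (t << 2) - x;
}
```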