bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to ↵	Scott Constable	2020-06-24	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \|	"Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810
*	[X86] Use TargetConstant for condition code on X86ISD::SETCC/CMOV/BRCOND nodes.	Craig Topper	2019-09-23	1	-72/+55
\| \| \| \| \| \| \| \| \| \|	This removes the need for ConvertToTarget opcodes in the isel table. It's also consistent with the recent changes to use TargetConstant for intrinsic nodes that always take immediates. Differential Revision: https://reviews.llvm.org/D67902 llvm-svn: 372645
*	Rename nonvolatile_load/store to simple_load/store [NFC]	Philip Reames	2019-09-12	1	-6/+6
\| \| \| \| \| \|	Implement the TODO from D66318. llvm-svn: 371789
*	Revert [Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn	Reid Kleckner	2019-09-03	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts r370525 (git commit 0bb1630685fba255fa93def92603f064c2ffd203) Also reverts r370543 (git commit 185ddc08eed6542781040b8499ef7ad15c8ae9f4) The approach I took only works for functions marked `noreturn`. In general, a call that is not known to be noreturn may be followed by unreachable for other reasons. For example, there could be multiple call sites to a function that throws sometimes, and at some call sites, it is known to always throw, so it is followed by unreachable. We need to insert an `int3` in these cases to pacify the Windows unwinder. I think this probably deserves its own standalone, Win64-only fixup pass that runs after block placement. Implementing that will take some time, so let's revert to TrapUnreachable in the meantime. llvm-svn: 370829
*	[Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn	Reid Kleckner	2019-08-30	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Users have complained that llvm.trap produces two ud2 instructions on Win64, one for the trap, and one for unreachable. This change fixes that. TrapUnreachable was added and enabled for Win64 in r206684 (April 2014) to avoid poorly understood issues with the Windows unwinder. There seem to be two major things in play: the unwinder, and C++ EH (_CxxFrameHandler3 & co). The unwinder disassembles forward from the return address to scan for epilogues. Inserting a ud2 had the effect of stopping the unwinder, and ensuring that it ran the EH personality function for the current frame. However, it's not clear what the unwinder does when the return address happens to be the last address of one function and the first address of the next function. The Visual C++ EH personality, _CxxFrameHandler3, needs to figure out what the current EH state number is. It does this by consulting the ip2state table, which maps from PC to state number. This seems to go wrong when the return address is the last PC of the function or catch funclet. I'm not sure precisely which system is involved here, but in order to address these real or hypothetical problems, I believe it is enough to insert int3 after a call site if it would otherwise be the last instruction in a function or funclet. I was able to reproduce some similar problems locally by arranging for a noreturn call to appear at the end of a catch block immediately before an unrelated function, and I confirmed that the problems go away when an extra trailing int3 instruction is added. MSVC inserts int3 after every noreturn function call, but I believe it's only necessary to do it if the call would be the last instruction. This change inserts a pseudo instruction that expands to int3 if it is in the last basic block of a function or funclet. 
I did what I could to run the Microsoft compiler EH tests, and the ones I was able to run showed no behavior difference before or after this change. Differential Revision: https://reviews.llvm.org/D66980 llvm-svn: 370525
*	[X86] Introduce new MOVSSrm/MOVSDrm opcodes that use VR128 register class.	Craig Topper	2019-06-18	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt. Use the new versions in patterns that previously used a COPY_TO_REGCLASS to VR128. These patterns expect the upper bits to be zero. The current setup appears to work, but I'm not sure we should be enforcing upper bits being zero through a COPY_TO_REGCLASS. I wanted to flip the arrangement and use a COPY_TO_REGCLASS to FR32/FR64 for the patterns that need an f32/f64 result, but that complicated fastisel and globalisel. I've been doing some experiments with reducing some isel patterns and ended up in a situation where I had a (SUBREG_TO_REG (COPY_TO_REGCLASS (VMOVSSrm), VR128)) and our post-isel peephole was unable to avoid using an instruction for the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128 instruction removes the COPY_TO_REGCLASS that was breaking this. llvm-svn: 363643
*	[X86] Add CMOV_FR32X/CMOV_FR64X pseudo instructions. Use them in fast isel ↵	Craig Topper	2019-05-11	1	-2/+8
\| \| \| \| \| \| \| \|	to fix a machine verifier error after adding test cases. Fast isel picks the FR32X/FR64X register classes when lowering pseudo select, but it didn't have the right opcode to go with it. llvm-svn: 360524
*	[X86] Put the locked mi8 instructions above the locked mi/mi32 so they will ↵	Craig Topper	2019-04-14	1	-24/+26
\| \| \| \| \| \| \| \| \|	be preferred. We want 64mi8 to be preferred over 64mi32. The order for 16mi/32mi doesn't really matter. llvm-svn: 358361
*	[X86] Add patterns for using movss/movsd for atomic load/store of f32/64. ↵	Craig Topper	2019-04-11	1	-19/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove atomic fadd pseudos and use isel patterns instead. This patch adds patterns for turning bitcasted atomic load/store into movss/sd. It also removes the pseudo instructions for atomic RMW fadd. Instead, it just adds isel patterns for folding an atomic load into addss/sd, relying on the new movss/sd store pattern to handle the write part. This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled. Differential Revision: https://reviews.llvm.org/D60394 llvm-svn: 358215
*	[X86] Use (SUBREG_TO_REG (MOV32rm)) for extloadi64i8/extloadi64i16 when the ↵	Craig Topper	2019-04-07	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	load is 4 byte aligned or better and not volatile. Summary: Previously we would use MOVZXrm8/MOVZXrm16, but those are longer encodings. This is similar to what we do in the loadi32 predicate. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60341 llvm-svn: 357875
*	[X86] Merge the different SETcc instructions for each condition code into ↵	Craig Topper	2019-04-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri Reviewed By: andreadb Subscribers: hiraditya, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60138 llvm-svn: 357801
*	[X86] Merge the different CMOV instructions for each condition code into ↵	Craig Topper	2019-04-05	1	-27/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	single instructions that store the condition code as an immediate. Summary: Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models. This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now. I plan to make similar changes for SETcc and Jcc. Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet Reviewed By: RKSimon Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60041 llvm-svn: 357800
*	[X86] Remove GetLo8XForm and use GetLo32XForm instead. NFCI	Craig Topper	2019-03-25	1	-6/+1
\| \| \| \| \| \| \| \|	We were using this to create an AND32ri8 node from a 64-bit and, but that node normally still uses a 32-bit immediate. So we should just truncate the existing immediate to i32. We already verified it has the same value in bits 31:7. llvm-svn: 356868
*	Revert r356688 "[X86] Don't avoid folding multiple use sign extended 8-bit ↵	Craig Topper	2019-03-25	1	-2/+2
\| \| \| \| \| \| \| \|	immediate into instructions under optsize." Looking back over how the one use optimization works, I don't think this is the right way to fix this. llvm-svn: 356866
*	[X86] Don't avoid folding multiple use sign extended 8-bit immediate into ↵	Craig Topper	2019-03-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	instructions under optsize. Under optsize we try to avoid folding immediates into instructions. But if the immediate is 16 bits or 32 bits, but can be encoded as an 8-bit immediate, we don't save enough from disabling the folding unless the immediate has enough uses to make up for the size of the move, which is either 3 bytes or 5 bytes since there are no sign extended 8-bit moves. We would also save something if the immediate was a live out of the basic block and thus a move was unavoidable, but that would require a more advanced heuristic than just counting uses. Note we only avoid folding multiple use immediates into the patterns that use X86ISD::ADD/SUB/XOR/OR/AND/CMP/ADC/SBB nodes and not the more common ISD::ADD/SUB/XOR/OR/AND nodes. Differential Revision: https://reviews.llvm.org/D59522 llvm-svn: 356688
*	[X86] Add CMPXCHG8B feature flag. Set it for all CPUs except i386/i486 ↵	Craig Topper	2019-03-20	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	including 'generic'. Disable use of CMPXCHG8B when this flag isn't set. CMPXCHG8B was introduced on i586/pentium generation. If it's not enabled, limit the atomic width to 32 bits so the AtomicExpandPass will expand to lib calls. Unclear if we should be using a different limit for other configs. The default is 1024 and experimentation shows that using an i256 atomic will cause a crash in SelectionDAG. Differential Revision: https://reviews.llvm.org/D59576 llvm-svn: 356631
*	[X86] Re-disable cmpxchg16b for 32-bit mode assembly parsing.	Craig Topper	2019-03-19	1	-2/+2
\| \| \| \| \| \|	This was broken recently when I factored the 64 bit mode check into hasCmpxchg16 without thinking about the AssemblerPredicate. llvm-svn: 356531
*	[X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.	Craig Topper	2019-03-18	1	-2/+0
\| \| \| \| \| \|	For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well. llvm-svn: 356406
*	[X86] Make ADD*_DB post-RA pseudos and expand them in expandPostRAPseudo.	Craig Topper	2019-03-18	1	-1/+1
\| \| \| \| \| \| \| \| \|	These are used to help convert OR->LEA when needed to avoid a copy. They aren't needed after register allocation. Happens to remove an ugly goto from X86MCCodeEmitter.cpp llvm-svn: 356356
*	[X86] Enable the add with 128 -> sub with -128 encoding trick with ↵	Craig Topper	2019-03-06	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	X86ISD::ADD when the carry flag isn't used. This allows us to use an 8-bit sign extended immediate instead of a 16 or 32 bit immediate. Also do similar for 0x80000000 with 64-bit adds to avoid having to use a movabsq. llvm-svn: 355485
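A minimal sketch of the encoding trick described in the entry above, written in Python for illustration only; it is not the LLVM implementation. The helper names (`fits_simm`, `pick_add_encoding`, `carry_after_add`, `carry_after_sub`) are hypothetical. The idea: +128 does not fit in a sign-extended 8-bit immediate but -128 does, so `add x, 128` can be encoded as `sub x, -128`. The two forms produce the same result but opposite carry flags, which is why the commit restricts the trick to cases where the carry flag isn't used.

```python
def fits_simm(value, bits):
    """True if value is representable as a sign-extended immediate of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def pick_add_encoding(imm):
    """Pick 'add imm' or 'sub -imm', whichever allows the narrower immediate."""
    # 128 -> -128: a 16/32-bit immediate shrinks to a sign-extended 8-bit one.
    if not fits_simm(imm, 8) and fits_simm(-imm, 8):
        return ("sub", -imm)
    # 0x80000000 -> -0x80000000: now fits imm32, so no 64-bit constant move is needed.
    if not fits_simm(imm, 32) and fits_simm(-imm, 32):
        return ("sub", -imm)
    return ("add", imm)

def carry_after_add(x, imm, bits=32):
    """Carry flag produced by an unsigned add of width `bits`."""
    mask = (1 << bits) - 1
    return int((x & mask) + (imm & mask) > mask)

def carry_after_sub(x, imm, bits=32):
    """Carry (borrow) flag produced by an unsigned subtract of width `bits`."""
    mask = (1 << bits) - 1
    return int((x & mask) < (imm & mask))

print(pick_add_encoding(128))
print(pick_add_encoding(0x80000000))
print(carry_after_add(0, 128), carry_after_sub(0, -128))
```

The last line prints two different flag values for the same arithmetic result, which is the reason for the "carry flag isn't used" condition.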
*	[X86] Enable 8-bit OR with disjoint bits to convert to LEA	Craig Topper	2019-03-05	1	-0/+7
\| \| \| \| \| \| \| \|	We already support 8-bits adds in convertToThreeAddress. But we can also support 8-bit OR if the bits are disjoint. We already do this for 16/32/64. Differential Revision: https://reviews.llvm.org/D58863 llvm-svn: 355423
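The arithmetic fact behind the entry above can be checked with a short sketch (Python, illustrative only). When two operands have no set bits in common, OR and ADD give the same result, so an OR can be treated as an add and folded into an LEA. The function name `bits_disjoint` is hypothetical; the backend reasons about known-zero bits at compile time, whereas this sketch checks concrete values.

```python
def bits_disjoint(a, b):
    """True if a and b share no set bits, so that a | b == a + b."""
    return (a & b) == 0

# Exhaustively confirm the identity for every pair of 8-bit values.
mismatches = 0
for a in range(256):
    for b in range(256):
        if bits_disjoint(a, b) and (a | b) != ((a + b) & 0xFF):
            mismatches += 1

print(mismatches)
```

No carries can occur when the bits are disjoint, which is why the sum never differs from the OR.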
*	[X86] Improve detection of unneeded shift amount masking to also handle the ↵	Craig Topper	2019-02-25	1	-47/+50
\| \| \| \| \| \| \| \| \| \|	case that the LHS has known zeroes in it If the LHS has known zeros, the RHS immediate will have had bits removed. So call computeKnownBits to get the known zeroes so we can handle this case. Differential Revision: https://reviews.llvm.org/D58475 llvm-svn: 354811
*	[X86] Add a pattern for (i64 (and (anyext def32:), 0x00000000FFFFFFFF)) to ↵	Craig Topper	2019-01-27	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	produce SUBREG_TO_REG def32 here means the producing instruction zeroed bits 63:32. We already do this for zext, but it looks like we can get an and+anyext sometimes. Spotted in the diffs from D33587. llvm-svn: 352303
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[X86] Change some patterns that select MOVZX16rm8 to instead select ↵	Craig Topper	2019-01-12	1	-3/+6
\| \| \| \| \| \| \| \|	MOVZX32rm8 and extract the subregister. This should be a shorter encoding and is consistent with what we do for zext i8->i16 llvm-svn: 350988
*	[X86] Remove X86ISD::INC/DEC. Just select them from X86ISD::ADD/SUB at isel time	Craig Topper	2019-01-02	1	-44/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	INC/DEC are pretty much the same as ADD/SUB except that they don't update the C flag. This patch removes the special nodes and just pattern matches from ADD/SUB during isel if the C flag isn't being used. I had to avoid selecting DEC if the result isn't used. This will become a SUB immediate which will be turned into a CMP later by optimizeCompareInstr. This led to the one test change where we use a CMP instead of a DEC for an overflow intrinsic since we only checked the flag. This also exposed a hole in our RMW flag matching use of hasNoCarryFlagUses. Our root node for the match is a store and there's no guarantee that all the flag users have been selected yet. So hasNoCarryFlagUses needs to check copyToReg and machine opcodes, but it also needs to check for the pre-match SETCC, SETCC_CARRY, BRCOND, and CMOV opcodes. Differential Revision: https://reviews.llvm.org/D55975 llvm-svn: 350245
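As a rough illustration of the flag difference the entry above relies on, here is a small Python model (a sketch, not LLVM code; the names `add_imm` and `inc` are hypothetical). Both operations compute the same value, but ADD writes the carry flag while INC leaves it untouched, so INC is only a safe substitute when nothing reads the carry flag afterwards.

```python
def add_imm(x, imm, bits=32):
    """Model of ADD: returns (result, new carry flag)."""
    mask = (1 << bits) - 1
    total = (x & mask) + (imm & mask)
    return total & mask, int(total > mask)

def inc(x, carry_in, bits=32):
    """Model of INC: same result as ADD 1, but the carry flag is preserved."""
    mask = (1 << bits) - 1
    return (x + 1) & mask, carry_in

print(add_imm(0xFFFFFFFF, 1))
print(inc(0xFFFFFFFF, 0))
```

The wrapped result is identical in both cases; only the second element of the tuple, the carry flag, differs.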
*	[X86] Fix an old FIXME about folding the zero constant into the OR ↵	Craig Topper	2018-12-23	1	-5/+4
\| \| \| \| \| \|	instruction we use for sequentially consistent fence in 32-bit mode without SSE2. llvm-svn: 350013
*	[X86] Always use the version of computeKnownBits that returns a value. NFCI.	Simon Pilgrim	2018-12-21	1	-6/+3
\| \| \| \| \| \|	Continues the work started by @bogner in rL340594 to remove uses of the old KnownBits output parameter version. llvm-svn: 349902
*	[X86] Emit SBB instead of SETCC_CARRY from LowerSELECT. Break false ↵	Craig Topper	2018-12-12	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \|	dependency on the SBB input. I'm hoping we can just replace SETCC_CARRY with SBB. This is another step towards that. I've explicitly used zero as the input to the setcc to avoid a false dependency that we've had with the SETCC_CARRY. I changed one of the patterns that used NEG to instead use an explicit compare with 0 on the LHS. We needed the zero anyway to avoid the false dependency. The negate would clobber its input register. By using a CMP we can avoid that which could be useful. Differential Revision: https://reviews.llvm.org/D55414 llvm-svn: 348959
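A small Python model of subtract-with-borrow may help with the entry above (illustrative sketch; `sbb` here is a hypothetical helper, not an LLVM API). Subtracting a register from itself with borrow yields 0 or all-ones depending only on the incoming carry, which is the value SETCC_CARRY materializes. The result does not depend on the register's prior contents, yet the instruction still names that register as an input, which is the false dependency the commit avoids by feeding in an explicit zero.

```python
def sbb(dst, src, carry_in, bits=32):
    """Model of SBB: dst - src - carry, truncated to `bits` bits."""
    mask = (1 << bits) - 1
    return (dst - src - carry_in) & mask

# Same register on both sides: the old contents cancel out.
for old_value in (0, 7, 0xDEADBEEF):
    print(hex(sbb(old_value, old_value, 0)), hex(sbb(old_value, old_value, 1)))
```

Each row prints the same pair of values regardless of `old_value`, showing that only the carry matters.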
*	[X86] Directly create ADC/SBB nodes instead of using ADD/SUB with (and ↵	Craig Topper	2018-12-06	1	-24/+0
\| \| \| \| \| \| \| \| \| \| \| \|	SETCC_CARRY, 1) This addresses a FIXME and avoids depending on an isel pattern match I think. I've removed the isel patterns too since we have no lit tests left that cover them. Hopefully that really means they are unused. I'm trying to decide if we need SETCC_CARRY. This removes one of its usages. Differential Revision: https://reviews.llvm.org/D55355 llvm-svn: 348536
*	Bias physical register immediate assignments	Nirav Dave	2018-11-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The machine scheduler currently biases register copies to/from physical registers to be closer to their point of use / def to minimize their live ranges. This change extends this to physical register assignments from immediate values as well. This causes a reduction in overall register pressure and a minor reduction in spills, and indirectly fixes an out-of-registers assertion (PR39391). Most test changes are from minor instruction reorderings and register name selection changes and direct consequences of that. Reviewers: MatzeB, qcolombet, myatsina, pcc Subscribers: nemanjai, jvesely, nhaehnle, eraman, hiraditya, javed.absar, arphaman, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54218 llvm-svn: 346894
*	[X86] Use a MOVSX instruction instead of a MOVZX instruction in isel for an ↵	Craig Topper	2018-11-10	1	-0/+9
\| \| \| \| \| \| \| \|	any_extend of the remainder from an 8-bit sdivrem. The sdivrem will emit its own MOVSX to move %ah to the low byte of a register. By using a MOVSX for an any_extend this allows a post-isel peephole to merge them. llvm-svn: 346581
*	Revert r345165 "[X86] Bring back the MOV64r0 pseudo instruction"	Craig Topper	2018-10-31	1	-4/+2
\| \| \| \| \| \|	Google is reporting regressions on some benchmarks. llvm-svn: 345785
*	[ELF] Fix large code model MIR verifier errors	Reid Kleckner	2018-10-24	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \|	Instead of using the MOVGOT64r pseudo, use the existing MO_PIC_BASE_OFFSET support on symbol operands. Now I don't have to create a "scratch register operand" for the pseudo to use, and the register allocator can make better decisions. Fixes some X86 verifier errors tracked in PR27481. llvm-svn: 345219
*	[X86] Bring back the MOV64r0 pseudo instruction	Craig Topper	2018-10-24	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos. My main motivation is to enable the spill optimization in foldMemoryOperandImpl, as we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code we were looking at. There's definitely some other machine CSE (and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling. Differential Revision: https://reviews.llvm.org/D52757 llvm-svn: 345165
*	[X86] Restore X86ISelDAGToDAG::matchBEXTRFromAnd. Teach address matching to ↵	Craig Topper	2018-10-11	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	create a BEXTR pattern from a (shl (and X, mask >> C1) if C1 can be folded into addressing mode. This is an alternative to D53080 since I think using a BEXTR for a shifted mask is definitely an improvement when the shl can be absorbed into addressing mode. The other cases I'm less sure about. We already have several tricks for handling an and of a shift in address matching. This adds a new case for BEXTR. I've moved the BEXTR matching code back to X86ISelDAGToDAG to allow it to match. I suppose alternatively we could directly emit a X86ISD::BEXTR node that isel could pattern match. But I'm trying to view BEXTR matching as an isel concern so DAG combine can see 'and' and 'shift' operations that are well understood. We did lose a couple cases from tbm_patterns.ll, but I think there are ways to recover that. I've also put back the manual load folding code in matchBEXTRFromAnd that I removed a few months ago in r324939. This gives us some more freedom to make decisions based on the ability to fold a load. I haven't done anything with that yet. Differential Revision: https://reviews.llvm.org/D53126 llvm-svn: 344270
*	[X86] Stop promoting vector ISD::SELECT to vXi64.	Craig Topper	2018-10-03	1	-0/+32
\| \| \| \| \| \|	The additional patterns needed for this aren't overwhelming and introducing extra bitcasts during lowering limits our ability to do computeNumSignBits. Not that I have a good example of that for select. I'm just becoming increasingly grumpy about promotion of AND/OR/XOR. SELECT was just a lot easier to fix. llvm-svn: 343723
*	[X86] Add CMOV_VK2/VK4 pseudos and remove lowering code that turned ↵	Craig Topper	2018-10-03	1	-0/+2
\| \| \| \| \| \|	v2i1/v4i1 SELECT into v8i1. llvm-svn: 343713
*	[X86] Add CMOV pseudos for VR128X and VR256X register classes. Use them when ↵	Craig Topper	2018-10-03	1	-10/+30
\| \| \| \| \| \| \| \|	AVX512VL is enabled. This allows the phi nodes to be generated with the correct register class when expanded. llvm-svn: 343710
*	[X86] Don't break CMOV pseudo instructions down by type. Just by register class.	Craig Topper	2018-10-03	1	-14/+22
\| \| \| \| \| \|	The register class is all that's important for the pseudo instructions. We can use patterns to handle the different types. llvm-svn: 343709
*	[X86] Move Atomic binops to use WriteALURMW schedule class	Simon Pilgrim	2018-10-03	1	-4/+4
\| \| \| \| \| \|	These were being tagged as <WriteALULd, WriteRMW> instead of properly using the RMW sequence llvm-svn: 343705
*	[X86] Move Atomic CMPXCHG to WriteCMPXCHGRMW schedule class	Simon Pilgrim	2018-10-03	1	-5/+5
\| \| \| \|	llvm-svn: 343700
*	[codeview] Fix 32-bit x86 variable locations in realigned stack frames	Reid Kleckner	2018-10-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the .cv_fpo_stackalign directive so that we can define $T0, or the VFRAME virtual register, with it. This was overlooked in the initial implementation because unlike MSVC, we push CSRs before allocating stack space, so this value is only needed to describe local variable locations. Variables that the compiler now addresses via ESP are instead described as being stored at offsets from VFRAME, which for us is ESP after alignment in the prologue. This adds tests that show that we use the VFRAME register properly in our S_DEFRANGE records, and that we emit the correct FPO data to define it. Fixes PR38857 llvm-svn: 343603
*	[X86] Fix inline expansion for memset in x32	Craig Topper	2018-09-22	1	-21/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Similar to D51893 which was for memcpy Reviewers: efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52063 llvm-svn: 342796
*	[X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32	Craig Topper	2018-09-12	1	-21/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In GNUX32, is64BitMode returns true, but pointers are 32 bits. So we shouldn't copy pointer values into RSI/RDI since the widths don't match. Fixes PR38865 despite what the title says. I think the llvm_unreachable in the copyPhysReg code tricked the optimizer and made the fatal error trigger. Reviewers: rnk, efriedma, MatzeB, echristo Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51893 llvm-svn: 342015
*	[x86/retpoline] Split the LLVM concept of retpolines into separate	Chandler Carruth	2018-08-23	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	subtarget features for indirect calls and indirect branches. This is in preparation for enabling only the call retpolines when using speculative load hardening. I've continued to use subtarget features for now as they continue to seem the best fit given the lack of other retpoline like constructs so far. The LLVM side is pretty simple. I'd like to eventually get rid of the old feature, but not sure what backwards compatibility issues that will cause. This does remove the "implies" from requesting an external thunk. This always seemed somewhat questionable and is now clearly not desirable -- you specify a thunk the same way no matter which set of things are getting retpolines. I really want to keep this nicely isolated from end users and just an LLVM implementation detail, so I've moved the `-mretpoline` flag in Clang to no longer rely on a specific subtarget feature by that name and instead to be directly handled. In some ways this is simpler, but in order to preserve existing behavior I've had to add some fallback code so that users who relied on merely passing -mretpoline-external-thunk continue to get the same behavior. We should eventually remove this I suspect (we have never tested that it works!) but I've not done that in this patch. Differential Revision: https://reviews.llvm.org/D51150 llvm-svn: 340515
*	[WebAssembly] Add isEHScopeReturn instruction property	Heejin Ahn	2018-08-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: So far, the `isReturn` property is used to mean both a return instruction from a function and the end of an EH scope, a scope that starts with an EH scope entry BB and ends with a catchret or a cleanupret instruction. Because WinEH uses funclets, all EH-scope-ending instructions are also real return instructions from a function. But for wasm, they only serve as the end marker of an EH scope but not a return instruction that exits a function. This mismatch caused incorrect prolog and epilog generation in wasm EH scopes. This patch fixes this. This patch is in the same vein as rL333045, which splits `MachineBasicBlock::isEHFuncletEntry` into `isEHFuncletEntry` and `isEHScopeEntry`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D50653 llvm-svn: 340325
*	[X86] Remove unnecessary AddedComplexity line. NFC	Craig Topper	2018-08-12	1	-1/+1
\| \| \| \| \| \|	The use of the or_is_add predicate already gives enough of a complexity boost to get the patterns ordered properly. llvm-svn: 339507
*	[X86] Change the MOV32ri64 pseudo instruction to def a GR64 directly instead ↵	Craig Topper	2018-08-11	1	-12/+4
\| \| \| \| \| \| \| \| \| \|	of wrapping it in a SUBREG_TO_REG. Now we switch to the subregister in expandPostRAPseudos where we already switched the opcode. This simplifies a few isel patterns that used the pseudo directly. And magically seems to have improved our ability to CSE it in the undef-label.ll test. llvm-svn: 339496
*	[SelectionDAG][X86][SystemZ] Add a generic ↵	Craig Topper	2018-08-07	1	-6/+0
\| \| \| \| \| \| \| \|	nonvolatile_store/nonvolatile_load pattern fragment in TargetSelectionDAG.td Differential Revision: https://reviews.llvm.org/D50358 llvm-svn: 339156