bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[MachineLICM] Hoist TOC-based address instructions	Lei Huang	2017-06-15	1	-0/+110
	Add condition for MachineLICM to safely hoist instructions that utilize non-constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
*	[PowerPC] fix potential verification errors on CFENCE8	Hiroshi Inoue	2017-06-15	3	-12/+12
	This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs. Differential Revision: https://reviews.llvm.org/D34208 llvm-svn: 305479
*	Revert r304907 as it is causing some failures that I cannot reproduce.	Nemanja Ivanovic	2017-06-14	5	-567/+6
	Reverting this until a test case can be provided to aid the investigation. llvm-svn: 305372
*	[PowerPC] Match vec_revb builtins to P9 instructions.	Tony Jiang	2017-06-12	1	-0/+54
	Power9 has instructions that will reverse the bytes within an element for all sizes (half-word, word, double-word and quad-word). These can be used for the vec_revb builtins in altivec.h. However, we implement these to match vector shuffle nodes as that will cover both the builtins and vector shuffles that occur in the SDAG through other means. Differential Revision: https://reviews.llvm.org/D33690 llvm-svn: 305214
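A scalar sketch of the byte reversal described above, for illustration only. The function name `revb_model` and the byte-array representation of a 16-byte vector are assumptions; this is not the code from the patch, which matches vector shuffle nodes in the backend.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: reverse the bytes inside every
   elem_size-byte element of a 16-byte vector.
   elem_size is expected to be 2, 4, 8 or 16. */
void revb_model(uint8_t v[16], size_t elem_size) {
    for (size_t base = 0; base < 16; base += elem_size) {
        for (size_t i = 0; i < elem_size / 2; ++i) {
            uint8_t tmp = v[base + i];
            v[base + i] = v[base + elem_size - 1 - i];
            v[base + elem_size - 1 - i] = tmp;
        }
    }
}
```

Seen this way, the operation is a fixed permutation of the 16 bytes, which is consistent with the message's choice to match it as a vector shuffle.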
*	[Power9] Added support for the modsw, moduw, modsd, modud hardware instructions.	Tony Jiang	2017-06-12	1	-0/+263
	Note that if we need the result of both the divide and the modulo, then we compute the modulo based on the result of the divide rather than using the new hardware instruction. Commit on behalf of STEFAN PINTILIE. Differential Revision: https://reviews.llvm.org/D33940 llvm-svn: 305210
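The note about needing both results can be illustrated with a small C sketch. The names `divmod32` and `divmod_from_divide` are hypothetical, and the code only shows the arithmetic identity, not the instruction selection done by the patch.

```c
#include <stdint.h>

typedef struct {
    int32_t quot;
    int32_t rem;
} divmod32;

/* When both results are needed, one divide is enough: the remainder
   follows from rem = a - quot * b.
   Assumes b != 0 and not (a == INT32_MIN && b == -1). */
divmod32 divmod_from_divide(int32_t a, int32_t b) {
    divmod32 r;
    r.quot = a / b;          /* single divide */
    r.rem = a - r.quot * b;  /* multiply and subtract, no second divide */
    return r;
}
```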
*	[PowerPC] add memcmp test with one constant operand and equality cmp; NFC	Sanjay Patel	2017-06-09	1	-3/+29
	llvm-svn: 305131
*	[CGP] don't expand a memcmp with nobuiltin attribute	Sanjay Patel	2017-06-08	1	-6/+15
	This matches the behavior used in the SDAG when expanding memcmp. For reference, we're intentionally treating the earlier fortified call transforms differently after: https://bugs.llvm.org/show_bug.cgi?id=23093 https://reviews.llvm.org/rL233776 One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers: https://reviews.llvm.org/D19781 https://reviews.llvm.org/D19801 Differential Revision: https://reviews.llvm.org/D34043 llvm-svn: 305007
*	[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64	Guozhi Wei	2017-06-08	2	-14/+33
	In PPCBoolRetToInt, the bool value is changed to i32 type. On ppc64 this may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixes PR32442. Differential Revision: https://reviews.llvm.org/D31407 llvm-svn: 305001
*	[Power9] Exploit vector integer extend instructions	Zaara Syeda	2017-06-08	1	-0/+90
	This patch adds build vector patterns to exploit the vector integer extend instructions:
	- vextsb2w - Vector Extend Sign Byte To Word
	- vextsb2d - Vector Extend Sign Byte To Doubleword
	- vextsh2w - Vector Extend Sign Halfword To Word
	- vextsh2d - Vector Extend Sign Halfword To Doubleword
	- vextsw2d - Vector Extend Sign Word To Doubleword
	Differential Revision: https://reviews.llvm.org/D33510 llvm-svn: 304992
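As a rough scalar illustration of what a byte-to-word sign extension computes per element, here is a sketch assuming each 32-bit word's low-order byte is the one being extended. The function name `extsb2w_model` is hypothetical and the code does not reflect the build vector patterns added by the patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of a byte-to-word sign extension over
   four 32-bit elements: keep the low byte of each word and replicate
   its top bit into the upper 24 bits. */
void extsb2w_model(const uint32_t in[4], int32_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        int32_t byte = (int32_t)(in[i] & 0xFFu);  /* value in 0..255 */
        out[i] = (byte ^ 0x80) - 0x80;            /* portable sign extension */
    }
}
```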
*	[PowerPC] add memcmp test with nobuiltin attr; NFC	Sanjay Patel	2017-06-08	1	-0/+15
	In SDAG, we don't expand libcalls with a nobuiltin attribute. It's not clear if that's correct from the existing code comment: "Don't do the check if marked as nobuiltin for some reason." ...adding a test here either way to show that there is currently a different behavior implemented in the CGP-based expansion. llvm-svn: 304991
*	[CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion	Sanjay Patel	2017-06-08	1	-5/+5
	The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks.
	Instead of this:
		_cmp_eq8:
			movq (%rdi), %rax
			cmpq (%rsi), %rax
			je LBB23_1
		## BB#2:                ## %res_block
			movl $1, %ecx
			jmp LBB23_3
		LBB23_1:
			xorl %ecx, %ecx
		LBB23_3:                ## %endblock
			xorl %eax, %eax
			testl %ecx, %ecx
			sete %al
			retq
	We get this:
		cmp_eq8:
			movq (%rdi), %rcx
			xorl %eax, %eax
			cmpq (%rsi), %rcx
			sete %al
			retq
	And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. I.e., we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally.
	Differential Revision: https://reviews.llvm.org/D34005 llvm-svn: 304987
*	[CGP] avoid zext/trunc of a memcmp expansion compare	Sanjay Patel	2017-06-07	1	-4/+4
	This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG:
		_cmp_eq4_zexted_to_i64:
			movl (%rdi), %ecx
			movl (%rsi), %edx
			xorl %eax, %eax
			cmpq %rdx, %rcx
			sete %al
		_cmp_eq4_better:
			movl (%rdi), %ecx
			xorl %eax, %eax
			cmpl (%rsi), %ecx
			sete %al
	llvm-svn: 304923
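The equivalence behind this change can be sketched in C: comparing two values that were both zero-extended from the same narrower type gives the same answer as comparing the narrow values directly. The function names below are hypothetical and only mirror the labels in the listing above.

```c
#include <stdint.h>
#include <string.h>

/* Compare after widening both 32-bit loads to 64 bits. */
int cmp_eq4_zexted(const void *x, const void *y) {
    uint32_t a, b;
    memcpy(&a, x, sizeof a);
    memcpy(&b, y, sizeof b);
    return (uint64_t)a == (uint64_t)b;
}

/* Compare the 32-bit values directly; zero extension adds identical
   upper bits to both sides, so the result is the same. */
int cmp_eq4_narrow(const void *x, const void *y) {
    uint32_t a, b;
    memcpy(&a, x, sizeof a);
    memcpy(&b, y, sizeof b);
    return a == b;
}
```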
*	[PowerPC] Eliminate integer compare instructions - vol. 5	Nemanja Ivanovic	2017-06-07	5	-6/+567
	Adds handling for i64 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33720 llvm-svn: 304907
*	[PowerPC] Eliminate integer compare instructions - vol. 3	Nemanja Ivanovic	2017-06-07	9	-39/+798
	Adds handling for i32 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33718 llvm-svn: 304901
*	[CGP / PowerPC] use direct compares if there's only one load per block in memcmp() expansion	Sanjay Patel	2017-06-07	1	-20/+14
	I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle. I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time limitation in the DAG. We might have to just avoid those special-cases here in CGP. But regardless of that, I think this is a win for the more general cases. http://rise4fun.com/Alive/cbQ Differential Revision: https://reviews.llvm.org/D33963 llvm-svn: 304849
*	[PowerPC] auto-generate full checks and increase test coverage	Sanjay Patel	2017-06-06	1	-77/+160
	3 of the tests were testing exactly the same thing: memcmp(x, y, 16) != 0. I changed that to test 4, 7, and 16 bytes, so we can see how those differ. llvm-svn: 304838
*	RegisterScavenging: Add ScavengerTest pass	Matthias Braun	2017-06-02	1	-0/+149
	This pass allows running the register scavenging independently of PrologEpilogInserter to allow targeted testing. Also adds some basic register scavenging tests. llvm-svn: 304606
*	[PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.	Sean Fertile	2017-05-31	1	-0/+49
	Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for newer ppc processors. Committing on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D33656 llvm-svn: 304317
*	[PPC] Inline expansion of memcmp	Zaara Syeda	2017-05-31	3	-0/+402
	This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target-specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight-line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case for when the memcmp result is only compared for equality with zero. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313
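A minimal C sketch of the idea described above, assuming a 16-byte compare and an 8-byte maximum load size. The names `load_be64`, `memcmp16_expanded` and `memcmp16_is_equal` are hypothetical, and the real expansion operates on IR in CodeGenPrepare rather than on C source.

```c
#include <stdint.h>
#include <string.h>

/* Load 8 bytes as a big-endian integer so that integer order matches
   memcmp's byte-wise order on any host. */
static uint64_t load_be64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical expansion of memcmp(x, y, 16): load a chunk from each
   source, compare, and exit early when a difference is found. */
int memcmp16_expanded(const void *x, const void *y) {
    const unsigned char *a = x, *b = y;
    uint64_t l = load_be64(a), r = load_be64(b);
    if (l != r)
        return l < r ? -1 : 1;
    l = load_be64(a + 8);
    r = load_be64(b + 8);
    if (l != r)
        return l < r ? -1 : 1;
    return 0;
}

/* Equality-only use (memcmp(x, y, 16) == 0): ordering is irrelevant,
   so raw loads are enough and no byte-order handling is needed. */
int memcmp16_is_equal(const void *x, const void *y) {
    uint64_t a[2], b[2];
    memcpy(a, x, 16);
    memcpy(b, y, 16);
    return a[0] == b[0] && a[1] == b[1];
}
```

The sign of the first function's result agrees with `memcmp`, which only guarantees negative, zero or positive rather than a specific value.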
*	[PowerPC] Fix a performance bug for PPC::XXPERMDI.	Tony Jiang	2017-05-31	1	-0/+307
	There are some VectorShuffle nodes in SDAG which can be selected to the XXPERMDI instruction; this patch recognizes them and does the selection to improve PPC performance. Differential Revision: https://reviews.llvm.org/D33404 llvm-svn: 304298
*	[PowerPC] Eliminate integer compare instructions - vol. 3	Nemanja Ivanovic	2017-05-31	6	-9/+605
	This patch builds upon https://reviews.llvm.org/rL302810 to add handling for the 64-bit SETEQ patterns. Differential Revision: https://reviews.llvm.org/D33369 llvm-svn: 304286
*	[PowerPC] Eliminate integer compare instructions - vol. 2	Nemanja Ivanovic	2017-05-31	1	-0/+69
	This patch builds upon https://reviews.llvm.org/rL302810 to add handling for bitwise logical operations in general purpose registers. The idea is to keep the values in GPRs as long as possible, only extracting them to a condition register bit when no further operations are to be done. Differential Revision: https://reviews.llvm.org/D31851 llvm-svn: 304282
*	[AntiDepBreaker] Revert r299124 and add a test.	Tim Shen	2017-05-30	1	-330/+0
	Summary: AntiDepBreaker intends to add all live-outs, including the implicit CSRs, in StartBlock. r299124 was done without understanding that intention. Now with the live-ins propagated correctly (D32464), we can revert this change. Reviewers: MatzeB, qcolombet Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D33697 llvm-svn: 304251
*	LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI	Matthias Braun	2017-05-26	1	-0/+52
	Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal addPristines() function must be called on an empty set or it may accidentally reset saved registers.
	- addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine).
	- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().
	This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 llvm-svn: 304001
*	Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"	Matthias Braun	2017-05-26	1	-52/+0
	Tentatively revert this to see if it fixes the buildbot stage2 breakages. This reverts commit r303938. This reverts commit r303954. llvm-svn: 303960
*	Test for r303938	Matthias Braun	2017-05-26	1	-0/+52
	llvm-svn: 303954
*	[PPC] Fix atomics lowering in DAG lowering.	Tim Shen	2017-05-25	1	-0/+23
	I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch. Inspired by the test case, D33519 also tries to remove the extra sync. Differential Revision: https://reviews.llvm.org/D33573 llvm-svn: 303931
*	[PowerPC] Fix a performance bug for PPC::XXSLDWI.	Tony Jiang	2017-05-24	4	-37/+344
	There are some VectorShuffle nodes in SDAG which can be selected to the XXSLDWI instruction; this patch recognizes them and does the selection to improve PPC performance. llvm-svn: 303822
*	P9: D-form vector load/store.	Zaara Syeda	2017-05-24	10	-209/+209
	Differential Revision: https://reviews.llvm.org/D33248 llvm-svn: 303780
*	Summary	Hiroshi Inoue	2017-05-21	1	-0/+33
	PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from the peephole optimization pass. This patch improves this optimization to eliminate more compare instructions in two common cases.
	- comparison against a constant 1 or -1
	The record-form instructions set the CR bit based on a signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant. This patch enables record-form optimization for a constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0". With this patch, the compare can be eliminated in the following code sequence, as an example.
		uint64_t a, b;
		if ((a | b) & 0x8000000000000000ull) { ... }
		else { ... }
	- andi for 32-bit comparison on PPC64
	Record-form instructions execute a 64-bit signed comparison, so there are limitations in eliminating a 32-bit comparison (i.e. with cmplwi) using the record form. The original implementation already has such checks, but andi. is not recognized as an instruction which executes implicit zero extension and is hence safe to convert into record form if used for an equality check.
		%1 = and i32 %a, 10
		%2 = icmp ne i32 %1, 0
		br i1 %2, label %foo, label %bar
	In this simple example, LLVM generates andi. + cmplwi + beq on PPC64. This patch makes it possible to eliminate the cmplwi for this case. I added andi. to the optimization targets if it is safe to do so.
	Differential Revision: https://reviews.llvm.org/D30081 llvm-svn: 303500
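The two condition rewrites named in the message rest on simple integer identities, sketched below in C. The function names are hypothetical, and the code only demonstrates that the rewritten conditions are equivalent for signed 64-bit values; it does not model record-form instructions or CR bits.

```c
#include <stdint.h>

/* Conditions against the constants -1 and 1 ... */
int gt_minus_one(int64_t x) { return x > -1; }
int lt_one(int64_t x)       { return x < 1; }

/* ... and the equivalent conditions against 0, which is the form a
   signed comparison against zero can answer directly. */
int ge_zero(int64_t x)      { return x >= 0; }
int le_zero(int64_t x)      { return x <= 0; }

/* The sign-bit test from the message's example, written out. */
int sign_bit_set(uint64_t a, uint64_t b) {
    return ((a | b) & 0x8000000000000000ull) != 0;
}
```

For integers there is no value strictly between -1 and 0 or between 0 and 1, which is why each pair of predicates agrees on every input.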
*	CodeGen: Power: Add lowering for shifts of v1i128.	Kyle Butt	2017-05-17	1	-4/+88
	When legalizing vector operations on vNi128, they will be split to v1i128 because that is a legal type on ppc64, but then the compiler will crash in selection dag because it fails to select for these operations. This patch fixes shift operations. Logical shift right and left shift can be performed in the vector unit, but algebraic shift right requires being split. Differential Revision: https://reviews.llvm.org/D32774 llvm-svn: 303307
*	[PPC] Properly update register save area offsets	Krzysztof Parzyszek	2017-05-17	3	-0/+157
	The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase led to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 llvm-svn: 303257
*	[PPC] Add -ppc-asm-full-reg-names to atomic-2.ll. NFC.	Tim Shen	2017-05-16	1	-6/+6
	Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303209
*	[PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.	Tim Shen	2017-05-16	4	-18/+80
	Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. expandPostRAPseudo is chosen because previous passes are likely to eliminate instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303205
*	Elide stores which are overwritten without being observed.	Nirav Dave	2017-05-16	1	-4/+4
	Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This causes small improvements when dealing with intrinsics lowered to stores.
	Test notes:
	* Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away.
	* Many X86 test cases optimized out instructions associated with va_start.
	* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test.
	Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 llvm-svn: 303198
*	CodeGen: BlockPlacement: Increase tail duplication size for O3.	Kyle Butt	2017-05-15	1	-3/+94
	At O3 we are more willing to increase size if we believe it will improve performance. The current threshold for tail-duplication of 2 instructions is conservative, and can be relaxed at O3.
	Benchmark results:
	llvm test-suite:
	- 6% improvement in aha, due to duplication of loop latch
	- 3% improvement in hexxagon
	- 2% slowdown in lpbench. Seems related, but couldn't completely diagnose.
	Internal google benchmark:
	- Produces 4% improvement on internal google protocol buffer serialization benchmarks.
	Differential-Revision: https://reviews.llvm.org/D32324 llvm-svn: 303084
*	[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0	Guozhi Wei	2017-05-11	1	-0/+22
	According to the Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused 401.bzip2 to generate wrong output when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 llvm-svn: 302834
*	[PowerPC] Eliminate integer compare instructions - vol. 1	Nemanja Ivanovic	2017-05-11	13	-6/+1658
	This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 llvm-svn: 302810
*	Add extra operand to CALLSEQ_START to keep frame part set up previously	Serge Pavlov	2017-05-09	1	-2/+2
	Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to the CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps the total frame size, as the caller can be responsible for cleanup of the entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame sizes and the difference is treated by MachineVerifier as a stack error. Currently there is no way to distinguish this case from actual errors.
	This patch adds an additional argument to CALLSEQ_START and its target-specific counterparts to keep the size of the stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate the actual frame size associated with the frame setup instruction and correctly process the case of inalloca arguments.
	The changes made by the patch are:
	- Frame setup instructions get a second mandatory argument. It affects all targets that use frame pseudo instructions and touches many files, although the changes are uniform.
	- Access to frame properties is implemented using special instructions rather than calls to getOperand(N).getImm(). For X86 and ARM such replacement was made previously.
	- Changes that reflect the appearance of the additional argument of the frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments.
	- MachineVerifier retrieves the frame size using a method which reports the sum of the frame parts initialized inside the frame instruction pair and outside it.
	The patch implements the approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
	It fixes 9 tests that failed with the machine verifier enabled and are listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
*	[PPC] When restoring R30 (PIC base pointer), mark it as <def>	Krzysztof Parzyszek	2017-05-04	1	-0/+30
	This happened on the PPC32/SVR4 path and was discovered when building FreeBSD on PPC32. It was a typo-class error in the frame lowering code. This fixes PR26519. llvm-svn: 302183
*	[PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction	Tim Shen	2017-05-03	1	-39/+0
	Summary: This is the corresponding llvm change to D28037 to ensure no performance regression. Reviewers: bogner, kbarton, hfinkel, iteratee, echristo Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28329 llvm-svn: 301990
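For reference, the source-level pattern named in the commit title can be written out in C as below. The function names are hypothetical and a 32-bit operand is assumed; the sketch only shows that the modulo is equivalent to masking the shift amount, and says nothing about which instruction the backend selects.

```c
#include <stdint.h>

/* The pattern from the commit title: the shift amount is reduced
   modulo the bit width, so the shift is well defined for any b. */
uint32_t shl_mod(uint32_t a, uint32_t b) {
    return a << (b % (sizeof(a) * 8));
}

/* Because the bit width (32) is a power of two, the modulo is the
   same as keeping the low five bits of the shift amount. */
uint32_t shl_mask(uint32_t a, uint32_t b) {
    return a << (b & 31u);
}
```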
*	[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE	Nemanja Ivanovic	2017-05-02	6	-80/+89
	Fixes PR30730. This is a re-commit of a pulled commit. The commit was pulled because some software projects contained uses of Altivec vectors that violated alignment requirements. Known issues have now been fixed. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D26861 llvm-svn: 301892
*	[StackMaps] Increase the size of the "location size" field	Sanjoy Das	2017-04-28	2	-71/+199
	Summary: In some cases LLVM (especially the SLP vectorizer) will create vectors that are 256 bytes (or larger). Given that this is intentional[0] and is likely to get more common, this patch updates the StackMap binary format to deal with the spill locations for said vectors. This change also bumps the stack map version from 2 to 3. [0]: https://reviews.llvm.org/D32533#738350 Reviewers: reames, kavon, skatkov, javed.absar Subscribers: mcrosier, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D32629 llvm-svn: 301615
*	Don't emit CFI instructions at the end of a function	Adrian Prantl	2017-04-24	1	-5/+1
	When functions are terminated by unreachable instructions, the last instruction might trigger a CFI instruction to be generated. However, emitting it would be illegal since the function (and thus the FDE the CFI is in) has already ended with the previous instruction. Darwin's dwarfdump --verify --eh-frame complains about this and the specification supports this. Relevant bits from the DWARF 5 standard (6.4 Call Frame Information):
	"[The] address_range [field in an FDE]: The number of bytes of program instructions described by this entry."
	"Row creation instructions: [...] The new location value is always greater than the current one."
	The first quotation implies that a CFI cannot describe a target address outside of the enclosing FDE's range. rdar://problem/26244988 Differential Revision: https://reviews.llvm.org/D32246 llvm-svn: 301219
*	[DAG] add splat vector support for 'xor' in SimplifyDemandedBits	Sanjay Patel	2017-04-19	1	-4/+2
	This allows forming more 'not' ops, so we get improvements for ISAs that have and-not. Follow-up to: https://reviews.llvm.org/rL300725 llvm-svn: 300763
*	[PowerPC] add test and auto-generate checks; NFC	Sanjay Patel	2017-04-19	1	-19/+33
	llvm-svn: 300700
*	[PowerPC] multiply-with-overflow might use the CTR register	Hal Finkel	2017-04-11	1	-0/+34
	Check the legality of ISD::[US]MULO to see whether Intrinsic::[us]mul_with_overflow will legalize into a function call (and, thus, will use the CTR register). Fixes PR32485. Patch by Tim Neumann! Differential Revision: https://reviews.llvm.org/D31790 llvm-svn: 299910
*	Add address space mangling to lifetime intrinsics	Matt Arsenault	2017-04-10	8	-51/+51
	In preparation for allowing allocas to have non-0 addrspace. llvm-svn: 299876
*	Turn on -addr-sink-using-gep by default.	Eli Friedman	2017-04-06	2	-3/+2
	The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here. Differential Revision: https://reviews.llvm.org/D30596 llvm-svn: 299723
*	[DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401)	Sanjay Patel	2017-04-05	1	-10/+11
	This is a generic combine enabled via target hook to reduce icmp logic as discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 It's likely that other targets will want to enable this hook for scalar transforms, and there are probably other patterns that can use bitwise logic to reduce comparisons. Note that we are missing an IR canonicalization for these patterns, and we will probably prefer the pair-of-compares form in IR (shorter, more likely to fold). Differential Revision: https://reviews.llvm.org/D31483 llvm-svn: 299542
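One way to read "and-of-seteq / or-of-setne to bitwise logic+setcc" is the identity sketched below in C, where two equality compares are merged into xor/or followed by a single compare against zero. This is an assumption about the shape of the combine, written with hypothetical function names; the exact DAG pattern is defined by the patch itself.

```c
#include <stdint.h>

/* Pair-of-compares forms. */
int and_of_eq(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return (a == b) & (c == d);
}
int or_of_ne(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return (a != b) | (c != d);
}

/* Bitwise-logic forms: a ^ b is zero exactly when a == b, so the OR of
   both xors is zero exactly when both pairs are equal. One compare
   against zero then replaces two compares. */
int and_of_eq_folded(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return ((a ^ b) | (c ^ d)) == 0;
}
int or_of_ne_folded(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return ((a ^ b) | (c ^ d)) != 0;
}
```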