bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[PowerPC] Make no-PIC default to match GCC - LLVM	Stefan Pintilie	2018-12-04	79	-664/+5187
	Change the default for PowerPC LE to -fno-PIC. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 348298
*	[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction	Kang Zhang	2018-12-03	1	-0/+162
	Summary: Four instructions have an inconsistent ImmMustBeMultipleOf in the function PPCInstrInfo::instrHasImmForm: LFS, LFD, STFS, and STFD. These four instructions should set ImmMustBeMultipleOf to 1 instead of 4. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54738 llvm-svn: 348109
*	[PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making ↵	Li Jia He	2018-11-29	1	-16/+16
	the instruction selection Summary: A signed comparison of i1 values produces the opposite result to an unsigned one if the condition code includes less-than or greater-than. This is so because 1 is the most negative signed i1 number and the most positive unsigned i1 number. The CR-logical operations used for such comparisons are non-commutative, so for signed comparisons vs. unsigned ones, the input operands just need to be swapped. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54825 llvm-svn: 347831
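The operand-swap argument above can be checked exhaustively, since i1 has only two values. The following is a small sketch in Python rather than LLVM code; the helper names (`as_signed_i1`, `slt`, `ult`, and so on) are made up for illustration and do not appear in the patch.

```python
# Model i1 values as the bits 0 and 1.
# Signed view: bit 1 means -1 (the most negative i1), bit 0 means 0.
# Unsigned view: bit 1 means 1 (the most positive i1), bit 0 means 0.

def as_signed_i1(bit: int) -> int:
    return -bit

def as_unsigned_i1(bit: int) -> int:
    return bit

def slt(a: int, b: int) -> bool:
    """Signed less-than on i1 bits."""
    return as_signed_i1(a) < as_signed_i1(b)

def ult(a: int, b: int) -> bool:
    """Unsigned less-than on i1 bits."""
    return as_unsigned_i1(a) < as_unsigned_i1(b)

def sle(a: int, b: int) -> bool:
    """Signed less-or-equal on i1 bits."""
    return as_signed_i1(a) <= as_signed_i1(b)

def ule(a: int, b: int) -> bool:
    """Unsigned less-or-equal on i1 bits."""
    return as_unsigned_i1(a) <= as_unsigned_i1(b)

PAIRS = [(a, b) for a in (0, 1) for b in (0, 1)]

# A signed comparison equals the unsigned one with its operands swapped.
swap_holds_lt = all(slt(a, b) == ult(b, a) for a, b in PAIRS)
swap_holds_le = all(sle(a, b) == ule(b, a) for a, b in PAIRS)
```

Both flags come out true for all four operand pairs, which is the property the commit message relies on when it says the inputs "just need to be swapped".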
*	[PowerPC] [NFC] Add test cases to the ISD::BR_CC node in the instruction ↵	Li Jia He	2018-11-29	1	-0/+602
	selection
	Add the following test case for the ISD::BR_CC node in the instruction selection:
	  define i64 @testi64slt(i64 %c1, i64 %c2, i64 %c3, i64 %c4, i64 %a1, i64 %a2) #0 {
	  entry:
	    %cmp1 = icmp eq i64 %c3, %c4
	    %cmp3tmp = icmp eq i64 %c1, %c2
	    %cmp3 = icmp slt i1 %cmp3tmp, %cmp1
	    br i1 %cmp3, label %iftrue, label %iffalse
	  iftrue:
	    ret i64 %a1
	  iffalse:
	    ret i64 %a2
	  }
	The data type i64 can be replaced by i32, i64, float, or double.
	The condition codes can be replaced by: SETEQ, SETNE, SETLT, SETLE, SETGT, SETGE, SETULT, SETULE, SETUGT, and SETUGE.
	Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54824 llvm-svn: 347828
*	[LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use ↵	Craig Topper	2018-11-26	2	-120/+120
	SplitVecOp_TruncateHelper for FP_TO_SINT/UINT. SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86, which doesn't support FP_TO_UINT until AVX512. This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple of things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral. Differential Revision: https://reviews.llvm.org/D54906 llvm-svn: 347593
*	Revert "[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction"	Kang Zhang	2018-11-26	1	-161/+0
	This reverts commit r347532. The option -mtriple powerpc64-unknown-linux-gnu was omitted from the test, so it fails on every platform except PowerPC. llvm-svn: 347534
*	[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction	Kang Zhang	2018-11-26	1	-0/+161
	Summary: Four instructions have an inconsistent ImmMustBeMultipleOf in the function PPCInstrInfo::instrHasImmForm: LFS, LFD, STFS, and STFD. These four instructions should set ImmMustBeMultipleOf to 1 instead of 4. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D54738 llvm-svn: 347532
*	[DAGCombiner] form 'not' ops ahead of shifts (PR39657)	Sanjay Patel	2018-11-22	15	-54/+53
	We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'), but that would not matter without this patch because DAGCombiner was reversing that transform. I think we need this transform in the backend regardless of what happens in IR to catch cases where the shift-xor is formed late from GEP or other ops.
	https://rise4fun.com/Alive/NC1
	  Name: shl
	  Pre: (-1 << C2) == C1
	  %shl = shl i8 %x, C2
	  %r = xor i8 %shl, C1
	  =>
	  %not = xor i8 %x, -1
	  %r = shl i8 %not, C2

	  Name: shr
	  Pre: (-1 u>> C2) == C1
	  %sh = lshr i8 %x, C2
	  %r = xor i8 %sh, C1
	  =>
	  %not = xor i8 %x, -1
	  %r = lshr i8 %not, C2
	https://bugs.llvm.org/show_bug.cgi?id=39657 llvm-svn: 347478
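The two Alive rewrites above are small enough to check by brute force over every i8 value and every in-range shift amount. This is only an illustrative Python model of 8-bit arithmetic, not the DAGCombiner code; the function names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0xFF, models i8 wraparound

def shl_then_xor(x: int, c2: int) -> int:
    """Original form: xor (shl x, C2), C1 with C1 == (-1 << C2)."""
    c1 = (MASK << c2) & MASK
    return ((x << c2) & MASK) ^ c1

def not_then_shl(x: int, c2: int) -> int:
    """Rewritten form: shl (xor x, -1), C2."""
    return ((x ^ MASK) << c2) & MASK

def lshr_then_xor(x: int, c2: int) -> int:
    """Original form: xor (lshr x, C2), C1 with C1 == (-1 u>> C2)."""
    c1 = MASK >> c2
    return (x >> c2) ^ c1

def not_then_lshr(x: int, c2: int) -> int:
    """Rewritten form: lshr (xor x, -1), C2."""
    return (x ^ MASK) >> c2

# Exhaustive check over all i8 values and all valid shift amounts.
shl_equivalent = all(
    shl_then_xor(x, c) == not_then_shl(x, c)
    for x in range(1 << BITS) for c in range(BITS)
)
lshr_equivalent = all(
    lshr_then_xor(x, c) == not_then_lshr(x, c)
    for x in range(1 << BITS) for c in range(BITS)
)
```

Both checks pass because a shift distributes over xor, so shifting the all-ones mask yields exactly the constant C1 required by each precondition.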
*	[PowerPC] Do not use vectors to codegen bswap with Altivec turned off	Nemanja Ivanovic	2018-11-21	1	-4/+28
	We have efficient codegen on P9 for lowering bswap that involves moving the value into a vector reg and moving it back. However, the check under which we custom lowered it did not adequately reflect the actual requirements. It required only that the subtarget be an implementation of ISA 3.0 since all compliant implementations have to provide the vector instructions. However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9 (i.e. don't emit vector code, don't have to save vector regs for context switch). So we should require the correct features for this lowering. Fixes https://bugs.llvm.org/show_bug.cgi?id=39334 llvm-svn: 347376
*	[PowerPC] Add Itineraries for STWU/STWUX etc	Jinsong Ji	2018-11-20	1	-2/+2
	When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to the machine scheduler, those missing itineraries might not affect the actual scheduling, because we can still get the same latency due to default values. With the machine scheduler, however, itineraries do affect scheduling. E.g. NumMicroOps defaults to 0 if there are no itineraries for a specific instruction class, while most instruction classes with itineraries have NumMicroOps defaulting to 1. This affects the count of RetiredMOps and the Pending/Available queues, which in turn causes different or suboptimal scheduling. This patch is for STWU/STWUX (IIC_LdStStoreUpd) for P8. Since there are already multiple IICs for store update, this patch also merges IIC_LdStSTDU/IIC_LdStStoreUpd into IIC_LdStSTU and IIC_LdStSTDUX into IIC_LdStSTUX, and we add a new testcase in https://reviews.llvm.org/D54699 to show the difference. Differential Revision: https://reviews.llvm.org/D54700 llvm-svn: 347311
*	[PowerPC][NFC]Add testcase for STWU scheduling check	Jinsong Ji	2018-11-20	1	-0/+72
	This patch adds a STWU testcase for the scheduling check. Currently P7/P8, which use itineraries, are missing IIC_LdStStoreUpd. We use the CHECK-ITIN prefix to check P7/P8, then use the default for P9 (and future). We will fix the missing itineraries of IIC_LdStStoreUpd in a following patch, and update this testcase to show the scheduling difference only there. Differential Revision: https://reviews.llvm.org/D54699 llvm-svn: 347310
*	[PowerPC] Don't combine to bswap store on 1-byte truncating store	Nemanja Ivanovic	2018-11-20	1	-0/+26
	Turns out that there was no check for a store that truncates down to a single byte when combining a (store (bswap...)) into a byte-swapping store. This patch just adds that check. Fixes https://bugs.llvm.org/show_bug.cgi?id=39478. llvm-svn: 347288
*	[PowerPC][NFC] Add tests for vector fp <-> int conversions	Nemanja Ivanovic	2018-11-16	16	-0/+14768
	This NFC patch just adds test cases for conversions that currently require scalarization of vectors. An upcoming patch will change the legalization for these, and it is more suitable for the review to show the differences in code gen rather than just the new code gen. llvm-svn: 347090
*	Revert "[PowerPC] Make no-PIC default to match GCC - LLVM"	Stefan Pintilie	2018-11-16	80	-515/+580
	This reverts commit r347069 llvm-svn: 347076
*	[PowerPC] Make no-PIC default to match GCC - LLVM	Stefan Pintilie	2018-11-16	80	-580/+515
	Set -fno-PIC as the default option. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 347069
*	[PowerPC] Enhance the selection(ISD::VSELECT) of vector type	Zi Xuan Wu	2018-11-14	1	-5/+98
	Make ISD::VSELECT available (legal) as long as Altivec instructions are available; otherwise its default behavior is expansion, which is legalized at the type-legalization phase. Use xxsel to match vselect if VSX is enabled, or vsel otherwise. Differential Revision: https://reviews.llvm.org/D49531 llvm-svn: 346824
*	[Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics	Zaara Syeda	2018-11-05	2	-48/+56
	On Power9, we don't have patterns to select the following intrinsics: llvm.ppc.vsx.stxvw4x.be and llvm.ppc.vsx.stxvd2x.be. This patch adds support for these. Differential Revision: https://reviews.llvm.org/D53581 llvm-svn: 346148
*	[PowerPC] Support constraint 'wi' in asm	Li Jia He	2018-11-01	2	-0/+24
	From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D53265 llvm-svn: 345810
*	MachineOperand/MIParser: Do not print debug-use flag, infer it	Matthias Braun	2018-10-30	2	-3/+3
	The debug-use flag must be set exactly for uses on DBG_VALUEs. This is so obvious that it can be trivially inferred while parsing. This will reduce noise when printing while omitting information that has little value to the user. The parser will keep recognizing the flag for compatibility with old `.mir` files. Differential Revision: https://reviews.llvm.org/D53903 llvm-svn: 345671
*	Relax fast register allocator related test cases; NFC	Matthias Braun	2018-10-29	3	-13/+13
	- Relax hard coded registers and stack frame sizes
	- Some test cleanups
	- Change phi-dbg.ll to match on mir output after phi elimination instead of going through the whole codegen pipeline.
	This is in preparation for https://reviews.llvm.org/D52010. I'm committing all the test changes upfront that work before and after independently. llvm-svn: 345532
*	[PowerPC] Improve BUILD_VECTOR of 4 i32s	Lei Huang	2018-10-26	1	-104/+84
	Currently, for this node:
	  vector int test(int a, int b, int c, int d) {
	    return (vector int) { a, b, c, d };
	  }
	we get this on Power9:
	  mtvsrdd 34, 5, 3
	  mtvsrdd 35, 6, 4
	  vmrgow 2, 3, 2
	and this on Power8:
	  mtvsrwz 0, 3
	  mtvsrwz 1, 5
	  mtvsrwz 2, 4
	  mtvsrwz 3, 6
	  xxmrghd 34, 1, 0
	  xxmrghd 35, 3, 2
	  vmrgow 2, 3, 2
	This can be improved to this on LE Power9:
	  rldimi 3, 4, 32, 0
	  rldimi 5, 6, 32, 0
	  mtvsrdd 34, 5, 3
	and this on LE Power8:
	  rldimi 3, 4, 32, 0
	  rldimi 5, 6, 32, 0
	  mtvsrd 34, 3
	  mtvsrd 35, 5
	  xxpermdi 34, 35, 34, 0
	This patch updates the TD pattern to generate the optimized sequence for both Power8 and Power9 on LE and BE. Differential Revision: https://reviews.llvm.org/D53494 llvm-svn: 345414
*	[PowerPC] Fix some missed optimization opportunities in combineSetCC	Li Jia He	2018-10-26	1	-56/+28
	When both operands are bool, short, int, long, or long long, add the following optimizations. 1. 0-x == y --> x+y == 0 2. 0-x != y --> x+y != 0 Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D53360 llvm-svn: 345366
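The two rewrites hold under wraparound integer arithmetic, which can be confirmed exhaustively at a small bit width. The sketch below models 8-bit integers in Python; the width and the helper names are chosen only for illustration and are not taken from combineSetCC.

```python
BITS = 8
MASK = (1 << BITS) - 1  # wrap results to 8 bits

def neg_eq(x: int, y: int) -> bool:
    """Original form: 0 - x == y (with wraparound)."""
    return ((0 - x) & MASK) == y

def sum_is_zero(x: int, y: int) -> bool:
    """Rewritten form: x + y == 0 (with wraparound)."""
    return ((x + y) & MASK) == 0

VALUES = range(1 << BITS)

# Rule 1: 0-x == y  -->  x+y == 0
eq_rule_holds = all(neg_eq(x, y) == sum_is_zero(x, y)
                    for x in VALUES for y in VALUES)

# Rule 2: 0-x != y  -->  x+y != 0 (the negation of rule 1)
ne_rule_holds = all((not neg_eq(x, y)) == (not sum_is_zero(x, y))
                    for x in VALUES for y in VALUES)
```

The rewritten form drops the subtraction, leaving an add followed by a compare against zero.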
*	[PowerPC][NFC] Add tests for some missed optimization opportunities in ↵	Li Jia He	2018-10-26	1	-0/+436
	combineSetCC When both operands are bool, short, int, long, or long long, add test cases for the following optimizations. 1. 0-x == y --> x+y == 0 2. 0-x != y --> x+y != 0 Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D53358 llvm-svn: 345365
*	[PowerPC] Keep vector int to fp conversions in vector domain	Nemanja Ivanovic	2018-10-26	1	-0/+192
	At present a v2i16 -> v2f64 convert is implemented by extracts to scalar, scalar converts, and merge back into a vector. Use vector converts instead, with the int data permuted into the proper position and extended if necessary. Patch by RolandF. Differential revision: https://reviews.llvm.org/D53346 llvm-svn: 345361
*	[Power9] Add __float128 support in the backend for bitcast to a i128	Stefan Pintilie	2018-10-23	1	-0/+53
	Add support to allow bit-casting from f128 to i128 and then extracting 64 bits from the result. Differential Revision: https://reviews.llvm.org/D49507 llvm-svn: 345053
*	[PowerPC][NFC] Fix bugs in r+r to r+i conversion	Nemanja Ivanovic	2018-10-22	1	-12/+12
	The D-Form VSX loads introduced in ISA 3.0 are not direct D-Form equivalents of the corresponding X-Forms since they only target the Altivec registers. Namely, LXSSPX can load into any of the 64 VSX registers whereas LXSSP can only load into the upper 32 VSX registers. The same applies to the remaining affected instructions. There is currently no way that I can see to trigger the bug, but as we add other ways of exploiting these instructions, there may very well be instances that do. This is an NFC patch in practical terms since the changes it introduces cannot be triggered without an MIR test. Differential revision: https://reviews.llvm.org/D53323 llvm-svn: 344894
*	[PowerPC] avoid masking already-zero bits in BitPermutationSelector	Hiroshi Inoue	2018-10-12	4	-13/+39
	The current BitPermutationSelector generates code to build a value by tracking two types of bits: ConstZero and Variable. ConstZero means a bit we need to mask off and Variable is a bit we copy from an input value. This patch adds a third type of bit, VariableKnownToBeZero, caused by an AssertZext node or a zero-extending load node. VariableKnownToBeZero means a bit comes from an input value, but it is known to be already zero, so we do not need to mask it. VariableKnownToBeZero enhances flexibility to group bits, since we can avoid redundant masking for these bits. This patch also renames "HasZero" to "NeedMask" since now we may skip masking even when we have zeros (of type VariableKnownToBeZero). Differential Revision: https://reviews.llvm.org/D48025 llvm-svn: 344347
*	[DAG] Fix Big Endian in Load-Store forwarding	Nirav Dave	2018-10-11	1	-0/+16
	Summary: Correct offset calculation in load-store forwarding for big-endian targets. Reviewers: rnk, RKSimon, waltl Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53147 llvm-svn: 344272
*	[DAGCombine] Improve Load-Store Forwarding	Nirav Dave	2018-10-10	1	-3/+2
	Summary: Extend the analysis that forwards loads from preceding stores to work with extended loads and truncated stores to the same address, so long as the load is fully subsumed by the store. Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll tests are deleted as they no longer seem to be relevant. Reviewers: RKSimon, rnk, kparzysz, javed.absar Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D49200 llvm-svn: 344142
*	[PowerPC][NFC] Add a test case for extract and store patterns	Nemanja Ivanovic	2018-10-10	1	-0/+339
	An upcoming patch will change the codegen for these patterns. This test case is added now so that the patch can show the differences in codegen. llvm-svn: 344112
*	[PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8	QingShan Zhang	2018-10-10	1	-0/+24
	The ISD::SIGN_EXTEND_INREG operation on v2i16 and v2i8 types causes an assert because they are registered as custom operations. The type legalization phase therefore enters the custom hook, which does not handle ISD::SIGN_EXTEND_INREG and falls through into an unreachable assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52449 llvm-svn: 344109
*	[DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops	Nemanja Ivanovic	2018-10-09	1	-20/+4
	We already do the following combines:
	  (bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X
	  (bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X
	when the target has "bit preserving fp logic". This patch just extends it to also combine:
	  (bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X)
	as some targets have fnabs and even those that don't can efficiently lower both the fabs and the fneg. Differential revision: https://reviews.llvm.org/D44548 llvm-svn: 344093
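The three bit patterns above can be reproduced on IEEE-754 doubles with Python's standard `struct` module. This is a sketch of the underlying bit manipulation only, not of the DAG combine; the function names are hypothetical and the sample values are arbitrary.

```python
import struct

SIGN_BIT = 0x8000000000000000   # sign bit of a 64-bit double
ABS_MASK = SIGN_BIT - 1         # 0x7fff...: every bit except the sign

def to_bits(x: float) -> int:
    """Bitcast a double to a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(b: int) -> float:
    """Bitcast a 64-bit unsigned integer back to a double."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def fabs_via_and(x: float) -> float:
    return from_bits(to_bits(x) & ABS_MASK)    # clear the sign bit

def fneg_via_xor(x: float) -> float:
    return from_bits(to_bits(x) ^ SIGN_BIT)    # flip the sign bit

def fnabs_via_or(x: float) -> float:
    return from_bits(to_bits(x) | SIGN_BIT)    # force the sign bit on

SAMPLES = [1.5, -2.25, 0.0, -0.0, float("inf"), float("-inf")]

# Compare bit patterns so that +0.0 and -0.0 are told apart.
and_is_fabs = all(to_bits(fabs_via_and(x)) == to_bits(abs(x)) for x in SAMPLES)
xor_is_fneg = all(to_bits(fneg_via_xor(x)) == to_bits(-x) for x in SAMPLES)
or_is_fnabs = all(to_bits(fnabs_via_or(x)) == to_bits(-abs(x)) for x in SAMPLES)
```

Forcing the sign bit on gives the same result as fneg (fabs X), which is why the `or` form can be lowered to a single fnabs where the target provides one.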
*	[PowerPC][NFC] Commit nabs test case in preparation for committing D44548	Nemanja Ivanovic	2018-10-09	1	-0/+64
	This just adds the test case so that the different code gen is clearly visible when the DAG Combine lands. llvm-svn: 344091
*	[PowerPC] Implement hasBitPreservingFPLogic for types that can be supported	Nemanja Ivanovic	2018-10-09	1	-0/+126
	This is the PPC-specific non-controversial part of https://reviews.llvm.org/D44548 that simply enables this combine for PPC since PPC has these instructions. This commit will allow the target-independent portion to be truly target independent. llvm-svn: 344077
*	Fix buildbot failures with the newly added test case (triple was missing).	Nemanja Ivanovic	2018-10-09	1	-1/+1
	llvm-svn: 344037
*	[PowerPC] Remove self-copies in pre-emit peephole	Nemanja Ivanovic	2018-10-09	1	-0/+128
	There are occasionally instances where AADB rewrites registers in such a way that a reg-reg copy becomes a self-copy. Such an instruction is obviously redundant and can be removed. This patch does precisely that. Note that this will not remove various nop's that we insert (which are themselves just self-copies). The reason those are left alone is that all of them have their own opcodes (that just encode to a self-copy). What prompted this patch is the fact that these self-copies sometimes end up using registers that make the instruction a priority-setting nop, thereby having a significant effect on performance. Differential revision: https://reviews.llvm.org/D52432 llvm-svn: 344036
*	[PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads.	Stefan Pintilie	2018-10-01	1	-0/+16
	Going from XForm Load to DSForm Load requires that the immediate be 4 byte aligned. If we are not aligned we must leave the load as LDX (XForm). This bug is causing a compile-time failure in the benchmark h264ref. Differential Revision: https://reviews.llvm.org/D51988 llvm-svn: 343525
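The legality condition described above amounts to a simple predicate on the displacement. The sketch below is a hypothetical Python model, not the code in PPCInstrInfo: the function name is invented, and the signed 16-bit range check is an added assumption about the DS-form displacement field that the commit message does not state.

```python
def can_fold_to_ds_form(displacement: int) -> bool:
    """Return True if a register+register load may become a DS-form load.

    Hypothetical helper for illustration only.
    """
    # From the commit message: the immediate must be 4-byte aligned.
    is_aligned = displacement % 4 == 0
    # Assumption (not stated in the commit message): the displacement
    # must also fit in a signed 16-bit field.
    fits_field = -32768 <= displacement <= 32767
    return is_aligned and fits_field

def choose_load_form(displacement: int) -> str:
    """Pick the DS-form (LD) when legal, otherwise keep the X-form (LDX)."""
    return "LD" if can_fold_to_ds_form(displacement) else "LDX"
```

For example, `choose_load_form(16)` selects the DS-form, while `choose_load_form(18)` keeps LDX because 18 is not a multiple of 4.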
*	[PHIElimination] Update the regression test for PR16508	Bjorn Pettersson	2018-09-30	2	-28/+92
	Summary: When PR16508 was solved (in rL185363) a regression test was added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll. I discovered that the test case no longer reproduced the scenario from PR16508. This problem could have been amended by adding an extra RUN line with "-O1" (or possibly "-O0"), but instead I added a mir-reproducer test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir to get a reproducer that is less sensitive to changes in earlier passes (including O-level). While at it, I also corrected a code comment in PHIElimination::EliminatePHINodes that has been incorrect since the related bugfix from rL185363. Reviewers: MatzeB, hfinkel Reviewed By: MatzeB Subscribers: nemanjai, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52553 llvm-svn: 343416
*	[PowerPC] optimize conditional branch on CRSET/CRUNSET	Hiroshi Inoue	2018-09-26	2	-0/+264
	This patch adds a check to optimize conditional branches (BC and BCn) based on a constant set by CRSET or CRUNSET. Other optimizers, such as block placement, may generate such code, and hence I do this at the very end of the optimization in the pre-emit peephole pass. A conditional branch based on a constant is eliminated or converted into an unconditional branch. Also, CRSET/CRUNSET is eliminated if the condition code register is not used by any instruction other than the branch to be optimized. Differential Revision: https://reviews.llvm.org/D52345 llvm-svn: 343100
*	[Power9] [LLVM] Add __float128 exponent GET and SET builtins	Stefan Pintilie	2018-09-24	1	-0/+36
	Added __builtin_vsx_scalar_extract_expq and __builtin_vsx_scalar_insert_exp_qp. The builtins should behave the same way as in GCC. Differential Revision: https://reviews.llvm.org/D48185 llvm-svn: 342910
*	[PowerPC] Support operand modifier 'x' in inline asm	Zaara Syeda	2018-09-24	1	-0/+22
	gcc uses operand modifier 'x' in inline asm for VSX registers. Without this modifier, instructions which use VSX numbering for their operands are printed as VMX registers. This patch adds support for the operand modifier 'x'. Differential Revision: https://reviews.llvm.org/D52244 llvm-svn: 342882
*	[PowerPC] Fix the assert of combineBVOfConsecutiveLoads when element num is 1	QingShan Zhang	2018-09-20	1	-0/+17
	Building a vector out of multiple loads can be converted to a load of the vector type if the loads are consecutive. The special case is when the number of elements is 1, such as <1 x i128>, so just exit early to fix the assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52072 llvm-svn: 342611
*	[PowerPC] Do not emit record-form rotates when record-form andi/andis suffices	Nemanja Ivanovic	2018-09-18	3	-22/+83
	This is a follow-up to the previous patch that eliminated some of the rotates. With this addition, we will also emit the record-form andis. This patch increases the number of record-form rotates we eliminate by more than 70%. Differential revision: https://reviews.llvm.org/D44897 llvm-svn: 342478
*	[PowerPC] Optimize compares fed by ANDISo	Nemanja Ivanovic	2018-09-18	1	-0/+44
	Both ANDIo and ANDISo (and the 64-bit versions) are record-form instructions. When optimizing compares, we handle the former in order to eliminate the compare instruction but not the latter. This patch just adds the latter to the set of instructions we optimize. The reason these instructions need to be handled separately is that they are not part of the RecFormRel map (since they don't have a non-record-form). The missing "and-immediate-shifted" is just an oversight in the initial implementation. Differential revision: https://reviews.llvm.org/D51353 llvm-svn: 342472
*	Remove trailing whitespace introduced in r342440.	Alexander Kornienko	2018-09-18	1	-3/+3
	llvm-svn: 342463
*	[PowerPC] Add Itineraries of IIC_IntMulHD for P7/P8	QingShan Zhang	2018-09-18	1	-1/+1
	When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to the machine scheduler, those missing itineraries might not affect the actual scheduling, because we can still get the same latency due to default values. With the machine scheduler, however, itineraries do affect scheduling. E.g. NumMicroOps defaults to 0 if there are no itineraries for a specific instruction class, while most instruction classes with itineraries have NumMicroOps defaulting to 1. This affects the count of RetiredMOps and the Pending/Available queues, which in turn causes different or suboptimal scheduling. Patch By: jsji (Jinsong Ji) Differential Revision: https://reviews.llvm.org/D52040 llvm-svn: 342441
*	[PowerPC][NFC] Add a mulld testcase for scheduling check.	QingShan Zhang	2018-09-18	1	-0/+53
	This patch adds a mulld testcase for the scheduling check. Patch By: jsji (Jinsong Ji) Differential Revision: https://reviews.llvm.org/D52039 llvm-svn: 342440
*	[PowerPC] Fix label address calculation for ppc64	Strahinja Petrovic	2018-09-17	1	-0/+21
	This patch fixes the calculation of a label's address for non-PIC ppc64. Differential Revision: https://reviews.llvm.org/D50965 llvm-svn: 342368
*	[PowerPC] Fix the calling convention for i1 arguments on PPC32	Lion Yang	2018-09-14	1	-0/+24
	Summary: Integer types smaller than i32 must be extended to i32 by default. The feature "crbits" introduced at r202451 handles i1 as a special case, but it did not extend properly. The caller was, therefore, passing i1 stack arguments by writing 0/1 to the first byte of the 4-byte stack object, and the callee was reading the first byte for the value. "crbits" is enabled if the optimization level is greater than 1, which is very common in "release builds". Such discrepancies with the ABI specification also introduce potential incompatibility with programs or libraries built with other compilers, e.g. GCC. Fixes PR38661 Reviewers: hfinkel, cuviper Subscribers: sylvestre.ledru, glaubitz, nagisa, nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D51108 llvm-svn: 342288
*	[PowerPC] Combine ADD to ADDZE	QingShan Zhang	2018-09-07	1	-0/+172
	On the ppc64le platform, if the IR has the following form,
	  define i64 @addze1(i64 %x, i64 %z) local_unnamed_addr #0 {
	  entry:
	    %cmp = icmp ne i64 %z, CONSTANT      (-32767 <= CONSTANT <= 32768)
	    %conv1 = zext i1 %cmp to i64
	    %add = add nsw i64 %conv1, %x
	    ret i64 %add
	  }
	we can optimize it to the form below.
	                                  when C == 0
	                               --> addze X, (addic Z, -1)
	                              /
	  add X, (zext(setne Z, C)) --
	                              \   when -32768 <= -C <= 32767 && C != 0
	                               --> addze X, (addic (addi Z, -C), -1)
	Patch By: HLJ2009 (Li Jia He) Differential Revision: https://reviews.llvm.org/D51403 Reviewed By: Nemanjai llvm-svn: 341634
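The transformation works because adding -1 (all ones) to a 64-bit value produces a carry-out exactly when that value is nonzero, and addze then adds that carry to X. The following Python sketch models only that arithmetic, with simplified semantics for the three instructions; it is not the backend implementation, and the function names are illustrative.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers

def addi(reg: int, imm: int) -> int:
    """Add immediate, no carry produced."""
    return (reg + imm) & MASK64

def addic(reg: int, imm: int) -> tuple[int, int]:
    """Add immediate carrying: returns (result, carry-out bit)."""
    total = (reg & MASK64) + (imm & MASK64)
    return total & MASK64, total >> 64

def addze(reg: int, carry: int) -> int:
    """Add the carry bit to a register."""
    return (reg + carry) & MASK64

def add_zext_setne(x: int, z: int, c: int) -> int:
    """Compute x + zext(z != c) via addi / addic / addze."""
    diff = z & MASK64 if c == 0 else addi(z, -c)  # addi is skipped when C == 0
    _, carry = addic(diff, -1)                    # carry is 1 iff diff != 0
    return addze(x, carry)

def reference(x: int, z: int, c: int) -> int:
    """Direct evaluation of the original IR pattern."""
    return (x + (1 if (z & MASK64) != (c & MASK64) else 0)) & MASK64

CASES = [(10, 5, 5), (10, 6, 5), (10, 0, 0), (10, 7, 0),
         (MASK64, 1, 0), (3, -4, -4), (3, -4, 32768), (3, 32768, 32768)]
matches_reference = all(add_zext_setne(x, z, c) == reference(x, z, c)
                        for x, z, c in CASES)
```

The range restriction on C in the commit message corresponds to -C fitting the signed 16-bit immediate of addi; the model above does not enforce it.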