summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [Power9][NFC]Make pre-inc-disable case more robustKewen Lin2018-12-171-28/+28
| | | | | | | With some patch adopted for Power9 vabsd* insns, some CHECKs can't get the expected results. But it's false alarm, we should update the case more robust. llvm-svn: 349325
* [Power9][NFC] add setb exploitation test caseKewen Lin2018-12-151-0/+1302
| | | | | | | | Add an original test case for setb before the exploitation actually takes effect, later we can check the difference. Differential Revision: https://reviews.llvm.org/D55696 llvm-svn: 349251
* [NFC][PowerPC] add verify-machineinstrs checkChen Zheng2018-12-131-1/+1
| | | | | | After rL349029 and rL348566, sj-ctr-loop.ll is ok for verify-machineinstrs check. llvm-svn: 349030
* [PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier.Chen Zheng2018-12-131-2/+21
| | | | | | Differential Revision: https://reviews.llvm.org/D55499 llvm-svn: 349029
* [CodeGen] Allow mempcy/memset to generate small overlapping stores.Clement Courbet2018-12-133-8/+4
| | | | | | | | | | | | | Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 349016
* Revert r348843 "[CodeGen] Allow mempcy/memset to generate small overlapping ↵Clement Courbet2018-12-113-4/+8
| | | | | | | | stores." Breaks ARM/memcpy-inline.ll llvm-svn: 348844
* [CodeGen] Allow mempcy/memset to generate small overlapping stores.Clement Courbet2018-12-113-8/+4
| | | | | | | | | | | | | Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 348843
* [Targets] Fixup incorrect targets in codemodel testsDavid Green2018-12-101-2/+2
| | | | llvm-svn: 348796
* [Targets] Add errors for tiny and kernel codemodel on targets that don't ↵David Green2018-12-071-0/+9
| | | | | | | | | | | support them Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585
* [PowerPC] Fix assert from machine verify pass that missing undef register flagZi Xuan Wu2018-12-0720-28/+28
| | | | | | | | | | | | | | | | | | | | Fix assert about using an undefined physical register in machine instruction verify pass. The reason is that register flag undef is missing when doing transformation from If Conversion Pass. ``` Bad machine code: Using an undefined physical register - function: func_65 - basic block: %bb.0 entry (0x10024740738) - instruction: BCLR killed $cr5lt, implicit $lr8, implicit $rm, implicit undef $x3 - operand 0: killed $cr5lt LLVM ERROR: Found 1 machine code errors. ``` There are also other existing testcases with same issue. So I add -verify-machineinstrs option to open verifying. Differential Revision: https://reviews.llvm.org/D55408 llvm-svn: 348566
* [DAGCombiner] use root SDLoc for all nodes created by logic foldSanjay Patel2018-12-071-4/+4
| | | | | | | | | | | If this is not a valid way to assign an SDLoc, then we get this wrong all over SDAG. I don't know enough about the SDAG to explain this. IIUC, theoretically, debug info is not supposed to affect codegen. But here it has clearly affected 3 different targets, and the x86 change is an actual improvement. llvm-svn: 348552
* [DAGCombiner] don't hoist logic op if operands have other uses, part 2Sanjay Patel2018-12-061-10/+9
| | | | | | | | | The PPC test with 2 extra uses seems clearly better by avoiding this transform. With 1 extra use, we also prevent an extra register move (although that might be an RA problem). The general rule should be to only make a change here if it is always profitable. The x86 diffs are all neutral. llvm-svn: 348518
* [PowerPC] add tests for hoisting bitwise logic; NFCSanjay Patel2018-12-061-0/+72
| | | | llvm-svn: 348516
* [PowerPC] Make no-PIC default to match GCC - LLVMStefan Pintilie2018-12-0479-664/+5187
| | | | | | | | Change the default for PowerPC LE to -fno-PIC. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 348298
* [PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instructionKang Zhang2018-12-031-0/+162
| | | | | | | | | | | | | Summary: There are 4 instructions which have Inconsistent ImmMustBeMultipleOf in the function PPCInstrInfo::instrHasImmForm, they are LFS, LFD, STFS, STFD. These four instructions should set the ImmMustBeMultipleOf to 1 instead of 4. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54738 llvm-svn: 348109
* [PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making ↵Li Jia He2018-11-291-16/+16
| | | | | | | | | | | | | | | | the instruction selection Summary: A signed comparison of i1 values produces the opposite result to an unsigned one if the condition code includes less-than or greater-than. This is so because 1 is the most negative signed i1 number and the most positive unsigned i1 number. The CR-logical operations used for such comparisons are non-commutative so for signed comparisons vs. unsigned ones, the input operands just need to be swapped. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54825 llvm-svn: 347831
* [PowerPC] [NFC] Add test cases to the ISD::BR_CC node in the instruction ↵Li Jia He2018-11-291-0/+602
| | | | | | | | | | | | | | | | | | | | | | | | | selection Add the following test case for the ISD::BR_CC node in the instruction selection define i64 @testi64slt(i64 %c1, i64 %c2, i64 %c3, i64 %c4, i64 %a1, i64 %a2) #0 { entry: %cmp1 = icmp eq i64 %c3, %c4 %cmp3tmp = icmp eq i64 %c1, %c2 %cmp3 = icmp slt i1 %cmp3tmp, %cmp1 br i1 %cmp3, label %iftrue, label %iffalse iftrue: ret i64 %a1 iffalse: ret i64 %a2 } The data type i64 can be replaced by i32, i64, float, double
 And condition codes can be replaced by: SETEQ, SETEN, SELT, SETLE, SETGT, SETGE,SETULT, SETULE, SSETGT, and SETUGE Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54824 llvm-svn: 347828
* [LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use ↵Craig Topper2018-11-262-120/+120
| | | | | | | | | | | | SplitVecOp_TruncateHelper for FP_TO_SINT/UINT. SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86 which doesn't support FP_TO_SINT until AVX512. This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral. Differential Revision: https://reviews.llvm.org/D54906 llvm-svn: 347593
* Revert "[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction"Kang Zhang2018-11-261-161/+0
| | | | | | | | This reverts commits r347532. Forget add the option -mtriple powerpc64-unknown-linux-gnu. So other platform is error except for PowerPC. llvm-svn: 347534
* [PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instructionKang Zhang2018-11-261-0/+161
| | | | | | | | | | | | | Summary: There are 4 instructions which have Inconsistent ImmMustBeMultipleOf in the function PPCInstrInfo::instrHasImmForm, they are LFS, LFD, STFS, STFD. These four instructions should set the ImmMustBeMultipleOf to 1 instead of 4. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D54738 llvm-svn: 347532
* [DAGCombiner] form 'not' ops ahead of shifts (PR39657)Sanjay Patel2018-11-2215-54/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'), but that would not matter without this patch because DAGCombiner was reversing that transform. I think we need this transform in the backend regardless of what happens in IR to catch cases where the shift-xor is formed late from GEP or other ops. https://rise4fun.com/Alive/NC1 Name: shl Pre: (-1 << C2) == C1 %shl = shl i8 %x, C2 %r = xor i8 %shl, C1 => %not = xor i8 %x, -1 %r = shl i8 %not, C2 Name: shr Pre: (-1 u>> C2) == C1 %sh = lshr i8 %x, C2 %r = xor i8 %sh, C1 => %not = xor i8 %x, -1 %r = lshr i8 %not, C2 https://bugs.llvm.org/show_bug.cgi?id=39657 llvm-svn: 347478
* [PowerPC] Do not use vectors to codegen bswap with Altivec turned offNemanja Ivanovic2018-11-211-4/+28
| | | | | | | | | | | | | | We have efficient codegen on P9 for lowering bswap that involves moving the value into a vector reg and moving it back. However, the check under which we custom lowered it did not adequately reflect the actual requirements. It required only that the subtarget be an implementation of ISA 3.0 since all compliant implementations have to provide the vector instructions. However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9 (i.e. don't emit vector code, don't have to save vector regs for context switch). So we should require the correct features for this lowering. Fixes https://bugs.llvm.org/show_bug.cgi?id=39334 llvm-svn: 347376
* [PowerPC] Add Itineraries for STWU/STWUX etcJinsong Ji2018-11-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to machine scheduler, those missing itineraries might not have impact to actually scheduling, because we can still get same latency due to default values. With machine scheduler, however, itineraries will have impact to scheduling. eg: NumMicroOps will default to be 0 if there is NO itineraries for specific instruction class. And most of the instruction class with itineraries will have NumMicroOps default to 1. This will has impact on the count of RetiredMOps, affects the Pending/Available Queue, then causing different scheduling or suboptimal scheduling further. This patch is for STWU/STWUX (IIC_LdStStoreUpd ) for P8. Since there are already multiple IIC for store update, this patch also merge IIC_LdStSTDU/IIC_LdStStoreUpd to IIC_LdStSTU IIC_LdStSTDUX to IIC_LdStSTUX and we add a new testcase in https://reviews.llvm.org/D54699 to show the difference. Differential Revision: https://reviews.llvm.org/D54700 llvm-svn: 347311
* [PowerPC][NFC]Add testcase for STWU scheduling checkJinsong Ji2018-11-201-0/+72
| | | | | | | | | | | | | | This patch add a STWU testcase for scheduling check. Currently P7/P8 which use itineraries are missing IIC_LdStStoreUpd, We use CHECK-ITIN prefix to check P7/P8, then use default for P9 (and future). We will fix the missing itineraries of IIC_LdStStoreUpd in following patch, and update this testcase to show the scheduling difference only there. Differential Revision: https://reviews.llvm.org/D54699 llvm-svn: 347310
* [PowerPC] Don't combine to bswap store on 1-byte truncating storeNemanja Ivanovic2018-11-201-0/+26
| | | | | | | | | | Turns out that there was no check for a store that truncates down to a single byte when combining a (store (bswap...)) into a byte-swapping store. This patch just adds that check. Fixes https://bugs.llvm.org/show_bug.cgi?id=39478. llvm-svn: 347288
* [PowerPC][NFC] Add tests for vector fp <-> int conversionsNemanja Ivanovic2018-11-1616-0/+14768
| | | | | | | | | This NFC patch just adds test cases for conversions that currently require scalarization of vectors. An updcoming patch will change the legalization for these and it is more suitable on the review to show the diferences in code gen rather than just the new code gen. llvm-svn: 347090
* Revert "[PowerPC] Make no-PIC default to match GCC - LLVM"Stefan Pintilie2018-11-1680-515/+580
| | | | | | This reverts commit r347069 llvm-svn: 347076
* [PowerPC] Make no-PIC default to match GCC - LLVMStefan Pintilie2018-11-1680-580/+515
| | | | | | | | Set -fno-PIC as the default option. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 347069
* [PowerPC] Enhance the selection(ISD::VSELECT) of vector typeZi Xuan Wu2018-11-141-5/+98
| | | | | | | | | | To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding, which is legalized at type-legalization phase. Use xxsel to match vselect if vsx is open, or use vsel. Differential Revision: https://reviews.llvm.org/D49531 llvm-svn: 346824
* [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsicsZaara Syeda2018-11-052-48/+56
| | | | | | | | | | | | On Power9, we don't have patterns to select the following intrinsics: llvm.ppc.vsx.stxvw4x.be llvm.ppc.vsx.stxvd2x.be This patch adds support for these. Differential Revision: https://reviews.llvm.org/D53581 llvm-svn: 346148
* [PowerPC] Support constraint 'wi' in asmLi Jia He2018-11-012-0/+24
| | | | | | | | | | From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D53265 llvm-svn: 345810
* MachineOperand/MIParser: Do not print debug-use flag, infer itMatthias Braun2018-10-302-3/+3
| | | | | | | | | | | | | | The debug-use flag must be set exactly for uses on DBG_VALUEs. This is so obvious that it can be trivially inferred while parsing. This will reduce noise when printing while omitting an information that has little value to the user. The parser will keep recognizing the flag for compatibility with old `.mir` files. Differential Revision: https://reviews.llvm.org/D53903 llvm-svn: 345671
* Relax fast register allocator related test cases; NFCMatthias Braun2018-10-293-13/+13
| | | | | | | | | | | | | - Relex hard coded registers and stack frame sizes - Some test cleanups - Change phi-dbg.ll to match on mir output after phi elimination instead of going through the whole codegen pipeline. This is in preparation for https://reviews.llvm.org/D52010 I'm committing all the test changes upfront that work before and after independently. llvm-svn: 345532
* [PowerPC] Improve BUILD_VECTOR of 4 i32sLei Huang2018-10-261-104/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, for this node: vector int test(int a, int b, int c, int d) { return (vector int) { a, b, c, d }; } we get this on Power9: mtvsrdd 34, 5, 3 mtvsrdd 35, 6, 4 vmrgow 2, 3, 2 and this on Power8: mtvsrwz 0, 3 mtvsrwz 1, 5 mtvsrwz 2, 4 mtvsrwz 3, 6 xxmrghd 34, 1, 0 xxmrghd 35, 3, 2 vmrgow 2, 3, 2 This can be improved to this on LE Power9: rldimi 3, 4, 32, 0 rldimi 5, 6, 32, 0 mtvsrdd 34, 5, 3 and this on LE Power8 rldimi 3, 4, 32, 0 rldimi 5, 6, 32, 0 mtvsrd 34, 3 mtvsrd 35, 5 xxpermdi 34, 35, 34, 0 This patch updates the TD pattern to generate the optimized sequence for both Power8 and Power9 on LE and BE. Differential Revision: https://reviews.llvm.org/D53494 llvm-svn: 345414
* [PowerPC] Fix some missed optimization opportunities in combineSetCCLi Jia He2018-10-261-56/+28
| | | | | | | | | | | For both operands are bool, short, int, long, long long, add the following optimization. 1. 0-x == y --> x+y ==0 2. 0-x != y --> x+y != 0 Review: nemanjai Differential Revision: https://reviews.llvm.org/D53360 llvm-svn: 345366
* [PowerPC][NFC] Add tests for some missed optimization opportunities in ↵Li Jia He2018-10-261-0/+436
| | | | | | | | | | | | | combineSetCC For both operands are bool, short, int, long, long long, add the following optimization test case. 1. 0-x == y --> x+y ==0 2. 0-x != y --> x+y != 0 Review: nemanjai Differential Revision: https://reviews.llvm.org/D53358 llvm-svn: 345365
* [PowerPC] Keep vector int to fp conversions in vector domainNemanja Ivanovic2018-10-261-0/+192
| | | | | | | | | | | | At present a v2i16 -> v2f64 convert is implemented by extracts to scalar, scalar converts, and merge back into a vector. Use vector converts instead, with the int data permuted into the proper position and extended if necessary. Patch by RolandF. Differential revision: https://reviews.llvm.org/D53346 llvm-svn: 345361
* [Power9] Add __float128 support in the backend for bitcast to a i128Stefan Pintilie2018-10-231-0/+53
| | | | | | | | | Add support to allow bit-casting from f128 to i128 and then extracting 64 bits from the result. Differential Revision: https://reviews.llvm.org/D49507 llvm-svn: 345053
* [PowerPC][NFC] Fix bugs in r+r to r+i conversionNemanja Ivanovic2018-10-221-12/+12
| | | | | | | | | | | | | | | | | | The D-Form VSX loads introduced in ISA 3.0 are not direct D-Form equivalent of the corresponding X-Forms since they only target the Altivec registers. Namely LXSSPX can load into any of the 64 VSX registers whereas LXSSP can only load into the upper 32 VSX registers. Similarly with the remaining affected instructions. There is currently no way that I can see to trigger the bug, but as we add other ways of exploiting these instructions, there may very well be instances that do. This is an NFC patch in practical terms since the changes it introduces can not be triggered without an MIR test. Differential revision: https://reviews.llvm.org/D53323 llvm-svn: 344894
* [PowerPC] avoid masking already-zero bits in BitPermutationSelectorHiroshi Inoue2018-10-124-13/+39
| | | | | | | | | | | | | | | The current BitPermutationSelector generates a code to build a value by tracking two types of bits: ConstZero and Variable. ConstZero means a bit we need to mask off and Variable is a bit we copy from an input value. This patch add third type of bits VariableKnownToBeZero caused by AssertZext node or zero-extending load node. VariableKnownToBeZero means a bit comes from an input value, but it is known to be already zero. So we do not need to mask them. VariableKnownToBeZero enhances flexibility to group bits, since we can avoid redundant masking for these bits. This patch also renames "HasZero" to "NeedMask" since now we may skip masking even when we have zeros (of type VariableKnownToBeZero). Differential Revision: https://reviews.llvm.org/D48025 llvm-svn: 344347
* [DAG] Fix Big Endian in Load-Store forwardingNirav Dave2018-10-111-0/+16
| | | | | | | | | | | | | | Summary: Correct offset calculation in load-store forwarding for big-endian targets. Reviewers: rnk, RKSimon, waltl Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53147 llvm-svn: 344272
* [DAGCombine] Improve Load-Store ForwardingNirav Dave2018-10-101-3/+2
| | | | | | | | | | | | | | | | | | Summary: Extend analysis forwarding loads from preceeding stores to work with extended loads and truncated stores to the same address so long as the load is fully subsumed by the store. Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are deleted as they've no longer seem to be relevant. Reviewers: RKSimon, rnk, kparzysz, javed.absar Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D49200 llvm-svn: 344142
* [PowerPC][NFC] Add a test case for extract and store patternsNemanja Ivanovic2018-10-101-0/+339
| | | | | | | An upcoming patch will change the codegen for these patterns. This test case is added now so that the patch can show the differences in codegen. llvm-svn: 344112
* [PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8QingShan Zhang2018-10-101-0/+24
| | | | | | | | | | For ISD::SIGN_EXTEND_INREG operation of v2i16 and v2i8 types will cause assert because they are registered as custom operation. So that the type legalization phase will enter the custom hook, which do not handle ISD::SIGN_EXTEND_INREG operation and fall throw into unreachable assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52449 llvm-svn: 344109
* [DAGCombiner] Expand combining of FP logical ops to sign-setting FP opsNemanja Ivanovic2018-10-091-20/+4
| | | | | | | | | | | | | | | | | We already do the following combines: (bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X (bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X When the target has "bit preserving fp logic". This patch just extends it to also combine: (bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X) As some targets have fnabs and even those that don't can efficiently lower both the fabs and the fneg. Differential revision: https://reviews.llvm.org/D44548 llvm-svn: 344093
* [PowerPC][NFC] Commit nabs test case in preparation for committing D44548Nemanja Ivanovic2018-10-091-0/+64
| | | | | | | This just adds the test case so that the different code gen is clearly visible when the DAG Combine lands. llvm-svn: 344091
* [PowerPC] Implement hasBitPreservingFPLogic for types that can be supportedNemanja Ivanovic2018-10-091-0/+126
| | | | | | | | | | This is the PPC-specific non-controversial part of https://reviews.llvm.org/D44548 that simply enables this combine for PPC since PPC has these instructions. This commit will allow the target-independent portion to be truly target independent. llvm-svn: 344077
* Fix buildbot failures with the newly added test case (triple was missing).Nemanja Ivanovic2018-10-091-1/+1
| | | | llvm-svn: 344037
* [PowerPC] Remove self-copies in pre-emit peepholeNemanja Ivanovic2018-10-091-0/+128
| | | | | | | | | | | | | | | | | | There are occasionally instances where AADB rewrites registers in such a way that a reg-reg copy becomes a self-copy. Such an instruction is obviously redundant and can be removed. This patch does precisely that. Note that this will not remove various nop's that we insert (which are themselves just self-copies). The reason those are left alone is that all of them have their own opcodes (that just encode to a self-copy). What prompted this patch is the fact that these self-copies sometimes end up using registers that make the instruction a priority-setting nop, thereby having a significant effect on performance. Differential revision: https://reviews.llvm.org/D52432 llvm-svn: 344036
* [PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads.Stefan Pintilie2018-10-011-0/+16
| | | | | | | | | | | Going from XForm Load to DSForm Load requires that the immediate be 4 byte aligned. If we are not aligned we must leave the load as LDX (XForm). This bug is causing a compile-time failure in the benchmark h264ref. Differential Revision: https://reviews.llvm.org/D51988 llvm-svn: 343525
OpenPOWER on IntegriCloud