summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] [PowerPC] Enhance the fast selection of fptoi & fptrunc ↵Kang Zhang2019-02-252-6/+32
| | | | | | | | | | | | | | | | | | | | instruction and clean up related asserts Summary: Fast selection of llvm fptoi & fptrunc instructions is not handled well about VSX instruction support. We'd use VSX float convert integer instruction instead of non-vsx float convert integer instruction if the operand register class is VSSRC or VSFRC because i32 and i64 are mapped to VSSRC and VSFRC correspondingly if VSX feature is openeded. For float trunc instruction, we do this silimar work like float convert integer instruction to try to use VSX instruction. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D58430 llvm-svn: 354762
* Disable big-endian constant store merges from rL354676.Nirav Dave2019-02-221-5/+7
| | | | llvm-svn: 354677
* [DAGCombine] Fold overlapping constant storesNirav Dave2019-02-221-12/+6
| | | | | | | | | | | | | | | Fold a smaller constant store into larger constant stores immediately preceeding it. Reviewers: rnk, courbet Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58468 llvm-svn: 354676
* [PPC] Add store merging testcase.Nirav Dave2019-02-211-0/+51
| | | | llvm-svn: 354595
* [PowerPC] exploit P9 instruction maddld.Chen Zheng2019-02-201-0/+115
| | | | | | Differential Revision: https://reviews.llvm.org/D58364 llvm-svn: 354427
* [PowerPC][NFC] Added tests for prologue and epilogue code gen.Stefan Pintilie2019-02-134-0/+494
| | | | | | | | | | Added four test files to check the existing behaviour of prologue and epilogue code generation. This patch was done as a setup for the upcoming patch listed on Phabricator that will change how the prologue and epilogue work. The upcoming patch is: https://reviews.llvm.org/D42590 llvm-svn: 353994
* [DAGCombiner] convert logic-of-setcc into bit magic (PR40611)Sanjay Patel2019-02-121-10/+10
| | | | | | | | | | | | | | | | | | | | If we're comparing some value for equality against 2 constants and those constants have an absolute difference of just 1 bit, then we can offset and mask off that 1 bit and reduce to a single compare against zero: and/or (setcc X, C0, ne), (setcc X, C1, ne/eq) --> setcc ((add X, -C1), ~(C0 - C1)), 0, ne/eq https://rise4fun.com/Alive/XslKj This transform is disabled by default using a TLI hook ("convertSetCCLogicToBitwiseLogic()"). That should be overridden for AArch64, MIPS, Sparc and possibly others based on the asm shown in: https://bugs.llvm.org/show_bug.cgi?id=40611 llvm-svn: 353859
* [PowerPC] Regenerate testSimon Pilgrim2019-02-121-153/+170
| | | | llvm-svn: 353851
* [PowerPC] add tests for logic of setcc (PR40611); NFCSanjay Patel2019-02-121-0/+30
| | | | llvm-svn: 353788
* [PowerPC] Avoid scalarization of vector truncateRoland Froese2019-02-111-258/+35
| | | | | | | | The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead. Differential Revision: https://reviews.llvm.org/D56507 llvm-svn: 353724
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))Nemanja Ivanovic2019-02-081-0/+48
| | | | | | | | | | | | The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557
* [DAGCombiner] fold add/sub with bool operand based on target's boolean contentsSanjay Patel2019-02-072-10/+6
| | | | | | | | | | | | | | | | | | | I noticed that we are missing this canonicalization in IR: rL352515 ...and then realized that we don't get this right in SDAG either, so this has to be fixed first regardless of what we choose to do in IR. The existing fold was limited to scalars and using the wrong predicate to guard the transform. We have a boolean contents TLI query that can be used to decide which direction to fold. This may eventually lead back to the problems/question in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but it makes no difference to that yet. Differential Revision: https://reviews.llvm.org/D57401 llvm-svn: 353433
* [PowerPC] Add vector truncate test to prep for D56507 NFCRoland Froese2019-02-061-0/+420
| | | | llvm-svn: 353344
* [LICM/MSSA] Add promotion to scalars by building an AliasSetTracker with ↵Alina Sbirlea2019-02-061-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | MemorySSA. Summary: Experimentally we found that promotion to scalars carries less benefits than sinking and hoisting in LICM. When using MemorySSA, we build an AliasSetTracker on demand in order to reuse the current infrastructure. We only build it if less than AccessCapForMSSAPromotion exist in the loop, a cap that is by default set to 250. This value ensures there are no runtime regressions, and there are small compile time gains for pathological cases. A much lower value (20) was found to yield a single regression in the llvm-test-suite and much higher benefits for compile times. Conservatively we set the current cap to a high value, but we will explore lowering it when MemorySSA is enabled by default. Reviewers: sanjoy, chandlerc Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56625 llvm-svn: 353339
* [PowerPC] adjust test for uaddo change in rL353001Sanjay Patel2019-02-031-2/+1
| | | | | | | We don't need a mtctr/bctr for this test now; a regular conditional branch is fine. llvm-svn: 353002
* [CGP] adjust target constraints for forming uaddoSanjay Patel2019-02-031-26/+24
| | | | | | | | | | | | | | | | | | | There are 2 changes visible here: 1. There's no reason to limit this transform based on number of condition registers. That diff allows PPC to produce slightly better (dot-instructions should be generally good) code. Note: someone that cares about PPC codegen might want to look closer at that output because it seems like we could still improve this. 2. We (probably?) should not bother trying to form uaddo (or other overflow ops) when there's no target support for such an op. This goes beyond checking whether the op is expanded because both PPC and AArch64 show better codegen for standard types regardless of whether the op is legal/custom. llvm-svn: 353001
* [PowerPC] add tests for saturating add; NFCSanjay Patel2019-02-031-0/+787
| | | | | | This is copied from the existing test files for x86/AArch. llvm-svn: 352987
* [PowerPC] delete no more needed workaround for readsRegister() in PowerPCChen Zheng2019-01-301-0/+18
| | | | | | Differential Revision: https://reviews.llvm.org/D57439 llvm-svn: 352689
* [PowerPC] more opportunity for converting reg+reg to reg+immChen Zheng2019-01-301-0/+17
| | | | | | Differential Revision: https://reviews.llvm.org/D57314 llvm-svn: 352583
* Adjust documentation for git migration.James Y Knight2019-01-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes most references to the paths: llvm.org/svn/ llvm.org/git/ llvm.org/viewvc/ github.com/llvm-mirror/ github.com/llvm-project/ reviews.llvm.org/diffusion/ to instead point to https://github.com/llvm/llvm-project. This is *not* a trivial substitution, because additionally, all the checkout instructions had to be migrated to instruct users on how to use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of checking out various projects into various subdirectories. I've attempted to not change any scripts here, only documentation. The scripts will have to be addressed separately. Additionally, I've deleted one document which appeared to be outdated and unneeded: lldb/docs/building-with-debug-llvm.txt Differential Revision: https://reviews.llvm.org/D57330 llvm-svn: 352514
* [MBP] Don't move bottom block before header if it can't reduce taken branchesGuozhi Wei2019-01-251-2/+2
| | | | | | | | | | | | | | | | | | If bottom of block BB has only one successor OldTop, in most cases it is profitable to move it before OldTop, except the following case: -->OldTop<- | . | | . | | . | ---Pred | | | BB----- Move BB before OldTop can't reduce the number of taken branches, this patch detects this case and prevent the moving. Differential Revision: https://reviews.llvm.org/D57067 llvm-svn: 352236
* [PowerPC] Enhance the fast selection of cmp instruction and clean up related ↵Zi Xuan Wu2019-01-251-28/+151
| | | | | | | | | | | | | | | | | asserts Fast selection of llvm icmp and fcmp instructions is not handled well about VSX instruction support. We'd use VSX float comparison instruction instead of non-vsx float comparison instruction if the operand register class is VSSRC or VSFRC because i32 and i64 are mapped to VSSRC and VSFRC correspondingly if VSX feature is opened. If the target does not have corresponding VSX instruction comparison for some type, just copy VSX-related register to common float register class and use non-vsx comparison instruction. Differential Revision: https://reviews.llvm.org/D57078 llvm-svn: 352174
* [PowerPC] Exploit store instructions that store a single vector elementNemanja Ivanovic2019-01-242-102/+248
| | | | | | | | | | | This patch exploits the instructions that store a single element from a vector to preform a (store (extract_elt)). We already have code that does this with ISA 3.0 instructions that were added to handle i8/i16 types. However, we had never exploited the existing ones that handle f32/f64/i32/i64 types. Differential revision: https://reviews.llvm.org/D56175 llvm-svn: 352131
* Remove irrelevant references to legacy git repositories fromJames Y Knight2019-01-151-1/+1
| | | | | | | | | compiler identification lines in test-cases. (Doing so only because it's then easier to search for references which are actually important and need fixing.) llvm-svn: 351200
* Replace "no-frame-pointer-*" function attributes with "frame-pointer"Francis Visoiu Mistrih2019-01-1412-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part of the effort to refactoring frame pointer code generation. We used to use two function attributes "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" to represent three kinds of frame pointer usage: (all) frames use frame pointer, (non-leaf) frames use frame pointer, (none) frame use frame pointer. This CL makes the idea explicit by using only one enum function attribute "frame-pointer" Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as llc. "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still supported for easy migration to "frame-pointer". tests are mostly updated with // replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’ grep -iIrnl '\-disable-fp-elim=false' * | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g" // replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’ grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g" Patch by Yuanfang Chen (tabloid.adroit)! Differential Revision: https://reviews.llvm.org/D56351 llvm-svn: 351049
* Recommit "[PowerPC] Fix assert from machine verify pass that unmatched ↵Zi Xuan Wu2019-01-102-9/+9
| | | | | | | | | | register class about fcmp selection in fast-isel" This re-commit r350685. Differential Revision: https://reviews.llvm.org/D55686 llvm-svn: 350799
* Revert "[PowerPC] Fix assert from machine verify pass that unmatched ↵Zi Xuan Wu2019-01-092-9/+9
| | | | | | | | | | register class about fcmp selection in fast-isel" This reverts commit r350685. See compile assert in compiler-rt. llvm-svn: 350693
* [PowerPC] Fix assert from machine verify pass that unmatched register class ↵Zi Xuan Wu2019-01-092-9/+9
| | | | | | | | | | | | | | | | | | | | | about fcmp selection in fast-isel Bad machine code: Illegal virtual register for instruction function: TestULE basic block: %bb.0 entry (0x1000a39b158) instruction: %2:crrc = FCMPUD %1:vsfrc, %3:f8rc operand 1: %1:vsfrc Fix assert about missing match between fcmp instruction and register class. We should use vsx related cmp instruction xvcmpudp instead of fcmpu when vsx is opened. add -verifymachineinstrs option into related test cases to enable the verify pass. Differential Revision: https://reviews.llvm.org/D55686 llvm-svn: 350685
* [Power9] Enable the Out-of-Order scheduling model for P9 hwQingShan Zhang2019-01-0335-4927/+4757
| | | | | | | | | | | | | | | | | | | | | | | | | When switched to the MI scheduler for P9, the hardware is modeled as out of order. However, inside the MI Scheduler algorithm, we still use the in-order scheduling model as the MicroOpBufferSize isn't set. The MI scheduler take it as the hw cannot buffer the op. So, only when all the available instructions issued, the pending instruction could be scheduled. That is not true for our P9 hw in fact. This patch is trying to enable the Out-of-Order scheduling model. The buffer size 44 is picked from the P9 hw spec, and the perf test indicate that, its value won't hurt the cpu2017. With this patch, there are 3 specs improved over 3% and 1 spec deg over 3%. The detail is as follows: x264_r: +6.95% cactuBSSN_r: +6.94% lbm_r: +4.11% xz_r: -3.85% And the GEOMEAN for all the C/C++ spec in spec2017 is about 0.18% improved. Reviewer: Nemanjai Differential Revision: https://reviews.llvm.org/D55810 llvm-svn: 350285
* [DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold ↵Craig Topper2019-01-022-6/+6
| | | | | | | | | | | | (sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them. If x has multiple sign bits than it doesn't matter which one we extend from so we can sext from x's msb instead. The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this. Differential Revision: https://reviews.llvm.org/D56156 llvm-svn: 350235
* [PowerPC] Remove SeenUse check when optimizing conditional branch inWei Mi2019-01-021-0/+108
| | | | | | | | | | | | | | | | | | | | | | PPCPreEmitPeephole pass. PPCPreEmitPeephole will convert a BC to B when the conditional branch is based on a constant CR by CRSET or CRUNSET. This is added in https://reviews.llvm.org/rL343100. When the conditional branch is known to be always taken, all branches will be removed and a new unconditional branch will be inserted. However, when SeenUse is false the original patch will not remove the branches, but still insert the new unconditional branch, update the successors and create inconsistent IR. Compiling the synthetic testcase included can show the problem we run into. The patch simply removes the SeenUse condition when adding branches into InstrsToErase set. Differential Revision: https://reviews.llvm.org/D56041 llvm-svn: 350223
* [PowerPC] Fix machine verify pass error for PATCHPOINT pseudo instruction ↵Kang Zhang2018-12-304-7/+7
| | | | | | | | | | | | | | | | | | that bad machine code Summary: For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo. Then the verifier runs and it seems like we have a use of an undefined register (the register will be reserved later, but the verifier doesn't know that). So this patch call setUsesTOCBasePtr before emit the code for the pseudo, so verifier can know X2 is a reserved register. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D56148 llvm-svn: 350165
* [PowerPC] Fix ADDE, SUBE do not know how to promote operatorKang Zhang2018-12-301-0/+31
| | | | | | | | | | | | | | Summary: This patch is created to fix the Bugzilla bug 39815: https://bugs.llvm.org/show_bug.cgi?id=39815 This patch is to support promotion integer result for the instruction ADDE, SUBE. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D56119 llvm-svn: 350161
* [PowerPC] Complete the custom legalization of vector int to fp conversionNemanja Ivanovic2018-12-296-4048/+1617
| | | | | | | | | | | | | | | A recent patch has added custom legalization of vector conversions of v2i16 -> v2f64. This just rounds it out for other types where the input vector has an illegal (narrower) type than the result vector. Specifically, this will handle the following conversions: v2i8 -> v2f64 v4i8 -> v4f32 v4i16 -> v4f32 Differential revision: https://reviews.llvm.org/D54663 llvm-svn: 350155
* [PowerPC] Fix CR Bit spill pseudo expansionNemanja Ivanovic2018-12-291-0/+121
| | | | | | | | | | | | | | | | The current CRBIT spill pseudo-op expansion creates a KILL instruction that kills the CRBIT and defines the enclosing CR field. However, this paints a false picture to the register allocator that all bits in the CR field are killed so copies of other bits out of the field become dead and removable. This changes the expansion to preserve the KILL flag on the CRBIT as an implicit use and to treat the CR field as an undef input. Thanks to Hal Finkel for the review and Uli Weigand for implementation input. Differential revision: https://reviews.llvm.org/D55996 llvm-svn: 350153
* [PowerPC] handle ISD:TRUNCATE in BitPermutationSelectorHiroshi Inoue2018-12-281-0/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the last one in a series of patches to support better code generation for bitfield insert. BitPermutationSelector already support ISD::ZERO_EXTEND but not TRUNCATE. This patch adds support for ISD:TRUNCATE in BitPermutationSelector. For example of this test case, struct s64b { int a:4; int b:16; int c:24; }; void bitfieldinsert64b(struct s64b *p, unsigned char v) { p->b = v; } the selection DAG loos like: t14: i32,ch = load<(load 4 from %ir.0)> t0, t2, undef:i64 t18: i32 = and t14, Constant:i32<-1048561> t4: i64,ch = CopyFromReg t0, Register:i64 %1 t22: i64 = AssertZext t4, ValueType:ch:i8 t23: i32 = truncate t22 t16: i32 = shl nuw nsw t23, Constant:i32<4> t19: i32 = or t18, t16 t20: ch = store<(store 4 into %ir.0)> t14:1, t19, t2, undef:i64 By handling truncate in the BitPermutationSelector, we can use information from AssertZext when selecting t19 and skip the mask operation corresponding to t18. So the generated sequences with and without this patch are without this patch rlwinm 5, 5, 0, 28, 11 # corresponding to t18 rlwimi 5, 4, 4, 20, 27 with this patch rlwimi 5, 4, 4, 12, 27 Differential Revision: https://reviews.llvm.org/D49076 llvm-svn: 350118
* [PowerPC] Remove the implicit use of the register if it is replaced by ImmQingShan Zhang2018-12-281-0/+78
| | | | | | | | If we are changing the MI operand from Reg to Imm, we need also handle its implicit use if have. Differential Revision: https://reviews.llvm.org/D56078 llvm-svn: 350115
* [PowerPC] Fix assert from machine verify pass that atomic pseudo expanding ↵Zi Xuan Wu2018-12-284-18/+17
| | | | | | | | | | | | causes mismatched register class For atomic value operand which less than 4 bytes need to be masked. And the related operation to calculate the newvalue can be done in 32 bit gprc. So just use gprc for mask and value calculation. Differential Revision: https://reviews.llvm.org/D56077 llvm-svn: 350113
* [PowerPC] fix register class after converting X-FORM instruction to D-FORM ↵Chen Zheng2018-12-281-2/+2
| | | | | | | | instruction Differential Revision: https://reviews.llvm.org/D55806 llvm-svn: 350111
* [PowerPC] Fix the bug of ISD::ADDE to set its second return type to glueKang Zhang2018-12-251-0/+11
| | | | | | | | | | | | | | | Summary: This patch is to fix the bug imported by rL341634. In above submit , the the return type of ISD::ADDE is 14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), but in fact, the second return type of ISD::ADDE should be MVT::Glue not MVT::i64. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D55977 llvm-svn: 350061
* Re-land r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing ↵Clement Courbet2018-12-201-8/+16
| | | | | | | | overlapping loads. Update PPC ir following GEP->bitcat to bitcat->GEP->bitcat change. llvm-svn: 349747
* [PowerPC] Implement the isSelectSupported() target hookKang Zhang2018-12-201-12/+8
| | | | | | | | | | | | | | | | Summary: PowerPC has scalar selects (isel) and vector mask selects (xxsel). But PowerPC does not have vector CR selects, PowerPC does not support scalar condition selects on vectors. In addition to implementing this hook, isSelectSupported() should return false when the SelectSupportKind is ScalarCondVectorVal, so that predictable selects are converted into branch sequences. Reviewed By: steven.zhang, hfinkel Differential Revision: https://reviews.llvm.org/D55754 llvm-svn: 349727
* Test commitAmy Kwan2018-12-191-0/+1
| | | | llvm-svn: 349633
* [PowerPC]Exploit P9 vabsdu for unsigned vselect patternsKewen Lin2018-12-191-36/+84
| | | | | | | | | | | | For type v4i32/v8ii16/v16i8, do following transforms: (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b) (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b) Differential Revision: https://reviews.llvm.org/D55812 llvm-svn: 349599
* [PowerPC][NFC]Update vabsd cases with vselect test casesKewen Lin2018-12-181-0/+194
| | | | | | | | Power9 VABSDU* instructions can be exploited for some special vselect sequences. Check in the orignal test case here, later the exploitation patch will update this and reviewers can check the differences easily. llvm-svn: 349446
* [PowerPC] Exploit power9 new instruction setbKewen Lin2018-12-181-376/+404
| | | | | | | | | | | | | Check the expected pattens feeding to SELECT_CC like: (select_cc lhs, rhs, 1, (sext (setcc [lr]hs, [lr]hs, cc2)), cc1) (select_cc lhs, rhs, -1, (zext (setcc [lr]hs, [lr]hs, cc2)), cc1) (select_cc lhs, rhs, 0, (select_cc [lr]hs, [lr]hs, 1, -1, cc2), seteq) (select_cc lhs, rhs, 0, (select_cc [lr]hs, [lr]hs, -1, 1, cc2), seteq) Further transform the sequence to comparison + setb if hits. Differential Revision: https://reviews.llvm.org/D53275 llvm-svn: 349445
* [NFC] Add new test to cover the lhs scheduling issue for P9.QingShan Zhang2018-12-181-0/+49
| | | | llvm-svn: 349443
* [NFC] fix test case issue that with wrong label check.QingShan Zhang2018-12-181-7/+14
| | | | llvm-svn: 349439
* [PowerPC] Improve vec_abs on P9Kewen Lin2018-12-182-45/+207
| | | | | | | | | | Improve the current vec_abs support on P9, generate ISD::ABS node for vector types, combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns, do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity. Differential Revision: https://reviews.llvm.org/D54783 llvm-svn: 349437
* [Power9][NFC]update vabsd case for better dumpingKewen Lin2018-12-171-34/+23
| | | | | | | Appended options -ppc-vsr-nums-as-vr and -ppc-asm-full-reg-names to get the more descriptive output. Also removed useless function attributes. llvm-svn: 349329
OpenPOWER on IntegriCloud