bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Boilerplate for producing XCOFF object files from the PowerPC backend.	Sean Fertile	2019-07-09	1	-0/+37
	Stubs out a number of the classes needed to produce a new object file format (XCOFF) for the powerpc-aix target. For testing, the input is an empty module, which produces an object file with just a file header. Differential Revision: https://reviews.llvm.org/D61694 llvm-svn: 365541
*	[NFC][PowerPC] Added a test to show current codegen of MachinePRE	Kai Luo	2019-07-09	1	-0/+55
	llvm-svn: 365447
*	[PowerPC][Peephole] Combine extsw and sldi after instruction selection	Kai Luo	2019-07-09	1	-4/+130
	Summary: `extsw` and `sldi` are supposed to be combined during the instruction selection phase if they are in the same BB. This patch handles the case where `extsw` and `sldi` are not in the same BB. Differential Revision: https://reviews.llvm.org/D63806 llvm-svn: 365430
*	[MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogue	Jinsong Ji	2019-07-09	1	-1/+1
	Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, a Phi that refers to another Phi did not get the value defined by that Phi, and so got the wrong value later. As the comment says, we should "use the value defined by the Phi, unless we're generating the first epilog and the Phi refers to a Phi in a different stage", so a Phi referring to a Phi in the same stage should use the value defined by the Phi here. Reviewers: bcahoon, hfinkel Reviewed By: hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64035 llvm-svn: 365428
*	[PowerPC][MachinePipeliner][NFC] Add a testcase for Phi bug.	Jinsong Ji	2019-07-09	1	-0/+34
	llvm-svn: 365427
*	[PowerPC][NFC]Update testcases using script.	Jinsong Ji	2019-07-08	1	-15/+31
	llvm-svn: 365330
*	[NFC][PowerPC] Add the test add_cmp.ll	Kang Zhang	2019-07-08	1	-0/+76
	llvm-svn: 365285
*	[PowerPC] Move TOC save to prologue when profitable	Nemanja Ivanovic	2019-07-05	3	-11/+7
	The indirect call sequence on PPC requires that the TOC base register be saved prior to the indirect call and restored after the call since the indirect call may branch to a global entry point in another DSO which will update the TOC base. Over the last couple of years, we have improved this to:
	- be able to hoist TOC saves from loops (with changes to MachineLICM)
	- avoid multiple saves when one dominates the other[s]
	However, it is still possible to have multiple TOC saves dynamically in the execution path if there is no dominance relationship between them. This patch moves the TOC save to the prologue when one of the TOC saves is in a block that post-dominates entry (i.e. it cannot be avoided) or if it is in a block that is hotter than entry.
	Differential revision: https://reviews.llvm.org/D63803 llvm-svn: 365232
*	[PowerPC] Support constraint code "ww"	Fangrui Song	2019-07-04	2	-0/+21
	Summary: "ww" and "ws" are both constraint codes for VSX vector registers that hold scalar double data. "ww" is preferred for float while "ws" is preferred for double. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D64119 llvm-svn: 365106
*	[PowerPC] Hardware Loop branch instruction's condition may not be icmp.	Chen Zheng	2019-07-04	1	-0/+41
	This fixes pr42492. Differential Revision: https://reviews.llvm.org/D64124 llvm-svn: 365104
*	[Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)	Roman Lebedev	2019-07-03	2	-445/+446
	Summary: This is the backend part of PR42457 (https://bugs.llvm.org/show_bug.cgi?id=42457). In middle-end, we'd want to prefer the form with two adds - D63992, but as this diff shows, not every target will prefer that pattern. Out of 4 targets for which I added tests, all seem to be ok with inc-of-add for scalars, but only X86 prefers that same pattern for vectors. Here I'm adding a new TLI hook, always defaulting to the inc-of-add, but adding AArch64, ARM, PowerPC overrides to prefer inc-of-add only for scalars. Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel Reviewed By: efriedma Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64090 llvm-svn: 365010
*	[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop.	Chen Zheng	2019-07-03	7	-70/+62
	Differential Revision: https://reviews.llvm.org/D63477 llvm-svn: 364993
*	[NFC][Codegen][X86][AArch64][ARM][PowerPC] Recommit: Add test coverage for "add-of-inc" vs "sub-of-not"	Roman Lebedev	2019-07-02	2	-0/+873
	I initially committed it with --check-prefix instead of --check-prefixes (again, shame on me, and utils/update_*.py not complaining!) and did not have a moment to understand the failure, so I reverted it initially in rL364939. llvm-svn: 364945
*	Revert "[NFC][Codegen][X86][AArch64][ARM][PowerPC] Add test coverage for "add-of-inc" vs "sub-of-not""	Roman Lebedev	2019-07-02	2	-873/+0
	Some test failures I don't have a moment to investigate. This reverts commit r364930. llvm-svn: 364939
*	[NFC][Codegen][X86][AArch64][ARM][PowerPC] Add test coverage for "add-of-inc" vs "sub-of-not"	Roman Lebedev	2019-07-02	2	-0/+873
	As pointed out in https://reviews.llvm.org/D63992, before we get to pick the canonical variant in the middle-end we should ensure the best codegen in the backend. llvm-svn: 364930
*	[PowerPC] Implement the areMemAccessesTriviallyDisjoint hook	QingShan Zhang	2019-07-02	9	-90/+113
	With this hook implemented, the memory dependencies in the scheduling dependency graph are modeled more precisely, which gives more opportunities to reorder loads and stores that have no dependency under some conditions. Differential Revision: https://reviews.llvm.org/D63804 llvm-svn: 364886
*	[UpdateTestChecks][PowerPC] Avoid empty string when scrubbing loop comments	Jinsong Ji	2019-07-01	1	-53/+125
	Summary: SCRUB_LOOP_COMMENT_RE was introduced in https://reviews.llvm.org/D31285. This works for some loops. However, we may generate lines with loop comments only. And since we don't scrub leading white spaces, this will leave an empty line there, and FileCheck will complain about it. e.g.:
	llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll:27:15: error: found empty check string with prefix 'CHECK:'
	; CHECK-NEXT:
	This prevented us from using `update_llc_test_checks.py` in quite a few cases. We should still keep the comment token there, so that we can safely scrub the loop comment without breaking FileCheck.
	Reviewers: timshen, hfinkel, lebedev.ri, RKSimon Subscribers: nemanjai, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63957 llvm-svn: 364775
*	Default to Secure PLT on PPC for musl libc.	Brad Smith	2019-06-28	1	-0/+2
	This matches the default settings of clang. llvm-svn: 364675
*	[PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others	Jinsong Ji	2019-06-27	1	-0/+20
	This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751: llvm-mc aborted when disassembling tabortdc. This patch tries to clean up TM related DAGs.
	* Fixes the problem by removing the explicit output of cr0 and making it an implicit def.
	* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
	* Update the TCHECK operand and int_ppc_tcheck accordingly.
	* Add some builtin tests and disassembly tests.
	* Remove unused CRRC0/crrc0
	Differential Revision: https://reviews.llvm.org/D61935 llvm-svn: 364544
*	Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."	Clement Courbet	2019-06-26	4	-0/+522
	Breaks sanitizers:
	libFuzzer :: cxxstring.test
	libFuzzer :: memcmp.test
	libFuzzer :: recommended-dictionary.test
	libFuzzer :: strcmp.test
	libFuzzer :: value-profile-mem.test
	libFuzzer :: value-profile-strcmp.test
	llvm-svn: 364416
*	[ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.	Clement Courbet	2019-06-26	4	-522/+0
	This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and `memcmp(p, q, constant) > 0`. llvm-svn: 364412
*	Teach the DAGCombine to fold this pattern(c1 and c2 is constant).	QingShan Zhang	2019-06-26	2	-193/+80
	// fold (sext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
	// fold (zext (select cond, c1, c2)) -> (select cond, zext c1, zext c2)
	// fold (aext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
	Sign extend the operands if it is any_extend, to keep the signedness of the operands so that the other combine rules can apply. The any_extend is handled as zero extend for constants. i.e.
	t1: i8 = select t0, Constant:i8<-1>, Constant:i8<0>
	t2: i64 = any_extend t1
	-->
	t3: i64 = select t0, Constant:i64<-1>, Constant:i64<0>
	-->
	t4: i64 = sign_extend_inreg t3
	Differential Revision: https://reviews.llvm.org/D63318 llvm-svn: 364382
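The sext fold in this commit relies on extension distributing over a select of constants. A minimal sketch of that identity, using plain integer casts rather than SelectionDAG nodes; the function names are hypothetical and are not LLVM APIs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the pattern before the fold:
// select between two i8 constants, then sign extend to i64.
int64_t sextOfSelect(bool cond, int8_t c1, int8_t c2) {
  int8_t narrow = cond ? c1 : c2;       // select cond, c1, c2 (i8)
  return static_cast<int64_t>(narrow);  // sext to i64
}

// Hypothetical model of the pattern after the fold:
// select between the already sign-extended constants.
int64_t selectOfSext(bool cond, int8_t c1, int8_t c2) {
  return cond ? static_cast<int64_t>(c1) : static_cast<int64_t>(c2);
}
```

Both functions return the same value for every input, which is why the extension can be pushed into the constant operands, where it folds away.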
*	[NFC] Fix buildbot breaks due to r364375	Nemanja Ivanovic	2019-06-26	1	-1/+1
	For some reason, the update_llc_checks.py script produces checks for empty lines which cause failures. Corrected that to check for actual text produced by llc. llvm-svn: 364377
*	[PowerPC][NFC] Add a TOC save test case prior to posting a related patch	Nemanja Ivanovic	2019-06-26	1	-0/+68
	An upcoming patch will modify the behaviour with respect to saving the TOC in functions with indirect calls. Adding a test case so the patch will show the difference in codegen. llvm-svn: 364375
*	[PowerPC] Mark FCOPYSIGN legal for FP vectors	Nemanja Ivanovic	2019-06-26	1	-0/+27
	This was just an omission in the back end. We have had the instructions for both single and double precision for a few HW generations, but never got around to legalizing these. Differential revision: https://reviews.llvm.org/D63634 llvm-svn: 364373
*	[MachinePipeliner] Fix risky iterator usage R++, --R	Jinsong Ji	2019-06-25	1	-1/+1
	When we calculate MII, we use two loops, one with iterator R++ to check whether we can reserve the resource, then --R to move back the iterator to do reservation. This is risky, as R++, --R may not point to the same element at all. This can cause a wrong MII. Differential Revision: https://reviews.llvm.org/D63536 llvm-svn: 364353
*	[PowerPC][NFC]Add a test for MachinePipeliner bug	Jinsong Ji	2019-06-25	1	-0/+36
	llvm-svn: 364350
*	[DAGCombine] combineRepeatedFPDivisors - recognize -1.0 / X as a reciprocal	Simon Pilgrim	2019-06-25	1	-0/+32
	Fixes issue identified by @nemanjai (Nemanja Ivanovic) in D62963 / rL363040 - infinite loop due to GetNegatedExpression fighting combineRepeatedFPDivisors resulting in fneg(fdiv(x,splat)) -> fneg(fmul(x,1.0/splat)) -> fmul(x,-1.0/splat) -> fmul(x,(-1.0 * 1.0)/splat) ...... llvm-svn: 364326
*	[PPC32] Support PLT calls for -msecure-plt -fpic	Fangrui Song	2019-06-25	2	-11/+35
	Summary: In Secure PLT ABI, -fpic is similar to -fPIC. The differences are that:
	* -fpic stores the address of _GLOBAL_OFFSET_TABLE_ in r30, while -fPIC stores .got2+0x8000.
	* -fpic uses an addend of 0 for R_PPC_PLTREL24, while -fPIC uses 0x8000.
	Reviewers: hfinkel, jhibbits, joerg, nemanjai, spetrovic Reviewed By: jhibbits Subscribers: adalava, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63563 llvm-svn: 364324
*	[PowerPC] Emit XXSEL for vec_sel and code that has the same pattern	Nemanja Ivanovic	2019-06-25	1	-0/+72
	As pointed out in https://bugs.llvm.org/show_bug.cgi?id=41777 we do not emit a vector select even when the code pretty much asks for one. This patch changes that. Differential revision: https://reviews.llvm.org/D61658 llvm-svn: 364289
*	[CodeGen] Add missing vector type legalization for ctlz_zero_undef	Roland Froese	2019-06-24	1	-13/+76
	Widen vector result type for ctlz_zero_undef and cttz_zero_undef the same as ctlz and cttz. Differential Revision: https://reviews.llvm.org/D63463 llvm-svn: 364221
*	[PowerPC][UpdateTestChecks] powerpc- triple support	Jinsong Ji	2019-06-24	1	-30/+33
	There are quite a few old testcases with the powerpc- triple; we should add support for this triple so that we can update them with the script. Differential Revision: https://reviews.llvm.org/D63723 llvm-svn: 364213
*	Rename ExpandISelPseudo->FinalizeISel, delay register reservation	Matt Arsenault	2019-06-19	1	-1/+1
	This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
*	[SelectionDAG] Legalize vaargs that require vector splitting	Simon Pilgrim	2019-06-18	1	-0/+52
	This adds vector splitting for vaarg instructions during type legalization. Committed on behalf of @luke (Luke Lau) Differential Revision: https://reviews.llvm.org/D60762 llvm-svn: 363671
*	[lit] Delete empty lines at the end of lit.local.cfg NFC	Fangrui Song	2019-06-17	1	-1/+0
	llvm-svn: 363538
*	Describe stack-id as an enum	Sander de Smalen	2019-06-17	5	-11/+11
	This patch changes MIR stack-id from an integer to an enum, and adds printing/parsing support for this in MIR files. The default stack-id '0' is now renamed to 'default'. This should make MIR tests that have stack objects with different stack-ids more descriptive. It also clarifies code operating on StackID. Reviewers: arsenm, thegameg, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60137 llvm-svn: 363533
*	PowerPC: Optimize SPE double parameter calling setup	Justin Hibbits	2019-06-17	1	-5/+3
	Summary: SPE passes doubles the same as soft-float, in register pairs as i32 types. This is all handled by the target-independent layer. However, this is not optimal when splitting or reforming the doubles, as it pushes to the stack and loads from, on either side. For instance, to pass a double argument to a function, assuming the double value is in r5, the sequence currently looks like this:
	evstdd 5, X(1)
	lwz 3, X(1)
	lwz 4, X+4(1)
	Likewise, to form a double into r5 from args in r3 and r4:
	stw 3, X(1)
	stw 4, X+4(1)
	evldd 5, X(1)
	This optimizes the fence to use SPE instructions. Now, to pass a double to a function:
	mr 4, 5
	evmergehi 3, 5, 5
	And to form a double into r5 from args in r3 and r4:
	evmergelo 5, 3, 4
	This is comparable to the way that gcc generates the double splits. This also fixes a bug with expanding builtins to libcalls, where the LowerCallTo() code path was generating intermediate illegal type nodes.
	Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54583 llvm-svn: 363526
*	[PowerPC] Set the innermost hot loop to align 32 bytes	Kang Zhang	2019-06-15	1	-0/+209
	Summary: If the nested loop is an innermost loop, prefer a 32-byte alignment, so that we can decrease cache misses and branch-prediction misses. Actual alignment of the loop will depend on the hotness check and other logic in alignBlocks. The old code only aligns a hot loop to 32 bytes when the LoopSize is larger than 16 bytes and smaller than 32 bytes; this patch aligns the innermost hot loop to 32 bytes regardless of whether its size is 16~32 bytes. Reviewed By: steven.zhang, jsji Differential Revision: https://reviews.llvm.org/D61228 llvm-svn: 363495
*	[MBP] Move a latch block with conditional exit and multi predecessors to top of loop	Guozhi Wei	2019-06-14	6	-182/+159
	Current findBestLoopTop can find and move one kind of block to top: a latch block with one successor. Another common case is:
	* a latch block
	* it has two successors, one is loop header, another is exit
	* it has more than one predecessor
	If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch.
	Differential Revision: https://reviews.llvm.org/D43256 llvm-svn: 363471
*	[MachinePiepliner] Don't check boundary node in checkValidNodeOrder	Jinsong Ji	2019-06-13	1	-0/+78
	This was exposed by PowerPC target enablement. In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will create a dependence edge to ExitSU to model the live-out latency. This is required for vreg defs with no in-region use, and prefetches with no vreg def. When we build NodeOrder in Scheduler, we ignore these boundary nodes. However, when we check Succs in checkValidNodeOrder, we did not skip them, so we still assume all the nodes have been sorted and in order in Indices array. So when we call lower_bound() for ExitSU, it will return Indices.end(), causing memory issues in following Node access. Differential Revision: https://reviews.llvm.org/D63282 llvm-svn: 363329
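The failure mode described here, a `lower_bound()` result equal to `end()` being dereferenced, can be shown with a small standalone sketch. The container layout and names below are hypothetical and are not the MachinePipeliner data structures. The patch itself skips boundary nodes; the guard here only illustrates why the unchecked access is unsafe.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sorted array of (node id, order index) pairs.
using IndexEntry = std::pair<int, int>;

// Returns the order index recorded for `node`, or -1 if the node was never
// recorded (as a boundary node would be in this model).
int findOrderIndex(const std::vector<IndexEntry> &sortedIndices, int node) {
  auto it = std::lower_bound(
      sortedIndices.begin(), sortedIndices.end(), node,
      [](const IndexEntry &entry, int key) { return entry.first < key; });
  // lower_bound may return end(), or an entry for a larger key, when the
  // node is absent; check both cases before dereferencing.
  if (it == sortedIndices.end() || it->first != node)
    return -1;
  return it->second;
}
```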
*	[FIX] Forces shrink wrapping to consider any memory access as aliasing with the stack	Diogo N. Sampaio	2019-06-13	5	-10/+11
	Summary: Related bug: https://bugs.llvm.org/show_bug.cgi?id=37472 The shrink wrapping pass prematurely restores the stack, at a point where the stack might still be accessed. Taking an exception can cause the stack to be corrupted. As a first approach, this patch is overly conservative, assuming that any instruction that may load or store could access the stack. Reviewers: dmgreen, qcolombet Reviewed By: qcolombet Subscribers: simpal01, efriedma, eli.friedman, javed.absar, llvm-commits, eugenis, chill, carwil, thegameg Tags: #llvm Differential Revision: https://reviews.llvm.org/D63152 llvm-svn: 363265
*	[PowerPC][NFC] Added test for sext/shl combination after isel.	Kai Luo	2019-06-12	1	-0/+76
	llvm-svn: 363118
*	[PowerPC][NFC]Remove sms-simple.ll test temporarily.	Jinsong Ji	2019-06-11	1	-78/+0
	Looks like a MachinePipeliner algorithm problem found by sanitizer-x86_64-linux-fast. I will back out this test first while investigating the problem to unblock buildbot.
	==49637==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x614000002e08 at pc 0x000004364350 bp 0x7ffe228a3bd0 sp 0x7ffe228a3bc8
	READ of size 4 at 0x614000002e08 thread T0
	#0 0x436434f in llvm::SwingSchedulerDAG::checkValidNodeOrder(llvm::SmallVector<llvm::NodeSet, 8u> const&) const /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:3736:11
	#1 0x4342cd0 in llvm::SwingSchedulerDAG::schedule() /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:486:3
	#2 0x434042d in llvm::MachinePipeliner::swingModuloScheduler(llvm::MachineLoop&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:385:7
	#3 0x433eb90 in llvm::MachinePipeliner::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:207:5
	#4 0x428b7ea in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13
	#5 0x4d1a913 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1648:27
	#6 0x4d1b192 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1685:16
	#7 0x4d1c06d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1752:27
	#8 0x4d1c06d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1865
	#9 0xa48ca3 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:611:8
	#10 0xa4270f in main /b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:365:22
	#11 0x7fec902572e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
	#12 0x971b69 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x971b69)
	llvm-svn: 363105
*	[PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner	Jinsong Ji	2019-06-11	1	-0/+78
	Implement necessary target hooks to enable MachinePipeliner for P9 only. The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9. Differential Revision: https://reviews.llvm.org/D62164 llvm-svn: 363085
*	[PowerPC][HTM]Fix $zero is not a GPRC register for builtin_ttest	Jinsong Ji	2019-06-10	1	-0/+3
	This was found during HTM cleanup. Adding a test for builtin_ttest exposes the following issue.
	* Bad machine code: Illegal physical register for instruction *
	- function: test10
	- basic block: %bb.0 entry (0xf0e57497b58)
	- instruction: %5:crrc0 = TABORTWCI 0, $zero, 0
	- operand 2: $zero
	$zero is not a GPRC register.
	LLVM ERROR: Found 1 machine code errors.
	Differential Revision: https://reviews.llvm.org/D63079 llvm-svn: 362974
*	[DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores	QingShan Zhang	2019-06-10	1	-266/+69
	This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c
	static void store64(u64 x, unsigned char* y) {
	    for(int i = 0; i != 8; ++i)
	        y[i] = (x >> ((7-i) * 8)) & 255;
	}
	static u64 load64(const unsigned char* y) {
	    u64 res = 0;
	    for(int i = 0; i != 8; ++i)
	        res |= (u64)(y[i]) << ((7-i) * 8);
	    return res;
	}
	The load64 has been implemented by https://reviews.llvm.org/D26149 This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the targets supports it.
	Assuming little endian target:
	i8 *p = ...
	i32 val = ...
	p[0] = (val >> 0) & 0xFF;
	p[1] = (val >> 8) & 0xFF;
	p[2] = (val >> 16) & 0xFF;
	p[3] = (val >> 24) & 0xFF;
	=>
	*((i32)p) = val;
	i8 *p = ...
	i32 val = ...
	p[0] = (val >> 24) & 0xFF;
	p[1] = (val >> 16) & 0xFF;
	p[2] = (val >> 8) & 0xFF;
	p[3] = (val >> 0) & 0xFF;
	=>
	*((i32)p) = BSWAP(val);
	Differential Revision: https://reviews.llvm.org/D62897 llvm-svn: 362921
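The second pattern in this commit rests on one equivalence: storing a value most significant byte first gives the same bytes as byte-swapping the value and storing it least significant byte first. A minimal sketch that checks this without depending on host endianness; all names are hypothetical helpers and are not part of the patch:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Portable byte swap; a real target would use a single BSWAP-style operation.
uint32_t byteSwap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Four narrow stores, least significant byte first (little-endian layout).
std::array<uint8_t, 4> storeLowByteFirst(uint32_t val) {
  std::array<uint8_t, 4> p{};
  for (int i = 0; i != 4; ++i)
    p[i] = static_cast<uint8_t>((val >> (8 * i)) & 0xFF);
  return p;
}

// Four narrow stores, most significant byte first.
std::array<uint8_t, 4> storeHighByteFirst(uint32_t val) {
  std::array<uint8_t, 4> p{};
  for (int i = 0; i != 4; ++i)
    p[i] = static_cast<uint8_t>((val >> (8 * (3 - i))) & 0xFF);
  return p;
}
```

Here `storeLowByteFirst` models what a single wide store produces on a little-endian target, so `storeHighByteFirst(val)` matching `storeLowByteFirst(byteSwap32(val))` corresponds to the BSWAP-and-store form.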
*	[NFC] Test if commit access granted.	Kai Luo	2019-06-10	1	-0/+1
	llvm-svn: 362917
*	[MachineScheduler] checkResourceLimit boundary condition update	Jinsong Ji	2019-06-07	5	-22/+22
	When we call checkResourceLimit in bumpCycle or bumpNode, and we know the resource count has just reached the limit (the equations are equal), we should return true to mark that we are resource limited for the next schedule, or else we might continue to schedule in favor of latency for 1 more schedule and create a schedule that actually overbooks the resource. When we call checkResourceLimit to estimate the resource limit before scheduling, we don't need to return true even if the equations are equal, as it shouldn't limit the schedule. Differential Revision: https://reviews.llvm.org/D62345 llvm-svn: 362805
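The boundary rule in this commit comes down to choosing between a strict and a non-strict comparison. A minimal sketch under that reading; `usage`, `limit` and the function name are hypothetical stand-ins for the two sides of the comparison, not the actual checkResourceLimit signature:

```cpp
#include <cassert>

// Hypothetical simplification of the boundary condition.
// After a node or cycle has been scheduled, reaching the limit exactly
// already counts as resource limited; when estimating before scheduling,
// only exceeding the limit does.
bool isResourceLimited(unsigned usage, unsigned limit, bool afterScheduling) {
  return afterScheduling ? usage >= limit : usage > limit;
}
```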
*	[CodeGen] Generic Hardware Loop Support	Sam Parker	2019-06-07	2	-17/+6
	Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced:
	- void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed.
	- i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit.
	- anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum number of elements processed in an iteration of the loop body. Returns the updated number of elements remaining.
	llvm-svn: 362774
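The first two intrinsics in this commit can be pictured with a small software model of a loop counter. This is only a sketch of the described semantics. The struct and function names are hypothetical, and both exiting when the counter reaches zero and saturating the subtraction at zero are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of a hardware loop counter; not an LLVM API.
struct LoopCounterModel {
  uint64_t remaining = 0;

  // Models llvm.set_loop_iterations: record the number of iterations.
  void setLoopIterations(uint64_t iterations) { remaining = iterations; }

  // Models llvm.loop_decrement: subtract `step` from the count and return
  // false when the loop should exit.
  // Assumption: the loop exits once the counter reaches zero, and the
  // subtraction saturates at zero instead of wrapping.
  bool loopDecrement(uint64_t step) {
    remaining = (step >= remaining) ? 0 : remaining - step;
    return remaining != 0;
  }
};

// Runs a do-while style loop and reports how many times the body executed.
// Assumes tripCount >= 1, since the body runs before the first decrement.
uint64_t countBodyExecutions(uint64_t tripCount) {
  LoopCounterModel counter;
  counter.setLoopIterations(tripCount);
  uint64_t executed = 0;
  do {
    ++executed;  // loop body
  } while (counter.loopDecrement(1));
  return executed;
}
```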
*	[PowerPC] Exploit the vector min/max instructions	Nemanja Ivanovic	2019-06-06	4	-422/+386
	Use the PPC vector min/max instructions for computing the corresponding operation as these should be faster than the compare/select sequences we currently emit. Differential revision: https://reviews.llvm.org/D47332 llvm-svn: 362759