bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[PowerPC] Set the innermost hot loop to align 32 bytes	Kang Zhang	2019-06-15	1	-0/+209
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the nested loop is an innermost loop, prefer to a 32-byte alignment, so that we can decrease cache misses and branch-prediction misses. Actual alignment of the loop will depend on the hotness check and other logic in alignBlocks. The old code will only align hot loop to 32 bytes when the LoopSize larger than 16 bytes and smaller than 32 bytes, this patch will align the innermost hot loop to 32 bytes not only for the hot loop whose size is 16~32 bytes. Reviewed By: steven.zhang, jsji Differential Revision: https://reviews.llvm.org/D61228 llvm-svn: 363495
*	[MBP] Move a latch block with conditional exit and multi predecessors to top ↵	Guozhi Wei	2019-06-14	6	-182/+159
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of loop Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another common case is: * a latch block * it has two successors, one is loop header, another is exit * it has more than one predecessors If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch. Differential Revision: https://reviews.llvm.org/D43256 llvm-svn: 363471
*	[MachinePiepliner] Don't check boundary node in checkValidNodeOrder	Jinsong Ji	2019-06-13	1	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was exposed by PowerPC target enablement. In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will create a dependence edge to ExitSU to model the live-out latency. This is required for vreg defs with no in-region use, and prefetches with no vreg def. When we build NodeOrder in Scheduler, we ignore these boundary nodes. However, when we check Succs in checkValidNodeOrder, we did not skip them, so we still assume all the nodes have been sorted and in order in Indices array. So when we call lower_bound() for ExitSU, it will return Indices.end(), causing memory issues in following Node access. Differential Revision: https://reviews.llvm.org/D63282 llvm-svn: 363329
*	[FIX] Forces shrink wrapping to consider any memory access as aliasing with ↵	Diogo N. Sampaio	2019-06-13	5	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the stack Summary: Relate bug: https://bugs.llvm.org/show_bug.cgi?id=37472 The shrink wrapping pass prematurally restores the stack, at a point where the stack might still be accessed. Taking an exception can cause the stack to be corrupted. As a first approach, this patch is overly conservative, assuming that any instruction that may load or store could access the stack. Reviewers: dmgreen, qcolombet Reviewed By: qcolombet Subscribers: simpal01, efriedma, eli.friedman, javed.absar, llvm-commits, eugenis, chill, carwil, thegameg Tags: #llvm Differential Revision: https://reviews.llvm.org/D63152 llvm-svn: 363265
*	[PowerPC][NFC] Added test for sext/shl combination after isel.	Kai Luo	2019-06-12	1	-0/+76
\| \| \| \|	llvm-svn: 363118
*	[PowerPC][NFC]Remove sms-simple.ll test temporarily.	Jinsong Ji	2019-06-11	1	-78/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looks like a MachinePipeliner algorithm problem found by sanitizer-x86_64-linux-fast. I will backout this test first while investigating the problem to unblock buildbot. ==49637==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x614000002e08 at pc 0x000004364350 bp 0x7ffe228a3bd0 sp 0x7ffe228a3bc8 READ of size 4 at 0x614000002e08 thread T0 #0 0x436434f in llvm::SwingSchedulerDAG::checkValidNodeOrder(llvm::SmallVector<llvm::NodeSet, 8u> const&) const /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:3736:11 #1 0x4342cd0 in llvm::SwingSchedulerDAG::schedule() /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:486:3 #2 0x434042d in llvm::MachinePipeliner::swingModuloScheduler(llvm::MachineLoop&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:385:7 #3 0x433eb90 in llvm::MachinePipeliner::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:207:5 #4 0x428b7ea in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #5 0x4d1a913 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #6 0x4d1b192 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #7 0x4d1c06d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1752:27 #8 0x4d1c06d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1865 #9 0xa48ca3 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:611:8 #10 0xa4270f in main /b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:365:22 #11 0x7fec902572e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #12 0x971b69 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x971b69) llvm-svn: 363105
*	[PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner	Jinsong Ji	2019-06-11	1	-0/+78
\| \| \| \| \| \| \| \| \|	Implement necessary target hooks to enable MachinePipeliner for P9 only. The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9. Differential Revision: https://reviews.llvm.org/D62164 llvm-svn: 363085
*	[PowerPC][HTM]Fix $zero is not a GPRC register for builtin_ttest	Jinsong Ji	2019-06-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was found during HTM cleanup. Adding a test for builtin_ttest would expose following issue. * Bad machine code: Illegal physical register for instruction * - function: test10 - basic block: %bb.0 entry (0xf0e57497b58) - instruction: %5:crrc0 = TABORTWCI 0, $zero, 0 - operand 2: $zero $zero is not a GPRC register. LLVM ERROR: Found 1 machine code errors. Differential Revision: https://reviews.llvm.org/D63079 llvm-svn: 362974
*	[DAGCombine] Match a pattern where a wide type scalar value is stored by ↵	QingShan Zhang	2019-06-10	1	-266/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	several narrow stores This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c static void store64(u64 x, unsigned char* y) { for(int i = 0; i != 8; ++i) y[i] = (x >> ((7-i) * 8)) & 255; } static u64 load64(const unsigned char* y) { u64 res = 0; for(int i = 0; i != 8; ++i) res \|= (u64)(y[i]) << ((7-i) * 8); return res; } The load64 has been implemented by https://reviews.llvm.org/D26149 This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the targets supports it. Assuming little endian target: i8 p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; > ((i32)p) = val; i8 p = ... i32 val = ... p[0] = (val >> 24) & 0xFF; p[1] = (val >> 16) & 0xFF; p[2] = (val >> 8) & 0xFF; p[3] = (val >> 0) & 0xFF; > ((i32)p) = BSWAP(val); Differential Revision: https://reviews.llvm.org/D62897 llvm-svn: 362921
*	[NFC] Test if commit access granted.	Kai Luo	2019-06-10	1	-0/+1
\| \| \| \|	llvm-svn: 362917
*	[MachineScheduler] checkResourceLimit boundary condition update	Jinsong Ji	2019-06-07	5	-22/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we call checkResourceLimit in bumpCycle or bumpNode, and we know the resource count has just reached the limit (the equations are equal). We should return true to mark that we are resource limited for next schedule, or else we might continue to schedule in favor of latency for 1 more schedule and create a schedule that actually overbook the resource. When we call checkResourceLimit to estimate the resource limite before scheduling, we don't need to return true even if the equations are equal, as it shouldn't limit the schedule for it . Differential Revision: https://reviews.llvm.org/D62345 llvm-svn: 362805
*	[CodeGen] Generic Hardware Loop Support	Sam Parker	2019-06-07	2	-17/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum numbe of elements processed in an iteration of the loop body. Returns the updated number of elements remaining. llvm-svn: 362774
*	[PowerPC] Exploit the vector min/max instructions	Nemanja Ivanovic	2019-06-06	4	-422/+386
\| \| \| \| \| \| \| \| \| \|	Use the PPC vector min/max instructions for computing the corresponding operation as these should be faster than the compare/select sequences we currently emit. Differential revision: https://reviews.llvm.org/D47332 llvm-svn: 362759
*	[AIX] Implement function descriptor on SDAG	Jason Liu	2019-06-06	2	-21/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (1) Function descriptor on AIX On AIX, a called routine may have 2 distinct symbols associated with it: * A function descriptor (Name) * A function entry point (.Name) The descriptor structure on AIX is the same as those in the ELF V1 ABI: * The address of the entry point of the function. * The TOC base address for the function. * The environment pointer. The descriptor symbol uses the same name as the source level function in C. The function entry point is analogous to the symbol we would generate for a function in a non-descriptor-based ABI, except that it is renamed by prepending a ".". Which symbol gets referenced depends on the context: * Taking the address of the function references the descriptor symbol. * Calling the function references the entry point symbol. (2) Speaking of implementation on AIX, for direct function call target, we create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to replace original TargetGlobalAddress SDNode. Then down the path, we can take advantage of this MCSymbol. Patch by: Xiangling_L Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara Differential Revision: https://reviews.llvm.org/D62532 llvm-svn: 362735
*	[AIX] Implement call lowering with parameters could pass onto GPRs	Jason Liu	2019-06-06	2	-4/+203
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch implements SDAG call lowering on AIX for functions which only have parameters that could fit into GPRs. Reviewers: hubert.reinterpretcast, syzaara Differential Revision: https://reviews.llvm.org/D62823 llvm-svn: 362708
*	[PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible	Nemanja Ivanovic	2019-06-05	2	-0/+187
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally speaking, we lower to an optimal rotate sequence for nodes visible in the SDAG. However, there are instances where the two rotates are not visible at ISEL time - most notably those in a very common sequence when lowering switch statements to jump tables. A common situation is a switch on a 32-bit integer. This value has to have the upper 32 bits cleared and because jump table offsets are word offsets, the value needs to be shifted left by 2 bits. We currently emit the clear and the left shift as two separate instructions, but this is not needed as we can lower it to a single RLDIC. This patch just cleans that up. Differential revision: https://reviews.llvm.org/D60402 llvm-svn: 362576
*	[PowerPC][NFC] Add codegen test for consecutive stores of vector elements	Nemanja Ivanovic	2019-06-05	1	-0/+535
\| \| \| \| \| \| \| \| \|	NFC commit of a test case in order for the subsequent review to show differences in codegen. Differential revision: https://reviews.llvm.org/D62843 llvm-svn: 362573
*	Revert r362472 as it is breaking PPC build bots	Nemanja Ivanovic	2019-06-04	1	-49/+266
\| \| \| \| \| \| \|	The patch https://reviews.llvm.org/rL362472 broke PPC LNT buildbots. Reverting it to bring the bots back to green. llvm-svn: 362539
*	[NFC][Codegen][PowerPC] Autogenerate shift-cmp.ll test	Roman Lebedev	2019-06-04	1	-23/+23
\| \| \| \| \| \|	Being affected by upcoming patch llvm-svn: 362529
*	[PowerPC] P9 Scheduling Model: dispatching rule fixes	Jinsong Ji	2019-06-04	3	-204/+204
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is to address some of the problems in existing P9 resource modeling, especially about the dispatching rules. Instead of using a hypothetical DISPATCHER , we try to use the number of actual dispatch slots, and define SchedWriteRes to model dispatch rules, then update instruction classes according to dispatch rules. All the dispatch rules and instruction classes update are made according to POWER9 User Manual. Differential Revision: https://reviews.llvm.org/D61873 llvm-svn: 362509
*	[DAGCombine] Match a pattern where a wide type scalar value is stored by ↵	QingShan Zhang	2019-06-04	1	-266/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	several narrow stores This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c static void store64(u64 x, unsigned char* y) { for(int i = 0; i != 8; ++i) y[i] = (x >> ((7-i) * 8)) & 255; } static u64 load64(const unsigned char* y) { u64 res = 0; for(int i = 0; i != 8; ++i) res \|= (u64)(y[i]) << ((7-i) * 8); return res; } The load64 has been implemented by https://reviews.llvm.org/D26149 This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the targets supports it. Assuming little endian target: i8 p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; > ((i32)p) = val; i8 p = ... i32 val = ... p[0] = (val >> 24) & 0xFF; p[1] = (val >> 16) & 0xFF; p[2] = (val >> 8) & 0xFF; p[3] = (val >> 0) & 0xFF; > ((i32)p) = BSWAP(val); Differential Revision: https://reviews.llvm.org/D61843 llvm-svn: 362472
*	[PowerPC] add testcases for reordering LSR and PPCCTRLoops - NFC	Chen Zheng	2019-06-04	1	-0/+217
\| \| \| \|	llvm-svn: 362468
*	Propagate fmf for setcc/select folds	Michael Berg	2019-06-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects so that back ends can model this path for arithmetic folds on selects in SDAG. Reviewers: qcolombet, spatel Reviewed By: qcolombet Subscribers: nemanjai, jsji Differential Revision: https://reviews.llvm.org/D62552 llvm-svn: 362439
*	[PowerPC] Look through copies for compare elimination	Nemanja Ivanovic	2019-06-03	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \|	We currently miss the opportunities for optmizing comparisons in the peephole optimizer if the input is the result of a COPY since we look for record-form versions of the producing instruction. This patch simply lets the optimization peek through copies. Differential revision: https://reviews.llvm.org/D59633 llvm-svn: 362438
*	[PPC] Correctly adjust branch probability in PPCReduceCRLogicals	Guozhi Wei	2019-05-31	2	-1/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In PPCReduceCRLogicals after splitting the original MBB into 2, the 2 impacted branches still use original branch probability. This is unreasonable. Suppose we have following code, and the probability of each successor is 50%. condc = conda \|\| condb br condc, label %target, label %fallthrough It can be transformed to following, br conda, label %target, label %newbb newbb: br condb, label %target, label %fallthrough Since each branch has a probability of 50% to each successor, the total probability to %fallthrough is 25% now, and the total probability to %target is 75%. This actually changed the original profiling data. A more reasonable probability can be set to 70% to the false side for each branch instruction, so the total probability to %fallthrough is close to 50%. This patch assumes the branch target with two incoming edges have same edge frequency and computes new probability fore each target, and keep the total probability to original targets unchanged. Differential Revision: https://reviews.llvm.org/D62430 llvm-svn: 362237
*	Partial revert of revert of r361827: Add constrained intrinsic tests for ↵	Kevin P. Neal	2019-05-29	1	-0/+7528
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	powerpc64le. The powerpc64-"nonle" tests are removed. They fail because of a bug that Drew is currently working on that affects multiple targets. Submitted by: Drew Wock <drew.wock@sas.com> Reviewed by: Hal Finkel, Kevin P. Neal Approved by: Hal Finkel Differential Revision: http://reviews.llvm.org/D62388 llvm-svn: 361985
*	[CodeGen] Add lrint/llrint builtins	Adhemerval Zanella	2019-05-28	2	-0/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch add the ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an interger. The idea is to optimize lrint/llrint generation for AArch64 in a subsequent patch. Current semantic is just route it to libm symbol. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D62017 llvm-svn: 361875
*	Revert 361827. It broke the bots.	Kevin P. Neal	2019-05-28	1	-10810/+0
\| \| \| \|	llvm-svn: 361831
*	Add constrained intrinsic tests for powerpc64 and powerpc64le.	Kevin P. Neal	2019-05-28	1	-0/+10810
\| \| \| \| \| \| \| \| \|	Submitted by: Drew Wock Reviewed by: Hal Finkel Approved by: Hal Finkel Differential Revision: https://reviews.llvm.org/D62388 llvm-svn: 361827
*	[SelectionDAG] soften assertion when legalizing narrow vector FP ops	Sanjay Patel	2019-05-25	1	-0/+24
\| \| \| \| \| \| \| \| \| \|	The test based on PR42010: https://bugs.llvm.org/show_bug.cgi?id=42010 ...may show an inaccuracy for PPC's target defs, but we should not be so aggressive with an assert here. There's no telling what out-of-tree targets look like. llvm-svn: 361696
*	Implement call lowering without parameters on AIX	Jason Liu	2019-05-24	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \|	Summary:dd This patch implements call lowering for calls without parameters on AIX as initial support. Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma Differential Revision: https://reviews.llvm.org/D61948 llvm-svn: 361669
*	[PowerPC] Remove CRBits Copy Of Unset/set CBit	Stefan Pintilie	2019-05-24	2	-4/+188
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For the situation, where we generate the following code: crxor 8, 8, 8 < Some instructions> .LBB0_1: < Some instructions> cror 1, 8, 8 cror (COPY of CRbit) depends on the result of the crxor instruction. CR8 is known to be zero as crxor is equivalent to CRUNSET. We can simply use crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on any previous instruction. This patch will optimize it to: < Some instructions> .LBB0_1: < Some instructions> cror 1, 1, 1 Patch By: Victor Huang (NeHuang) Differential Revision: https://reviews.llvm.org/D62044 llvm-svn: 361632
*	[Power9] Add a specific heuristic to schedule the addi before the load	QingShan Zhang	2019-05-24	1	-1/+18
\| \| \| \| \| \| \| \| \| \|	When we are scheduling the load and addi, if all other heuristic didn't take effect, we will try to schedule the addi before the load, to hide the latency, and avoid the true dependency added by RA. And this only take effects for Power9. Differential Revision: https://reviews.llvm.org/D61930 llvm-svn: 361600
*	UpdateTestChecks: ppc32 triple support	Roman Lebedev	2019-05-23	1	-51/+241
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Appears identical to powerpc64{,le}. Regenerate test that is being affected by upcoming patch. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: nemanjai, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62339 llvm-svn: 361543
*	[NFC][PPC] Autogenerate vec_add_sub_quadword.ll test	Roman Lebedev	2019-05-23	1	-90/+140
\| \| \| \| \| \|	Being affected by (sub %x, C) -> add %X, (sub 0, C) 'for vectors' patch. llvm-svn: 361525
*	[NFC][PPC] Autogenerate vec_add_sub_doubleword.ll test	Roman Lebedev	2019-05-23	1	-42/+98
\| \| \| \| \| \|	Being affected by (sub %x, C) -> add %X, (sub 0, C) 'for vectors' patch. llvm-svn: 361524
*	[TargetLowering] Extend bool args to inline-asm according to getBooleanType	Kees Cook	2019-05-22	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This extends Krzysztof Parzyszek's X86-specific solution (https://reviews.llvm.org/D60208) to the generic code pointed out by James Y Knight. Reviewers: kparzysz, craig.topper, nickdesaulniers Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight Tags: #llvm Differential Revision: https://reviews.llvm.org/D60224 llvm-svn: 361404
*	[PPC64] Parse -elfv1 -elfv2 when specified on target triple	Fangrui Song	2019-05-22	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For big-endian powerpc64, the default ABI is ELFv1. OpenPower ABI ELFv2 is supported when -mabi=elfv2 is specified. FreeBSD support for PowerPC64 ELFv2 ABI with LLVM is in progress[1]. This patch adds an alternative way to specify ELFv2 ABI on target triple [2]. The following results are expected: ELFv1 when using: -target powerpc64-unknown-freebsd12.0 -target powerpc64-unknown-freebsd12.0 -mabi=elfv1 -target powerpc64-unknown-freebsd12.0-elfv1 ELFv2 when using: -target powerpc64-unknown-freebsd12.0 -mabi=elfv2 -target powerpc64-unknown-freebsd12.0-elfv2 [1] https://wiki.freebsd.org/powerpc/llvm-elfv2 [2] https://clang.llvm.org/docs/CrossCompilation.html Patch by Alfredo Dal'Ava Júnior! Differential Revision: https://reviews.llvm.org/D61950 llvm-svn: 361355
*	[PowerPC] [ISEL] select x-form instruction for unaligned offset	Chen Zheng	2019-05-22	2	-25/+26
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D62173 llvm-svn: 361346
*	Move csr-save-restore-order.ll to the right place	Yi-Hong Lyu	2019-05-21	1	-0/+168
\| \| \| \|	llvm-svn: 361306
*	[NFC][PowerPC] Add a test to verify if the scheduler schedule the addi ↵	QingShan Zhang	2019-05-21	1	-0/+106
\| \| \| \| \| \|	before the load. llvm-svn: 361221
*	[PowerPC] test cases for selecting x-form instruction for unaligned offset - ↵	Chen Zheng	2019-05-21	1	-0/+113
\| \| \| \| \| \|	NFC llvm-svn: 361219
*	Re-land r360859: "[MergeICmps] Simplify the code."	Clement Courbet	2019-05-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	With a fix for PR41917: The predecessor list was changing under our feet. - for (BasicBlock Pred : predecessors(EntryBlock_)) { + while (!pred_empty(EntryBlock_)) { + BasicBlock const Pred = *pred_begin(EntryBlock_); llvm-svn: 361009
*	Revert r360859: "Reland r360771 "[MergeICmps] Simplify the code.""	Nico Weber	2019-05-17	1	-1/+1
\| \| \| \| \| \|	It caused PR41917. llvm-svn: 360963
*	[CodeGen] Add lround/llround builtins	Adhemerval Zanella	2019-05-16	2	-0/+112
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch add the ISD::LROUND and ISD::LLROUND along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an interger. The idea is to optimize lround/llround generation for AArch64 in a subsequent patch. Current semantic is just route it to libm symbol. llvm-svn: 360889
*	RegAllocFast: Improve hinting heuristic	Matt Arsenault	2019-05-16	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trace through multiple COPYs when looking for a physreg source. Add hinting for vregs that will be copied into physregs (we only hinted for vregs getting copied to a physreg previously). Give hinted a register a bonus when deciding which value to spill. This is part of my rewrite regallocfast series. In fact this one doesn't even have an effect unless you also flip the allocation to happen from back to front of a basic block. Nonetheless it helps to split this up to ease review of D52010 Patch by Matthias Braun llvm-svn: 360887
*	Reland r360771 "[MergeICmps] Simplify the code."	Clement Courbet	2019-05-16	1	-1/+1
\| \| \| \| \| \|	This revision does not seem to be the culprit. llvm-svn: 360859
*	RegAlloc: try to fail more gracefully when out of registers	Nicolai Haehnle	2019-05-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The emitError path allows the program to continue, unlike report_fatal_error. This is friendlier to use cases where LLVM is embedded in a larger program, because the caller may be able to deal with the error somewhat gracefully. Change the number of requested NOP bytes in the AArch64 and PowerPC test cases to avoid triggering an unrelated assertion. The compilation still fails, as verified by the test. Change-Id: Iafb9ca341002a597b82e59ddc7a1f13c78758e3d Reviewers: arsenm, MatzeB Subscribers: qcolombet, nemanjai, wdng, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61489 llvm-svn: 360786
*	Revert r360771 "[MergeICmps] Simplify the code."	Clement Courbet	2019-05-15	1	-1/+1
\| \| \| \| \| \|	Breaks a bunch of builbdots. llvm-svn: 360776
*	[MergeICmps] Simplify the code.	Clement Courbet	2019-05-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of patching the original blocks, we now generate new blocks and delete the old blocks. This results in simpler code with a less twisted control flow (see the change in `entry-block-shuffled.ll`). This will make https://reviews.llvm.org/D60318 simpler by making it more obvious where control flow created and deleted. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, spatel Tags: #llvm Differential Revision: https://reviews.llvm.org/D61736 llvm-svn: 360771