bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[AArch64] Make assert messages uniform and general [NFC]	Mandeep Singh Grang	2017-06-28	4	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Make assert messages related to Darwin, ELF and COFF uniform. Reviewers: rnk, ruiu, compnerd, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34730 llvm-svn: 306589
*	[AArch64][Falkor] Attempt to fix Windows buildbots	Geoff Berry	2017-06-28	1	-1/+1
\| \| \| \|	llvm-svn: 306588
*	Reuse existing variables. NFC.	Rafael Espindola	2017-06-28	1	-11/+9
\| \| \| \|	llvm-svn: 306586
*	[AArch64][Falkor] Try to avoid exhausting HW prefetcher resources when unrolling.	Geoff Berry	2017-06-28	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34533 llvm-svn: 306584
*	Reuse existing variable. NFC.	Rafael Espindola	2017-06-28	1	-2/+2
\| \| \| \|	llvm-svn: 306582
*	Fix PR33625.	Rafael Espindola	2017-06-28	1	-1/+1
\| \| \| \| \| \|	We were failing to convert this expression to pcrel. llvm-svn: 306573
*	Don't repeat name in comment and format. NFC.	Rafael Espindola	2017-06-28	1	-19/+15
\| \| \| \|	llvm-svn: 306568
*	Don't repeat names and reformat. NFC.	Rafael Espindola	2017-06-28	1	-46/+37
\| \| \| \|	llvm-svn: 306556
*	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.	Geoff Berry	2017-06-28	12	-15/+21
\| \| \| \| \| \| \| \| \| \|	Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554
*	[AArch64] AArch64CondBrTuningPass generates wrong branch instructions	Alexandros Lamprineas	2017-06-28	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	Some conditional branch instructions generated by this pass are checking the wrong condition code. The instructions TBZ and TBNZ are transformed into B.GE and B.LT instead of B.PL and B.MI respectively. They should only be checking the Negative bit. Differential Revision: https://reviews.llvm.org/D34743 llvm-svn: 306550
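
Why the condition codes matter can be checked with a small model of the flags. This is only an illustrative sketch, not code from the pass: it assumes the architectural definitions PL = N clear, MI = N set, GE = N equals V, LT = N differs from V, and the helper name `cond_holds` is hypothetical.

```python
# Simplified model of AArch64 condition codes over the N and V flags.
# Illustrative only; `cond_holds` is a hypothetical helper, not LLVM code.
def cond_holds(cond, n, v):
    """Return True if condition code `cond` passes for flags N=n, V=v."""
    if cond == "PL":  # plus: Negative flag clear
        return not n
    if cond == "MI":  # minus: Negative flag set
        return n
    if cond == "GE":  # signed greater-or-equal: N == V
        return n == v
    if cond == "LT":  # signed less-than: N != V
        return n != v
    raise ValueError(f"unknown condition code: {cond}")


# Flag states (N, V) where the signed codes disagree with the sign-only codes.
mismatches = [
    (n, v)
    for n in (False, True)
    for v in (False, True)
    if cond_holds("GE", n, v) != cond_holds("PL", n, v)
]
```

In this model the two pairs of codes agree only while the overflow flag is clear, which is why a test of the sign bit alone has to use B.PL and B.MI.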
*	Don't repeat name in comments. 80 columns. NFC.	Rafael Espindola	2017-06-28	1	-22/+16
\| \| \| \|	llvm-svn: 306548
*	[ARM] Improve if-conversion for M-class CPUs without branch predictors	John Brawn	2017-06-28	5	-14/+85
\| \| \| \| \| \| \| \| \| \| \| \| \|	The current heuristic in isProfitableToIfCvt assumes we have a branch predictor, and so gives the wrong answer in some cases when we don't. This patch adds a subtarget feature to indicate that a subtarget has no branch predictor, and changes the heuristic in isProfitableToIfCvt when it's present. This gives a slight overall improvement in a set of embedded benchmarks on Cortex-M4 and Cortex-M33. Differential Revision: https://reviews.llvm.org/D34398 llvm-svn: 306547
*	[GlobalISel][X86] Support bitwise operations : G_AND, G_OR, G_XOR	Igor Breger	2017-06-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Support G_AND, G_OR, G_XOR for i8/i16/i32/i64. Selection done via TableGen'erated code. Reviewers: zvi, guyblank, aymanmus, m_zuckerman Reviewed By: aymanmus Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34605 llvm-svn: 306533
*	Reverting commit 306414 on behalf of @gadi.haber	Michael Zuckerman	2017-06-28	2	-5225/+1634
\| \| \| \|	llvm-svn: 306532
*	[X86] Correct dwarf unwind information in function epilogue	Petar Jovanovic	2017-06-28	3	-8/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CFI instructions that set appropriate cfa offset and cfa register are now inserted in emitEpilogue() in X86FrameLowering. Majority of the changes in this patch: 1. Ensure that CFI instructions do not affect code generation. 2. Enable maintaining correct information about cfa offset and cfa register in a function when basic blocks are reordered, merged, split, duplicated. These changes are target independent and described below. Changed CFI instructions so that they: 1. are duplicable 2. are not counted as instructions when tail duplicating or tail merging 3. can be compared as equal Add information to each MachineBasicBlock about cfa offset and cfa register that are valid at its entry and exit (incoming and outgoing CFI info). Add support for updating this information when basic blocks are merged, split, duplicated, created. Add a verification pass (CFIInfoVerifier) that checks that outgoing cfa offset and register of predecessor blocks match incoming values of their successors. Incoming and outgoing CFI information is used by a late pass (CFIInstrInserter) that corrects CFA calculation rule for a basic block if needed. That means that additional CFI instructions get inserted at basic block beginning to correct the rule for calculating CFA. Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D18046 llvm-svn: 306529
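
The check described for the verification pass can be sketched roughly as follows. This is not the CFIInfoVerifier implementation: the dictionary layout, the block names and the function name `find_cfi_mismatches` are all assumptions made for illustration.

```python
# Rough sketch of the predecessor/successor consistency check.
# Data layout and names are hypothetical, not taken from the patch.
def find_cfi_mismatches(blocks, edges):
    """Return the (pred, succ) edges whose CFI state does not line up.

    blocks maps a block name to {"in": (cfa_offset, cfa_register),
                                 "out": (cfa_offset, cfa_register)}.
    edges is an iterable of (pred, succ) block-name pairs.
    """
    return [
        (pred, succ)
        for pred, succ in edges
        if blocks[pred]["out"] != blocks[succ]["in"]
    ]


# An epilogue block placed above another block leaves a stale CFA rule.
example_blocks = {
    "entry":    {"in": (8, "rsp"),  "out": (16, "rbp")},
    "epilogue": {"in": (16, "rbp"), "out": (8, "rsp")},
    "body":     {"in": (16, "rbp"), "out": (16, "rbp")},
}
example_edges = [("entry", "epilogue"), ("entry", "body"), ("epilogue", "body")]
bad_edges = find_cfi_mismatches(example_blocks, example_edges)
```

A late pass in the spirit of the one described would then insert corrective CFI instructions at the start of each block reached through a mismatching edge.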
*	[ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8).	Kristof Beyls	2017-06-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The benchmarking summarized in http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed this is beneficial for a wide range of cores. As is to be expected, quite a few small adaptations are needed to the regression tests, as the difference in scheduling results in: - Quite a few small instruction schedule differences. - A few changes in register allocation decisions caused by different instruction schedules. - A few changes in IfConversion decisions, due to a difference in instruction schedule and/or the estimated cost of a branch mispredict. llvm-svn: 306514
*	[AMDGPU] Add pattern for v_alignbit_b32 with immediate	Stanislav Mekhanoshin	2017-06-28	2	-3/+6
\| \| \| \| \| \| \| \|	If immediate in shift is less than 32 we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 llvm-svn: 306500
*	[COFF, ARM64] Add support for Windows ARM64 COFF format	Mandeep Singh Grang	2017-06-27	14	-5/+218
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the llvm part of the initial implementation to support Windows ARM64 COFF format. I will gradually add more functionality in subsequent patches. Reviewers: ruiu, rnk, t.p.northover, compnerd Reviewed By: ruiu, compnerd Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34705 llvm-svn: 306490
*	[AArch64] Inline callee if its target-features are a subset of the caller	Florian Hahn	2017-06-27	2	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Similar to X86, it should be safe to inline callees if their target-features are a subset of the caller. This change matches GCC's inlining behavior with respect to attributes [1]. [1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes Reviewers: kristof.beyls, javed.absar, rengolin, t.p.northover Reviewed By: t.p.northover Subscribers: aemerson, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34698 llvm-svn: 306478
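
The subset rule amounts to a set comparison. A minimal sketch, assuming features are represented as plain strings; the feature names below and the function name `can_inline` are placeholders, not the backend's actual API:

```python
# Minimal sketch of the subset rule; names are hypothetical placeholders.
def can_inline(caller_features, callee_features):
    """Allow inlining only if every callee feature is also enabled in the caller."""
    return set(callee_features) <= set(caller_features)


caller = {"neon", "crc", "crypto"}
ok = can_inline(caller, {"neon", "crc"})         # callee needs nothing extra
blocked = can_inline({"neon"}, {"neon", "crc"})  # callee needs a feature the caller lacks
```

The direction matters: a caller with more features can safely absorb a callee compiled for fewer, but not the other way round.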
*	clang-format a file.	Rafael Espindola	2017-06-27	1	-59/+64
\| \| \| \| \| \| \|	It had a few inconsistent indentations that made a followup patch hard to read. llvm-svn: 306474
*	[AArch64] Performance enhancements for Cavium ThunderX2 T99	Joel Jones	2017-06-27	2	-166/+1059
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables significant performance enhancements to the Cavium ThunderX2T99 LLVM backend, as observed by running SPEC2K6, by adding more detailed scheduling information. Related Bugzilla bug: http://bugs.llvm.org/show_bug.cgi?id=32562 Patch by: steleman Differential Revision: https://reviews.llvm.org/D31801 llvm-svn: 306462
*	[Hexagon] Use proper predicate register state when expanding PS_vselect	Krzysztof Parzyszek	2017-06-27	1	-3/+15
\| \| \| \|	llvm-svn: 306458
*	[AMDGPU] Add 2 new alignbit patterns	Stanislav Mekhanoshin	2017-06-27	1	-0/+9
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34655 llvm-svn: 306449
*	[AMDGPU] Simplify setcc (sext from i1 b), -1\|0, cc	Stanislav Mekhanoshin	2017-06-27	1	-1/+29
\| \| \| \| \| \| \| \| \| \| \|	Depending on the compare code, the result can be either the argument of sext or its negation. This helps to avoid v_cndmask_b64 instruction for sext. A reversed value can be further simplified and folded into its parent comparison if possible. Differential Revision: https://reviews.llvm.org/D34545 llvm-svn: 306446
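
The equivalence behind this simplification can be checked exhaustively for the equality compare codes. The sketch below assumes only `eq` and `ne` (other compare codes are not modelled), and all function names are hypothetical:

```python
# Exhaustive check of setcc (sext b), C, cc for C in {-1, 0}, cc in {eq, ne}.
# Only eq/ne are modelled; helper names are hypothetical.
def sext_i1(b):
    """Sign-extend a 1-bit boolean: True -> -1, False -> 0."""
    return -1 if b else 0


def setcc(lhs, rhs, cc):
    """Original form: compare the sign-extended value against the constant."""
    return lhs == rhs if cc == "eq" else lhs != rhs


def simplified(b, rhs, cc):
    """Simplified form: the result is b itself or its negation."""
    keeps_b = (rhs == -1) == (cc == "eq")
    return b if keeps_b else not b


all_agree = all(
    setcc(sext_i1(b), rhs, cc) == simplified(b, rhs, cc)
    for b in (False, True)
    for rhs in (-1, 0)
    for cc in ("eq", "ne")
)
```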
*	[Hexagon] Update kills in hexagon-nvj even more properly than before	Krzysztof Parzyszek	2017-06-27	2	-36/+29
\| \| \| \| \| \| \|	Account for the fact that both the feeder and the compare can be moved over instructions that kill registers. llvm-svn: 306443
*	[AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0	Stanislav Mekhanoshin	2017-06-27	1	-2/+28
\| \| \| \| \| \| \| \| \| \|	Also factored out function to check if a boolean is an already deserialized value which does not require v_cndmask_b32 to be loaded. Added binary logical operators to its check. Differential Revision: https://reviews.llvm.org/D34500 llvm-svn: 306439
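
The combine in the subject line rests on a simple identity, sketched here for 32-bit values. The function names are hypothetical and this models the arithmetic only, not the DAG combine itself:

```python
# and x, (sext cc from i1) versus select cc, x, 0 on 32-bit values.
# Helper names are hypothetical; this models the arithmetic only.
MASK32 = 0xFFFFFFFF


def and_with_sext(x, cc):
    """AND x with the sign-extended condition (all ones or all zeros)."""
    mask = MASK32 if cc else 0
    return x & mask


def select(cc, x):
    """select cc, x, 0"""
    return x if cc else 0


samples = [0, 1, 0x7FFFFFFF, 0x80000000, MASK32]
identity_holds = all(
    and_with_sext(x, cc) == select(cc, x)
    for x in samples
    for cc in (False, True)
)
```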
*	[X86][AsmParser][MS-compatibility] Binary/Unary operators enhancements	Coby Tayree	2017-06-27	1	-37/+75
\| \| \| \| \| \| \| \| \| \| \|	Introducing MOD binary operator https://msdn.microsoft.com/en-us/library/hha180wt.aspx Enhancing unary operators NEG and NOT, to support more complex patterns Differential Revision: https://reviews.llvm.org/D33876 llvm-svn: 306425
*	Updated and extended the information about each instruction in HSW and SNB to include the following data:	Gadi Haber	2017-06-27	2	-1634/+5225
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	• static latency • number of uOps of which the instruction consists • all ports used by the instruction Reviewers: RKSimon, zvi, aymanmus, m_zuckerman Differential Revision: https://reviews.llvm.org/D33897 llvm-svn: 306414
*	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions	Sam Kolton	2017-06-27	6	-33/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 1. Instruction V_CVT_U32_F32 allows an omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks whether the SDWA pseudo instruction has an OMod operand and only then copies it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 llvm-svn: 306413
*	[AArch64] Update successor probabilities after ccmp-conversion	Matthew Simpson	2017-06-27	1	-4/+44
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies the conditional compares pass so that it keeps successor probabilities up-to-date after the conversion. Previously, successor probabilities were being normalized to a uniform distribution, even though they may have been heavily biased prior to the conversion (e.g., if one of the edges was the back edge of a loop). This loss of information affected passes later in the pipeline. Differential Revision: https://reviews.llvm.org/D34109 llvm-svn: 306412
*	[mips] Add instruction aliases for ds(r\|l)l.	Simon Dardis	2017-06-27	2	-3/+21
\| \| \| \| \| \| \|	Add the instruction aliases for ds(r\|l)l for the two operand alias of ds(r\|l)lv and the aliases ds(r\|l)l with the three register operands. llvm-svn: 306405
*	Recommitting rL305465 after fixing bug in TableGen in rL306251 & rL306371	Ayman Musa	2017-06-27	2	-25/+715
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions). AVX512 compare instructions return v*i1 types. In cases where the number of elements in the returned value is less than 8, clang adds zeroes to get a mask of v8i1 type. Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes. The fact that AVX512 compare instructions put the result in a k register and zero all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class. When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in the instruction selection phase and transform it into one avx512 cmp instruction. Differential Revision: https://reviews.llvm.org/D33188 llvm-svn: 306402
*	fix trivial typos, NFC	Hiroshi Inoue	2017-06-27	1	-2/+2
\| \| \| \| \| \|	succesor -> successor llvm-svn: 306393
*	[ARM] GlobalISel: Support G_SELECT for pointers	Diana Picus	2017-06-27	1	-0/+1
\| \| \| \| \| \|	All we need to do is mark it as legal, otherwise it's just like s32. llvm-svn: 306390
*	[globalisel][tablegen] Add support for EXTRACT_SUBREG.	Daniel Sanders	2017-06-27	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: After this patch, we finally have test cases that require multiple instruction emission. Depends on D33590 Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D33596 llvm-svn: 306388
*	[mips] Refine the condition for when to use CALL16 vs a GOT displacement.	Simon Dardis	2017-06-27	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Borrow from the logic for 'jal' in MipsAsmParser::processInstruction and add the extra condition of bypassing CALL16 if the destination symbol is an ELF symbol with STB_LOCAL binding. Patch by: John Baldwin Reviewers: sdardis Differential Revision: https://reviews.llvm.org/D33999 llvm-svn: 306387
*	[ARM] GlobalISel: Support G_SELECT for i32	Diana Picus	2017-06-27	3	-0/+65
\| \| \| \| \| \| \| \| \| \|	* Mark as legal for (s32, i1, s32, s32) * Map everything into GPRs * Select to two instructions: a CMP of the condition against 0, to set the flags, and a MOVCCr to select between the two inputs based on the flags that we've just set llvm-svn: 306382
*	AMDGPU: M0 operands to spill/restore opcodes are dead	Nicolai Haehnle	2017-06-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 llvm-svn: 306375
*	Fixed the warning introduced by r306289 to make ubuntu-gcc7.1-werror bot green.	Galina Kistanova	2017-06-27	1	-1/+1
\| \| \| \|	llvm-svn: 306369
*	[PowerPC] set optimization level in SelectionDAGISel	Hiroshi Inoue	2017-06-27	3	-6/+8
\| \| \| \| \| \| \| \| \|	PowerPC backend does not pass the current optimization level to SelectionDAGISel and so SelectionDAGISel works with the default optimization level regardless of the current optimization level. This patch makes the PowerPC backend set the optimization level correctly. Differential Revision: https://reviews.llvm.org/D34615 llvm-svn: 306367
*	[AVR] Migrate to new MCAsmBackend applyFixup and processFixupValue	Leslie Zhai	2017-06-27	2	-28/+26
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: rafael, dylanmckay, jroelofs, meadori Reviewed By: rafael, meadori Subscribers: meadori, llvm-commits Differential Revision: https://reviews.llvm.org/D34551 llvm-svn: 306359
*	Fix the bug when handling shufflevector for aarch64.	Dehao Chen	2017-06-26	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes https://bugs.llvm.org/show_bug.cgi?id=33600 Reviewers: mssimpso, davidxl, Carrot Reviewed By: mssimpso Subscribers: aemerson, rengolin, sanjoy, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34641 llvm-svn: 306334
*	AArch64: legalize G_EXTRACT operations.	Tim Northover	2017-06-26	2	-4/+18
\| \| \| \| \| \| \|	This is the dual problem to legalizing G_INSERTs so most of the code and testing was cribbed from there. llvm-svn: 306328
*	AArch64: remove all kill flags when extending register liveness.	Tim Northover	2017-06-26	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	When we forward a stored value to a load and eliminate it entirely we need to make sure the liveness of the register is maintained all the way to its use. Previously we only cleared liveness on the store doing the forwarding, but there could be other killing uses in between. We already do the right thing when the load has to be converted into something else, it was just this one path that skipped it. llvm-svn: 306318
*	AMDGPU: Setup SP/FP in callee function prolog/epilog	Matt Arsenault	2017-06-26	3	-2/+78
\| \| \| \|	llvm-svn: 306312
*	[SystemZ] Fix missing emergency spill slot corner case	Ulrich Weigand	2017-06-26	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We sometimes need emergency spill slots for the register scavenger. This may be the case when code needs to access a stack slot that has an offset of 4096 or more relative to the stack pointer. To make that determination, processFunctionBeforeFrameFinalized currently simply checks the total stack frame size of the current function. But this is not enough, since code may need to access stack slots in the caller's stack frame as well, in particular incoming arguments stored on the stack. This commit fixes the problem by taking argument slots into account. llvm-svn: 306305
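
The corner case can be illustrated with a simplified reachability check. This is a sketch under stated assumptions, not the code in processFunctionBeforeFrameFinalized: the 4096 threshold comes from the message above, while the function name and the `incoming_arg_bytes` parameter are hypothetical.

```python
# Simplified reachability check; not the actual SystemZ frame-lowering code.
# `needs_emergency_slots` and `incoming_arg_bytes` are hypothetical names.
DISPLACEMENT_LIMIT = 4096  # offsets at or above this need a scavenged register


def needs_emergency_slots(frame_size, incoming_arg_bytes=0):
    """Return True if some stack access may reach offset 4096 or more from SP.

    frame_size:         size of the current function's stack frame in bytes
    incoming_arg_bytes: extent of stack arguments living in the caller's frame
    """
    max_reach = frame_size + incoming_arg_bytes
    return max_reach >= DISPLACEMENT_LIMIT


frame_only = needs_emergency_slots(4000)     # the old check: looks safe
with_args = needs_emergency_slots(4000, 200) # argument slots push past the limit
```

With only the frame size considered, a 4000-byte frame appears to be fully reachable; adding the caller-side argument area shows that it is not.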
*	[inline asm] dot operator while using imm generates wrong ir + asm - llvm part	Marina Yatsina	2017-06-26	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Inline asm dot operator while using imm generates wrong ir and asm. This also fixes bugzilla 32987: https://bugs.llvm.org//show_bug.cgi?id=32987 The clang part of the review that contains the test can be found here: https://reviews.llvm.org/D33040 Commit on behalf of zizhar. Differential Revision: https://reviews.llvm.org/D33039 llvm-svn: 306300
*	[X86][AVX-512] Don't raise inexact in ceil, floor, round, trunc.	Ahmed Bougacha	2017-06-26	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \|	The non-AVX-512 behavior was changed in r248266 to match N1778 (C bindings for IEEE-754 (2008)), which defined the four functions to not raise the inexact exception ("rint" is still defined as raising it). Update the AVX-512 lowering of these functions to match that: it should not be different. llvm-svn: 306299
*	AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal	Tom Stellard	2017-06-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 llvm-svn: 306298
*	[x86] transform vector inc/dec to use -1 constant (PR33483)	Sanjay Patel	2017-06-26	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert vector increment or decrement to sub/add with an all-ones constant: add X, <1, 1...> --> sub X, <-1, -1...> sub X, <1, 1...> --> add X, <-1, -1...> The all-ones vector constant can be materialized using a pcmpeq instruction that is commonly recognized as an idiom (has no register dependency), so that's better than loading a splat 1 constant. AVX512 uses 'vpternlogd' for 512-bit vectors because there is apparently no better way to produce 512 one-bits. The general advantages of this lowering are: 1. pcmpeq has lower latency than a memop on every uarch I looked at in Agner's tables, so in theory, this could be better for perf, but... 2. That seems unlikely to affect any OOO implementation, and I can't measure any real perf difference from this transform on Haswell or Jaguar, but... 3. It doesn't look like it from the diffs, but this is an overall size win because we eliminate 16 - 64 constant bytes in the case of a vector load. If we're broadcasting a scalar load (which might itself be a bug), then we're replacing a scalar constant load + broadcast with a single cheap op, so that should always be smaller/better too. 4. This makes the DAG/isel output more consistent - we use pcmpeq already for padd x, -1 and psub x, -1, so we should use that form for +1 too because we can. If there's some reason to favor a constant load on some CPU, let's make the reverse transform for all of these cases (either here in the DAG or in a later machine pass). This should fix: https://bugs.llvm.org/show_bug.cgi?id=33483 Differential Revision: https://reviews.llvm.org/D34336 llvm-svn: 306289
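
The two rewrites are plain modular-arithmetic identities, which can be checked lane by lane. A small sketch assuming 32-bit lanes; the helper names are made up for illustration and nothing here reflects the actual DAG code:

```python
# Lane-wise check that add X, <1...> == sub X, <-1...> (and the reverse)
# under 32-bit wrap-around. Helper names are hypothetical.
MASK32 = 0xFFFFFFFF
ALL_ONES = 0xFFFFFFFF  # the bit pattern of -1 in a 32-bit lane


def vadd(lanes, splat):
    """Add a splat constant to every lane, wrapping at 32 bits."""
    return [(x + splat) & MASK32 for x in lanes]


def vsub(lanes, splat):
    """Subtract a splat constant from every lane, wrapping at 32 bits."""
    return [(x - splat) & MASK32 for x in lanes]


lanes = [0, 1, 0x7FFFFFFF, 0xFFFFFFFF]
inc_matches = vadd(lanes, 1) == vsub(lanes, ALL_ONES)
dec_matches = vsub(lanes, 1) == vadd(lanes, ALL_ONES)
```

Because the identities hold for every lane value, including the wrap-around cases, the choice between the two forms is purely a question of how cheaply the constant can be materialized.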