bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC	Matthias Braun	2016-05-03	8	-10/+10
	The block must not be nullptr for the addLiveIns()/addLiveOuts() functions. llvm-svn: 268340
*	LivePhysRegs: Automatically determine presence of pristine regs.	Matthias Braun	2016-05-03	6	-8/+8
	Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts(). We need to respect pristine registers after prologue epilogue insertion. Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it. There are three cases that did not set AddPristineAndCSRs to true even after register allocation: - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers, so use the new addLiveOutsNoPristines() to maintain this behaviour. - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug; it should do the right thing automatically now. - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes. llvm-svn: 268336
*	[X86] Model FAULTING_LOAD_OP as a terminator and branch.	Quentin Colombet	2016-05-02	1	-2/+2
	This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have performed wrong transformations! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2-byte offset in implicit-null-check.ll comes from the fact that the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequent commit. llvm-svn: 268327
*	[X86][SSE] Added placeholder for 128/256-bit wide shuffle combines	Simon Pilgrim	2016-05-02	1	-6/+14
	Began adding a placeholder for future support of vperm2f128/vshuff64x2 style 128/256-bit wide shuffles. llvm-svn: 268306
*	AMDGPU: Custom lower v2i32 loads and stores	Matt Arsenault	2016-05-02	1	-7/+39
	This will allow us to split up 64-bit private accesses when necessary. llvm-svn: 268296
*	AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratch	Tom Stellard	2016-05-02	1	-2/+1
	We were using v_readlane_b32 with the lane set to zero, but this won't work if thread 0 is not active. Differential Revision: http://reviews.llvm.org/D19745 llvm-svn: 268295
*	AMDGPU: Make i64 loads/stores promote to v2i32	Matt Arsenault	2016-05-02	2	-55/+12
	Now that unaligned access expansion should not attempt to produce i64 accesses, we can remove the hack in PreprocessISelDAG where this is done. This allows splitting i64 private accesses while allowing the new add nodes indexing the vector components to be folded with the base pointer arithmetic. llvm-svn: 268293
*	Fix instance of -Winconsistent-missing-override in AMDGPU code	Reid Kleckner	2016-05-02	1	-1/+1
	llvm-svn: 268289
*	AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratch	Tom Stellard	2016-05-02	1	-1/+1
	Summary: When we restore an SGPR value from scratch, we first load it into a temporary VGPR and then use v_readlane_b32 to copy the value from the VGPR back into an SGPR. We weren't setting the kill flag on the VGPR in the v_readlane_b32 instruction, so the register scavenger wasn't able to re-use this temp value later. I wasn't able to create a lit test for this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19744 llvm-svn: 268287
*	ARM: fix handling of SUB immediates in peephole opt.	Tim Northover	2016-05-02	1	-12/+30
	We were negating an immediate that was going to be used in a SUBri form unnecessarily. Since ADD/SUB are very similar we can do that, but we have to change the SUB to an ADD at the same time. This also applies to ADD, and allows us to handle a slightly larger range of immediates for those two operations. rdar://25992245 llvm-svn: 268276
*	[NVPTX] Fix sign/zero-extending ldg/ldu instruction selection	Justin Holewinski	2016-05-02	3	-48/+77
	Summary: We don't have sign-/zero-extending ldg/ldu instructions defined, so we need to emulate them with explicit CVTs. We were originally handling the i8 case, but not any other cases. Fixes PR26185 Reviewers: jingyue, jlebar Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D19615 llvm-svn: 268272
*	AMDGPU: Move R600 specific code out of AMDGPUISelLowering.cpp	Tom Stellard	2016-05-02	3	-39/+51
	Reviewers: arsenm Subscribers: jvesely, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19736 llvm-svn: 268267
*	AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260	Tom Stellard	2016-05-02	1	-1/+2
	We can't use MI->getDebugLoc() when MI is an iterator that could be MBB.end(). llvm-svn: 268265
*	AMDGPU/SI: Use the hazard recognizer to break SMEM soft clauses	Tom Stellard	2016-05-02	3	-4/+72
	Summary: Add support for detecting hazards in SMEM soft clauses, so that we only break the clauses when necessary, either by adding s_nop or re-ordering other alu instructions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18870 llvm-svn: 268260
*	AMDGPU: llvm.SI.fs.constant is a source of divergence	Nicolai Haehnle	2016-05-02	1	-0/+1
	Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 llvm-svn: 268259
*	[WebAssembly] Rename memory_size intrinsic to current_memory	Derek Schuff	2016-05-02	1	-9/+9
	This follows the recent renaming in the wasm spec. llvm-svn: 268255
*	AMDGPU/SI: Use hazard recognizer to detect DPP hazards	Tom Stellard	2016-05-02	3	-55/+27
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18603 llvm-svn: 268247
*	[X86][SSE] Dropped X86ISD::FGETSIGNx86 and use MOVMSK instead for FGETSIGN ↵	Simon Pilgrim	2016-05-02	4	-37/+12
	lowering movmsk.ll tests are unchanged. llvm-svn: 268237
*	Cleanup comments. NFC.	Chad Rosier	2016-05-02	2	-3/+4
	llvm-svn: 268236
*	Cleanup comments. NFC.	Chad Rosier	2016-05-02	1	-4/+3
	llvm-svn: 268235
*	Silence unused variable warnings; NFC.	Aaron Ballman	2016-05-02	1	-9/+4
	llvm-svn: 268234
*	Enable the X86 call frame optimization for the 64-bit targets that allow it.	David L Kreitzer	2016-05-02	2	-16/+36
	Fixes PR27241. Differential Revision: http://reviews.llvm.org/D19688 llvm-svn: 268227
*	[SystemZ] Fix in restoreCalleeSavedRegisters()	Jonas Paulsson	2016-05-02	1	-1/+2
	Only add operands for GRs to the LMG. Reviewed by Ulrich Weigand. llvm-svn: 268216
*	[SystemZ] Mark CC defs as dead whenever possible.	Jonas Paulsson	2016-05-02	3	-5/+25
	Marking implicit CC defs as dead everywhere, except when CC is actually defined and used explicitly, is important since the post-ra scheduler will otherwise insert edges between instructions unnecessarily. Also temporarily disable the LA(Y) -> AGSI optimization in foldMemoryOperandImpl(), since this introduces a def of the CC reg, which is illegal unless it is known to be dead. Reviewed by Ulrich Weigand. llvm-svn: 268215
*	[X86] Fix a bug in LOCK arithmetic operation pattern matching where the ↵	Craig Topper	2016-05-02	1	-1/+1
	wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates. This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first. llvm-svn: 268212
*	[AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While ↵	Craig Topper	2016-05-01	1	-4/+4
	there fix the execution domain for VPACKSSDW/VPACKUSDW. llvm-svn: 268200
*	Change AVX512 broadcastsd/ss patterns interaction with spilling. New ↵	Igor Breger	2016-05-01	3	-110/+98
	implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (turning it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190
*	[AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when ↵	Craig Topper	2016-05-01	1	-3/+3
	VLX and BWI are supported. llvm-svn: 268189
*	[AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB ↵	Craig Topper	2016-05-01	1	-13/+14
	and VPMADDUBSW/VPMADDWD. llvm-svn: 268188
*	[AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also ↵	Craig Topper	2016-05-01	1	-16/+16
	marked as requiring VLX. llvm-svn: 268186
*	[X86] Add an AddedComplexity to another pattern to put it near similar in ↵	Craig Topper	2016-05-01	1	-2/+1
	the output file. llvm-svn: 268184
*	[X86] Remove a seemingly unused pattern. The same pattern appears elsewhere ↵	Craig Topper	2016-05-01	1	-2/+0
	with an AddedComplexity that made this unreachable. llvm-svn: 268183
*	[X86] Add AddedComplexity to keep some similar patterns near each other in ↵	Craig Topper	2016-05-01	1	-0/+1
	the output file. llvm-svn: 268181
*	[X86] Remove some redundant selection patterns.	Craig Topper	2016-05-01	2	-11/+0
	llvm-svn: 268180
*	[AVX512] Replace vector_extract with extractelt in some patterns. They mean ↵	Craig Topper	2016-05-01	1	-5/+5
	the same thing but vector_extract is deprecated. NFC llvm-svn: 268179
*	[AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.	Craig Topper	2016-05-01	1	-4/+7
	llvm-svn: 268174
*	[X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.	Craig Topper	2016-04-30	2	-13/+9
	llvm-svn: 268164
*	Add missing override.	Rafael Espindola	2016-04-30	1	-1/+2
	llvm-svn: 268163
*	AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits	Tom Stellard	2016-04-30	1	-6/+0
	This was supposed to be part of r268143. llvm-svn: 268154
*	[PowerPC/QPX] Fix the load/splat peephole with overlapping reads	Hal Finkel	2016-04-30	1	-1/+9
	If, in between the splat and the load (which does an implicit splat), there is a read of the splat register, then that register must have another earlier definition. In that case, we can't replace the load's destination register with the splat's destination register. Unfortunately, I don't have a small or non-fragile test case. llvm-svn: 268152
*	AMDGPU/SI: Enable the post-ra scheduler	Tom Stellard	2016-04-30	9	-18/+324
	Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143
*	AMDGPU: Fix crash with unreachable terminators.	Matt Arsenault	2016-04-29	1	-12/+27
	If a block has no successors because it ends in unreachable, this was accessing an invalid iterator. Also stop counting instructions that don't emit any real instructions. llvm-svn: 268119
*	Differential Revision: http://reviews.llvm.org/D19733	Sriraman Tallam	2016-04-29	3	-4/+3
	llvm-svn: 268106
*	AMDGPU: Add kernarg.segment.ptr intrinsic	Matt Arsenault	2016-04-29	1	-0/+5
	llvm-svn: 268105
*	AMDGPU/SI: Move post regalloc run of SIShrinkInstructions	Matt Arsenault	2016-04-29	1	-5/+1
	Move to addPreEmitPass. This is so it runs after post-RA scheduling so we can merge s_nops emitted by the scheduler and hazard recognizer. llvm-svn: 268095
*	Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. ↵	Artem Tamazov	2016-04-29	6	-19/+39
	Rework M0 exclusion for SMRD." Previously reverted by r267752. r267733 review: Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 268066
*	[PPC] Enable shuffling of VSX vectors	Guozhi Wei	2016-04-29	1	-4/+2
	This patch fixes PR27078 by enabling shuffling of vectors if VSX is available. llvm-svn: 268064
*	[mips][ias] Move createCpRestoreMemOp to MipsTargetStreamer. NFC.	Daniel Sanders	2016-04-29	3	-38/+57
	Summary: This removes the temporary call to isIntegratedAssemblerRequired() which was added recently. Its effect is now achieved directly in the MipsTargetStreamer hierarchy. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D19715 llvm-svn: 268058
*	Fix NDEBUG build: variables used only in debug code causing compile error	Krzysztof Parzyszek	2016-04-29	1	-4/+8
	llvm-svn: 268057
*	[mips][FastISel] A store is not a load.	Simon Dardis	2016-04-29	1	-1/+1
	Correct trivial error. One of the failing tests from PR/27458. Reviewers: dsanders, vkalintiris, mcrosier Differential Review: http://reviews.llvm.org/D19726 llvm-svn: 268053