bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Fix i1 fp_to_int	Matt Arsenault	2016-07-22	4	-7/+34
	R600's i1 fp_to_uint selected but was incorrect according to what instcombine constant folds to. llvm-svn: 276435
*	AMDGPU: Don't reinvent transferSuccessorsAndUpdatePHIs	Matt Arsenault	2016-07-22	1	-26/+2
	llvm-svn: 276434
*	[AMDGPU] Emit read-only data to .rodata for hsa	Konstantin Zhuravlyov	2016-07-21	1	-1/+2
	Differential Revision: https://reviews.llvm.org/D22538 llvm-svn: 276298
*	AMDGPU/SI: Add support for R_AMDGPU_ABS32	Konstantin Zhuravlyov	2016-07-21	1	-0/+1
	Differential Revision: https://reviews.llvm.org/D21646 llvm-svn: 276294
*	[AMDGPU] Some code cleaning in SIRegisterInfo.td	Sam Kolton	2016-07-21	1	-33/+23
	Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: https://reviews.llvm.org/D22620 llvm-svn: 276274
*	AMDGPU: Fix phis from blocks split due to register indexing	Matt Arsenault	2016-07-21	1	-15/+22
	llvm-svn: 276257
*	AMDGPU: Fix bug causing crash due to invalid opencl version metadata.	Yaxun Liu	2016-07-20	1	-9/+13
	Differential Revision: https://reviews.llvm.org/D22526 llvm-svn: 276119
*	AMDGPU: Change fdiv lowering based on !fpmath metadata	Matt Arsenault	2016-07-19	8	-49/+227
	If 2.5 ulp is acceptable, denormals are not required, and the operation isn't a reciprocal (which will already be handled), replace it with a faster fdiv. Simplify the lowering tests by using per-function subtarget features. llvm-svn: 276051
*	[AMDGPU] Remove spurious line (should've been removed in r276029).	Davide Italiano	2016-07-19	1	-3/+0
	llvm-svn: 276030
*	[AMDGPU] Remove dead code.	Davide Italiano	2016-07-19	1	-25/+0
	LGTM'd by Matt Arsenault. llvm-svn: 276029
*	AMDGPU: Only use legal inline immediates with kill pseudo	Matt Arsenault	2016-07-19	5	-3/+15
	Only whether the value is negative or positive matters, so use a constant that doesn't require an instruction to materialize. These should really just emit the write to exec directly, but for now stick with the kill pseudo-terminator. llvm-svn: 275988
*	AMDGPU/SI: Fix SI scheduler refcount issue	Matt Arsenault	2016-07-19	1	-0/+3
	Without this fix, releaseSuccessors when InOrOutBlock is false could release SUs outside the schedule BasicBlock. Patch by Axel Davy. llvm-svn: 275935
*	AMDGPU: Expand register indexing pseudos in custom inserter	Matt Arsenault	2016-07-19	8	-300/+451
	This is to help move SILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is that the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross-block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. llvm-svn: 275934
*	AMDGPU: Remove pointless dyn_cast_or_null	Matt Arsenault	2016-07-18	1	-4/+3
	This is already cast above, so it is non-null. llvm-svn: 275881
*	AMDGPU: Fix missing switch case warning	Matt Arsenault	2016-07-18	1	-0/+1
	llvm-svn: 275873
*	AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32	Matt Arsenault	2016-07-18	5	-1/+8
	llvm-svn: 275871
*	AMDGPU/R600: Replace barrier intrinsics	Matt Arsenault	2016-07-18	3	-21/+1
	llvm-svn: 275870
*	AMDGPU: Remove dead check in AMDGPUPromoteAlloca	Matt Arsenault	2016-07-18	1	-9/+10
	This is currently only called with GEP users. A direct alloca would only happen with current typed pointers for arrays which are a perverse case. Also fix crashes on 0 x and 1 x arrays. llvm-svn: 275869
*	AMDGPU: Remove dead code and redundant check	Matt Arsenault	2016-07-18	1	-27/+1
	Non-intrinsic calls aren't really handled, and this IntrinsicInst dyn_cast checks for the function for us. llvm-svn: 275868
*	AMDGPU: Disable AMDGPUPromoteAlloca pass for shader calling conventions.	Nicolai Haehnle	2016-07-18	1	-0/+6
	Summary: The work item intrinsics are not available for the shader calling conventions. And even if we did hook them up, most shader stages have some extra restrictions on the amount of available LDS. Reviewers: tstellarAMD, arsenm Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D20728 llvm-svn: 275779
*	Re-commit [AMDGPU] Add metadata for runtime	Yaxun Liu	2016-07-16	3	-0/+371
	Attempting to fix lit test failure on ppc. llvm-svn: 275676
*	AMDGPU: Fix verifier error from partially undef copy	Matt Arsenault	2016-07-15	1	-5/+3
	In this situation: %VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11, %VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use> %VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3 %VGPR4<def> = COPY %VGPR2 The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1, but VGPR4 is defined immediately after this copy. llvm-svn: 275635
*	AMDGPU: Remove brev intrinsic	Matt Arsenault	2016-07-15	2	-6/+0
	llvm-svn: 275620
*	AMDGPU: Fix TargetPrefix for remaining r600 intrinsics	Matt Arsenault	2016-07-15	3	-51/+53
	llvm-svn: 275619
*	AMDGPU: Remove AMDGPU.ldexp	Matt Arsenault	2016-07-15	1	-4/+0
	llvm-svn: 275618
*	AMDGPU: Remove legacy rsq.clamped intrinsic	Matt Arsenault	2016-07-15	4	-15/+7
	Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining. Also fix mismatch with non-IEEE rsq selecting to IEEE rsq. llvm-svn: 275617
*	AMDGPU/R600: Delete dead code.	Matt Arsenault	2016-07-15	2	-58/+1
	Dead or the same as the base implementation. llvm-svn: 275616
*	Revert "[AMDGPU] Add metadata for runtime"	Vitaly Buka	2016-07-15	3	-371/+0
	This reverts commit r275566. llvm-svn: 275599
*	[SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.	Justin Lebar	2016-07-15	3	-67/+41
	Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592
*	[AMDGPU] Add metadata for runtime	Yaxun Liu	2016-07-15	3	-0/+371
	Added emitting metadata to elf for runtime. Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream. Differential Revision: https://reviews.llvm.org/D21849 llvm-svn: 275566
*	Rename AnalyzeBranch* to analyzeBranch*.	Jacques Pienaar	2016-07-15	4	-11/+8
	Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564
*	AMDGPU: Fix not expanding control flow after some kill blocks	Matt Arsenault	2016-07-15	1	-7/+2
	Also stop trying to insert skip blocks at end_cf. This was inserting them at the end of the block which doesn't make sense. The skip should be inserted at the beginning of the block right after the end cf. Just remove this for now since no tests seem to stress this and I think this can be handled more generally later. Fixes bug 28550. llvm-svn: 275510
*	AMDGPU: Fix trying to skip from a block with no successors	Matt Arsenault	2016-07-15	1	-2/+3
	Found while reducing bug 28550. llvm-svn: 275509
*	AMDGPU: Fix splitting kill blocks with defs before kill	Matt Arsenault	2016-07-15	1	-13/+3
	llvm-svn: 275508
*	[AMDGPU] Assembler: fix row_bcast parsing	Sam Kolton	2016-07-14	1	-0/+2
	Summary: This change fixes bug 28538. Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: https://reviews.llvm.org/D22355 llvm-svn: 275422
*	AMDGPU/R600: Delete/rename intrinsics no longer used by mesa	Matt Arsenault	2016-07-14	7	-326/+7
	Use the replacement pass to update the tests, and delete old names. llvm-svn: 275375
*	AMDGPU/R600: Remove intrinsics with no tests and no users	Matt Arsenault	2016-07-14	4	-76/+15
	Mesa removed this path, so nothing is using these anymore. llvm-svn: 275372
*	AMDGPU: Remove unused intrinsics	Matt Arsenault	2016-07-14	2	-12/+0
	llvm-svn: 275371
*	AMDGPU: Remove dead code	Matt Arsenault	2016-07-14	2	-10/+0
	llvm-svn: 275369
*	AMDGPU: Remove last AMDIL intrinsics	Matt Arsenault	2016-07-13	2	-11/+1
	llvm-svn: 275309
*	AMDGPU/SI: Emit the number of SGPR and VGPR spills	Marek Olsak	2016-07-13	5	-0/+30
	Summary: v2: don't count SGPRs spilled to scratch twice. I think this is sufficient. It doesn't count private memory usage, which happens often and uses scratch but isn't technically a spill. The private memory usage can be computed by: [scratch_per_thread - vgpr_spills - a random multiple of SGPR spills]. The fact that SGPR spills add very high numbers to the scratch size makes that computation a guessing game, but I don't have a solution to that. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D22197 llvm-svn: 275288
*	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL	Tom Stellard	2016-07-13	5	-28/+69
	Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21484 llvm-svn: 275268
*	AMDGPU: Fold out no-op kill intrinsics	Matt Arsenault	2016-07-13	1	-0/+8
	llvm-svn: 275253
*	AMDGPU: WQM cleanups	Matt Arsenault	2016-07-13	2	-42/+39
	- Add new TTI instruction checks - Don't use const for blocks that are mutated. - Checking isBranch and isTerminator should be redundant llvm-svn: 275252
*	AMDGPU: Follow up to r275203	Matt Arsenault	2016-07-12	5	-33/+101
	I meant to squash this into it. llvm-svn: 275220
*	AMDGPU: Fix verifier error with kill intrinsic	Matt Arsenault	2016-07-12	1	-65/+122
	Don't create a terminator in the middle of the block. We should probably get rid of this intrinsic. llvm-svn: 275203
*	AMDGPU: Set isConvergent on v_cmpx* instructions	Matt Arsenault	2016-07-12	1	-2/+3
	No test since these aren't used now, except for one place in a pre-emit pass. llvm-svn: 275200
*	AMDGPU: Add LLVM IR Intrinsic for v_lerp_u8	Wei Ding	2016-07-12	1	-0/+4
	Differential Revision: http://reviews.llvm.org/D22239 llvm-svn: 275197
*	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset	Nicolai Haehnle	2016-07-12	3	-34/+9
	Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160
*	AMDGPU: Cleanup pseudoinstructions	Matt Arsenault	2016-07-12	3	-58/+55
	llvm-svn: 275133