bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU : Replace FMAD with FMA when denormals are enabled.	Wei Ding	2017-02-24	4	-1/+20
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D29958 llvm-svn: 296186
*	Revert "Correct register pressure calculation in presence of subregs"	Stanislav Mekhanoshin	2017-02-24	2	-22/+0
\| \| \| \| \| \| \| \|	This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. llvm-svn: 296182
*	[AMDGPU] Shut the warning "getRegUnitWeight hides overload...". NFC.	Stanislav Mekhanoshin	2017-02-23	1	-0/+2
\| \| \| \| \| \| \|	Clang issues warning about hidden overload. That was intended, so add "using AMDGPUGenRegisterInfo::getRegUnitWeight;" to mute it. llvm-svn: 296021
*	Correct register pressure calculation in presence of subregs	Stanislav Mekhanoshin	2017-02-23	2	-0/+20
\| \| \| \| \| \| \| \| \| \|	If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining the operand's lane mask. Differential Revision: https://reviews.llvm.org/D29835 llvm-svn: 296009
*	AMDGPU/SI: Fix trunc i16 pattern	Jan Vesely	2017-02-23	2	-6/+5
\| \| \| \| \| \| \| \|	Hit on ASICs that support 16bit instructions. Differential Revision: https://reviews.llvm.org/D30281 llvm-svn: 295990
*	LoadStoreVectorizer: Split even sized illegal chains properly	Matt Arsenault	2017-02-23	2	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement isLegalToVectorizeLoadChain for AMDGPU to avoid producing private address space accesses that will need to be split up later. This was doing the wrong thing in the case where the queried chain was an even number of elements. A possible <4 x i32> store was being split into (store <2 x i32>, store i32, store i32) rather than (store <2 x i32>, store <2 x i32>) when legal. llvm-svn: 295933
*	AMDGPU: Add another BFE pattern	Matt Arsenault	2017-02-23	3	-39/+52
\| \| \| \| \| \| \|	This is the pattern that falls out of the instruction's definition if offset == 0. llvm-svn: 295912
*	AMDGPU: Use clamp with f64	Matt Arsenault	2017-02-22	3	-7/+11
\| \| \| \|	llvm-svn: 295908
*	AMDGPU: Fold FP clamp as modifier bit	Matt Arsenault	2017-02-22	6	-6/+89
\| \| \| \| \| \| \| \| \| \| \|	The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled. llvm-svn: 295905
*	AMDGPU : Update TrapCode based on Trap Handler ABI.	Wei Ding	2017-02-22	4	-13/+17
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D30232 llvm-svn: 295904
*	AMDGPU: Add replacement bfe intrinsics	Matt Arsenault	2017-02-22	1	-0/+6
\| \| \| \|	llvm-svn: 295899
*	AMDGPU: Don't add emergency stack slot if all spills are SGPR->VGPR	Matt Arsenault	2017-02-22	1	-36/+55
\| \| \| \| \| \| \| \| \|	This should avoid reporting any stack needs to be allocated in the case where no stack is truly used. An unused stack slot is still left around in other cases where there are real stack objects but no spilling occurs. llvm-svn: 295891
*	AMDGPU: Don't look at chain users when adjusting writemask	Matt Arsenault	2017-02-22	1	-0/+4
\| \| \| \| \| \|	Fixes the writemask not being adjusted when using new intrinsics with chains. llvm-svn: 295878
*	AMDGPU: Always allocate emergency stack slot at offset 0	Matt Arsenault	2017-02-22	1	-5/+19
\| \| \| \| \| \| \| \| \|	This allows us to ensure that 0 is never a valid pointer to a user object, and ensures that the offset is always legal without needing a register to access it. This comes at the cost of usable offsets and wasted stack space. llvm-svn: 295877
*	AMDGPU: Change exp with compr bit printing	Matt Arsenault	2017-02-22	1	-3/+11
\| \| \| \|	llvm-svn: 295873
*	Revert "AMDGPU : Update TrapCode based on Trap Handler ABI."	Wei Ding	2017-02-22	4	-16/+12
\| \| \| \| \| \|	This reverts commit r295867. llvm-svn: 295871
*	AMDGPU : Update TrapCode based on Trap Handler ABI.	Wei Ding	2017-02-22	4	-12/+16
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D30232 llvm-svn: 295867
*	AMDGPU: Add cvt.pkrtz intrinsic	Matt Arsenault	2017-02-22	7	-5/+56
\| \| \| \| \| \|	Convert llvm.SI.packf16 test uses llvm-svn: 295797
*	AMDGPU: Remove llvm.AMDGPU.clamp intrinsic	Matt Arsenault	2017-02-21	2	-13/+0
\| \| \| \|	llvm-svn: 295789
*	AMDGPU: Redefine clamp node as clamp 0.0-1.0	Matt Arsenault	2017-02-21	12	-29/+163
\| \| \| \| \| \| \| \| \| \| \|	Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp. llvm-svn: 295788
*	AMDGPU: Formatting fixes	Matt Arsenault	2017-02-21	1	-4/+5
\| \| \| \|	llvm-svn: 295783
*	AMDGPU: Remove llvm.AMDGPU.flbit intrinsic	Matt Arsenault	2017-02-21	2	-4/+0
\| \| \| \|	llvm-svn: 295754
*	AMDGPU: Don't use stack space for SGPR->VGPR spills	Matt Arsenault	2017-02-21	8	-90/+225
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after. I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later. The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted. llvm-svn: 295753
*	AMDGPU: Fix assembler subtarget predicate for gfx9	Matt Arsenault	2017-02-18	3	-1/+13
\| \| \| \| \| \|	This was accepting GFX9 instructions on VI. llvm-svn: 295557
*	AMDGPU: Fix disassembly of aperture registers	Matt Arsenault	2017-02-18	1	-0/+5
\| \| \| \|	llvm-svn: 295555
*	AMDGPU: Merge initial gfx9 support	Matt Arsenault	2017-02-18	18	-41/+239
\| \| \| \|	llvm-svn: 295554
*	AMDGPU/R600: Assert on infinite loop in EmitClauseMarkers	Jan Vesely	2017-02-18	1	-3/+5
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D29792 llvm-svn: 295539
*	AMDGPU: Fix crashes on invalid icmp/fcmp intrinsics	Matt Arsenault	2017-02-17	1	-5/+9
\| \| \| \|	llvm-svn: 295489
*	AMDGPU: Remove llvm.AMDGPU.cube intrinsic	Matt Arsenault	2017-02-16	3	-25/+1
\| \| \| \|	llvm-svn: 295359
*	AMDGPU: Remove llvm.AMDGPU.rsq intrinsic	Matt Arsenault	2017-02-16	2	-6/+0
\| \| \| \|	llvm-svn: 295358
*	AMDGPU: Remove llvm.SI.sendmsg	Matt Arsenault	2017-02-16	2	-6/+3
\| \| \| \|	llvm-svn: 295270
*	AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics	Matt Arsenault	2017-02-16	3	-50/+3
\| \| \| \| \| \|	Update test uses with expansion in terms of new intrinsics. llvm-svn: 295269
*	AMDGPU: Remove dead node definitions	Matt Arsenault	2017-02-15	1	-10/+0
\| \| \| \|	llvm-svn: 295247
*	AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests	Matt Arsenault	2017-02-15	1	-7/+4
\| \| \| \|	llvm-svn: 295244
*	AMDGPU: Replace assert with report_fatal_error	Matt Arsenault	2017-02-15	1	-1/+2
\| \| \| \| \| \|	Also use a more refined condition. llvm-svn: 295239
*	[AMDGPU] Revert failed scheduling	Stanislav Mekhanoshin	2017-02-15	3	-37/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reverts a region's scheduling to the original untouched state if we have decreased occupancy. In addition it switches to use the TargetRegisterInfo occupancy callback for pressure limits instead of gradually increasing limits which were just passed by. We are going to stay with the best schedule so we do not need to tolerate worsened scheduling anymore. Differential Revision: https://reviews.llvm.org/D29971 llvm-svn: 295206
*	[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups	Stanislav Mekhanoshin	2017-02-15	1	-1/+5
\| \| \| \| \| \| \| \| \| \|	This patch corrects the maximum workgroups per CU if we have big workgroups (more than 128). This calculation contributes to the occupancy calculation with respect to LDS size. Differential Revision: https://reviews.llvm.org/D29974 llvm-svn: 295134
*	Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track"	Alexander Timofeev	2017-02-14	2	-3/+4
\| \| \| \| \| \|	This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054
*	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-02-14	2	-15/+33
\| \| \| \| \| \| \| \|	Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009
*	AMDGPU::expandMemIntrinsicUses(): Fix an uninitialized variable. This function returned true or undef.	NAKAMURA Takumi	2017-02-12	1	-1/+1
\| \| \| \| \| \|	llvm-svn: 294895
*	AMDGPU: Fix trailing whitespace	Matt Arsenault	2017-02-10	5	-14/+13
\| \| \| \|	llvm-svn: 294694
*	AMDGPU : Add trap handler support.	Wei Ding	2017-02-10	10	-24/+99
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D26010 llvm-svn: 294692
*	[AMDGPU] Override PSet for M0	Stanislav Mekhanoshin	2017-02-10	2	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \|	This change returns an empty PSet list for the M0 register. Otherwise its PSet as defined by tablegen is SReg_32. This results in incorrect register pressure calculation every time an instruction uses M0. Such uses count as SReg_32 PSet and incorrectly increase pressure on SGPRs. Differential Revision: https://reviews.llvm.org/D29798 llvm-svn: 294691
*	AMDGPU: Add pass to expand memcpy/memmove/memset	Matt Arsenault	2017-02-09	5	-4/+136
\| \| \| \|	llvm-svn: 294635
*	[AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement	Konstantin Zhuravlyov	2017-02-09	2	-68/+31
\| \| \| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D29741 llvm-svn: 294627
*	Drop graph_ prefix	Daniel Berlin	2017-02-09	1	-2/+2
\| \| \| \|	llvm-svn: 294621
*	GraphTraits: Add range versions of graph traits functions (graph_nodes, graph_children, inverse_graph_nodes, inverse_graph_children).	Daniel Berlin	2017-02-09	1	-12/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Convert all obvious node_begin/node_end and child_begin/child_end pairs to range based for. Sending for review in case someone has a good idea how to make graph_children able to be inferred. It looks like it would require changing GraphTraits to be two argument or something. I presume inference does not happen because it would have to check every GraphTraits in the world to see if the noderef types matched. Note: This change was 3-staged with clang as well, which uses Dominators/etc from LLVM. Reviewers: chandlerc, tstellarAMD, dblaikie, rsmith Subscribers: arsenm, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D29767 llvm-svn: 294620
*	[AMDGPU] Implement register pressure callbacks	Stanislav Mekhanoshin	2017-02-08	2	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement getRegPressureLimit and getRegPressureSetLimit callbacks in SIRegisterInfo. This makes the standard converge scheduler behave almost the same as GCNScheduler, sometimes slightly better, sometimes a bit worse. In general it is also possible to switch GCNScheduler to use these callbacks instead of getMaxWaves(), which also makes GCNScheduler slightly better on some tests and slightly worse on others. A big win is the behavior with the converge scheduler. Note that these are used not only by scheduling, but also in places like MachineLICM. Differential Revision: https://reviews.llvm.org/D29700 llvm-svn: 294518
*	[AMDGPU][NFC] Assign IsaInfo to reference variable in order to shorten long lines	Konstantin Zhuravlyov	2017-02-08	1	-16/+13
\| \| \| \| \| \|	llvm-svn: 294454
*	[AMDGPU] Add target information that is required by tools to metadata	Konstantin Zhuravlyov	2017-02-08	13	-260/+616
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D28760#fb670e28 llvm-svn: 294449