bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Don't wait at end of block with a trivial successor	Matt Arsenault	2017-03-08	1	-2/+14
	If there is only one successor, and that successor only has one predecessor, the wait can be delayed until uses or the end of the next block. This avoids code quality regressions when there are trivial fallthrough blocks inserted for structurization. llvm-svn: 297251
*	AMDGPU: Constant fold rcp node	Matt Arsenault	2017-03-08	1	-2/+12
	When doing arcp optimization with a constant denominator, this was leaving behind rcps with constant inputs. llvm-svn: 297248
*	AMDGPU/SI: Do not insert EndCf in an unreachable block	Changpeng Fang	2017-03-07	1	-2/+3
	Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D22025 llvm-svn: 297243
*	Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.	Daniel Sanders	2017-03-07	1	-1/+1
	Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297241
*	Revert r297177: Change LLT constructor string into an LLT-based object ...	Daniel Sanders	2017-03-07	1	-1/+1
	More module problems. This time it only showed up in the stage 2 compile of clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile. Somehow, this change causes the build to need Attributes.gen before it's been generated. llvm-svn: 297188
*	[globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.	Daniel Sanders	2017-03-07	1	-1/+1
	Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297177
*	Revert "AMDGPU: Set MCAsmInfo::PointerSize"	Konstantin Zhuravlyov	2017-03-07	1	-1/+0
	It breaks line tables because the patch is not complete; working on a complete one at the moment. This reverts commit r294031. llvm-svn: 297118
*	AMDGPU/R600: Fix ALU clause markers use detection	Jan Vesely	2017-03-06	1	-2/+5
	Also exit early on kill instead of redefinition. Differential Revision: https://reviews.llvm.org/D30230 llvm-svn: 297060
*	Make TargetInstrInfo::isPredicable take a const reference, NFC	Krzysztof Parzyszek	2017-03-03	2	-3/+3
	llvm-svn: 296901
*	[AMDGPU][MC] Fix for Bug 30829 + LIT tests	Dmitry Preobrazhensky	2017-03-03	7	-0/+163
	Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction). Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code). Added LIT tests. llvm-svn: 296873
*	AMDGPU: Fix missing dominator tree dependency	Matt Arsenault	2017-03-02	1	-0/+1
	llvm-svn: 296842
*	AMDGPU: Fix types for VOP_I16_I16_I16	Matt Arsenault	2017-02-28	1	-1/+1
	llvm-svn: 296523
*	AMDGPU: Add definition for v_swap_b32	Matt Arsenault	2017-02-28	1	-4/+31
	This is somewhat tricky because there are two pairs of tied operands, and it isn't allowed to be VOP3 encoded. llvm-svn: 296519
*	AMDGPU: Add definition for v_xad_u32	Matt Arsenault	2017-02-28	1	-0/+2
	llvm-svn: 296515
*	AMDGPU: Add ds_nop to assembler	Matt Arsenault	2017-02-28	1	-1/+21
	llvm-svn: 296513
*	AMDGPU: Add definitions for ds_{read|write}_b{96|128}	Matt Arsenault	2017-02-28	1	-4/+19
	It's not clear to me if this is always better than doing ds_write2_b64. This adds the constraint of a 128-bit register input instead of a pair of 64-bit registers. llvm-svn: 296512
*	[AMDGPU] Add second pass of the scheduler	Stanislav Mekhanoshin	2017-02-28	2	-7/+126
	If during scheduling we find that we cannot keep the optimistic occupancy, increase the critical register pressure limit and try scheduling the whole function again. In this case blocks with smaller pressure will have a chance for better scheduling. Differential Revision: https://reviews.llvm.org/D30442 llvm-svn: 296506
*	[AMDGPU] New method to estimate register pressure	Stanislav Mekhanoshin	2017-02-28	2	-21/+150
	This change introduces a new method to estimate register pressure in GCNScheduler. The standard RPTracker gives a huge error for the following reasons: 1. It does not account for live-ins or live-outs if the value is not used in the region itself. That creates a huge error in the very common case where there are a lot of live-through registers. 2. It does not properly count subregs. 3. It assumes a register used as an input operand can be reused as an output. This is not always possible by itself, this is not what RA will finally do in many cases for various reasons not limited to RA's inability to do so, and this is not so if the value is actually live-through. In addition, we can now see a clear separation between live-in pressure, which we cannot change with scheduling, and tentative pressure, which we can change. Differential Revision: https://reviews.llvm.org/D30439 llvm-svn: 296491
*	[AMDGPU] Change amd_kernel_code_t's minor version to 1	Konstantin Zhuravlyov	2017-02-28	1	-1/+1
	- We do emit amd_kernel_code_t v1.1 Differential Revision: https://reviews.llvm.org/D30433 llvm-svn: 296489
*	[AMDGPU] Fix read-undef flags when schedule is reverted	Stanislav Mekhanoshin	2017-02-28	1	-12/+15
	If two subregs of the same register are defined and we need to revert a schedule that changed the def order, we will end up with both instructions having def,read-undef flags because adjustLaneLiveness() will only set this flag but will not remove it. Fix this by removing read-undef flags before calling adjustLaneLiveness. Differential Revision: https://reviews.llvm.org/D30428 llvm-svn: 296484
*	Revert r296474 - [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it.	Daniel Sanders	2017-02-28	1	-1/+1
	There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1. llvm-svn: 296478
*	[globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it.	Daniel Sanders	2017-02-28	1	-1/+1
	Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 296474
*	AMDGPU: Use v_med3_{f16|i16|u16}	Matt Arsenault	2017-02-27	7	-33/+52
	llvm-svn: 296401
*	AMDGPU: Support v2i16/v2f16 packed operations	Matt Arsenault	2017-02-27	11	-63/+378
	llvm-svn: 296396
*	AMDGPU: Add some of the new gfx9 VOP3 instructions	Matt Arsenault	2017-02-27	1	-0/+12
	llvm-svn: 296382
*	AMDGPU: Support inlineasm for packed instructions	Matt Arsenault	2017-02-27	1	-1/+42
	Add packed types as legal so they may be used with inlineasm. Keep all operations expanded for now. llvm-svn: 296379
*	AMDGPU: Don't fold immediate if clamp/omod are set	Matt Arsenault	2017-02-27	2	-8/+13
	Doesn't fix any practical problems because clamp/omod are currently folded after the peephole optimizer. llvm-svn: 296375
*	AMDGPU: Fold omod into instructions	Matt Arsenault	2017-02-27	3	-6/+146
	llvm-svn: 296372
*	AMDGPU: Add f16 to shader calling conventions	Matt Arsenault	2017-02-27	1	-3/+3
	Mostly useful for writing tests for f16 features. llvm-svn: 296370
*	AMDGPU: Add VOP3P instruction format	Matt Arsenault	2017-02-27	23	-86/+879
	Add a few instructions that are not VOP3P but are related to packed operations. Includes a hack with dummy operands for the benefit of the assembler. llvm-svn: 296368
*	[AMDGPU] Runtime metadata fixes:	Konstantin Zhuravlyov	2017-02-27	5	-32/+79
	- Verify that runtime metadata is actually valid when assembling; otherwise the assembler could accept the following, but the ocl runtime will reject it: .amdgpu_runtime_metadata { amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ... - Make IsaInfo optional, and always emit it. Differential Revision: https://reviews.llvm.org/D30349 llvm-svn: 296324
*	AMDGPU : Replace FMAD with FMA when denormals are enabled.	Wei Ding	2017-02-24	4	-1/+20
	Differential Revision: http://reviews.llvm.org/D29958 llvm-svn: 296186
*	Revert "Correct register pressure calculation in presence of subregs"	Stanislav Mekhanoshin	2017-02-24	2	-22/+0
	This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. llvm-svn: 296182
*	[AMDGPU] Shut the warning "getRegUnitWeight hides overload...". NFC.	Stanislav Mekhanoshin	2017-02-23	1	-0/+2
	Clang issues a warning about a hidden overload. That was intended, so add "using AMDGPUGenRegisterInfo::getRegUnitWeight;" to mute it. llvm-svn: 296021
*	Correct register pressure calculation in presence of subregs	Stanislav Mekhanoshin	2017-02-23	2	-0/+20
	If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining the operand's lane mask. Differential Revision: https://reviews.llvm.org/D29835 llvm-svn: 296009
*	AMDGPU/SI: Fix trunc i16 pattern	Jan Vesely	2017-02-23	2	-6/+5
	Hit on ASICs that support 16-bit instructions. Differential Revision: https://reviews.llvm.org/D30281 llvm-svn: 295990
*	LoadStoreVectorizer: Split even sized illegal chains properly	Matt Arsenault	2017-02-23	2	-0/+36
	Implement isLegalToVectorizeLoadChain for AMDGPU to avoid producing private address space accesses that will need to be split up later. This was doing the wrong thing in the case where the queried chain was an even number of elements. A possible <4 x i32> store was being split into store <2 x i32>, store i32, store i32 rather than store <2 x i32>, store <2 x i32> when legal. llvm-svn: 295933
*	AMDGPU: Add another BFE pattern	Matt Arsenault	2017-02-23	3	-39/+52
	This is the pattern that falls out of the instruction's definition if offset == 0. llvm-svn: 295912
*	AMDGPU: Use clamp with f64	Matt Arsenault	2017-02-22	3	-7/+11
	llvm-svn: 295908
*	AMDGPU: Fold FP clamp as modifier bit	Matt Arsenault	2017-02-22	6	-6/+89
	The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled. llvm-svn: 295905
*	AMDGPU : Update TrapCode based on Trap Handler ABI.	Wei Ding	2017-02-22	4	-13/+17
	Differential Revision: http://reviews.llvm.org/D30232 llvm-svn: 295904
*	AMDGPU: Add replacement bfe intrinsics	Matt Arsenault	2017-02-22	1	-0/+6
	llvm-svn: 295899
*	AMDGPU: Don't add emergency stack slot if all spills are SGPR->VGPR	Matt Arsenault	2017-02-22	1	-36/+55
	This should avoid reporting any stack needs to be allocated in the case where no stack is truly used. An unused stack slot is still left around in other cases where there are real stack objects but no spilling occurs. llvm-svn: 295891
*	AMDGPU: Don't look at chain users when adjusting writemask	Matt Arsenault	2017-02-22	1	-0/+4
	Fixes the writemask not being adjusted when the new intrinsics with chains are used. llvm-svn: 295878
*	AMDGPU: Always allocate emergency stack slot at offset 0	Matt Arsenault	2017-02-22	1	-5/+19
	This allows us to ensure that 0 is never a valid pointer to a user object, and ensures that the offset is always legal without needing a register to access it. This comes at the cost of usable offsets and wasted stack space. llvm-svn: 295877
*	AMDGPU: Change exp with compr bit printing	Matt Arsenault	2017-02-22	1	-3/+11
	llvm-svn: 295873
*	Revert "AMDGPU : Update TrapCode based on Trap Handler ABI."	Wei Ding	2017-02-22	4	-16/+12
	This reverts commit r295867. llvm-svn: 295871
*	AMDGPU : Update TrapCode based on Trap Handler ABI.	Wei Ding	2017-02-22	4	-12/+16
	Differential Revision: http://reviews.llvm.org/D30232 llvm-svn: 295867
*	AMDGPU: Add cvt.pkrtz intrinsic	Matt Arsenault	2017-02-22	7	-5/+56
	Convert llvm.SI.packf16 test uses llvm-svn: 295797
*	AMDGPU: Remove llvm.AMDGPU.clamp intrinsic	Matt Arsenault	2017-02-21	2	-13/+0
	llvm-svn: 295789