bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] Consider XOR in waterfall loop as a terminator	Scott Linder	2019-02-05	1	-0/+115
	Ensure the XOR in the waterfall loop for indirect addressing is considered a terminator. Differential Revision: https://reviews.llvm.org/D57703 llvm-svn: 353207
*	AMDGPU: Fix assert on trunc from bitcast of build_vector	Matt Arsenault	2019-02-05	1	-2/+18
	The v2i64 argument is lowered to a bitcast of a v4i32 build_vector. This would then attempt to use the i32 element as the source of the vector truncate. It would really need to collect 2 elements from the build_vector to produce the intended truncate. llvm-svn: 353202
*	GlobalISel: Consolidate load/store legalization	Matt Arsenault	2019-02-05	2	-51/+88
	The fewerElementsVectors implementation for load/stores handles the scalar reduction case just as well, so drop the redundant code in narrowScalar. This also introduces support for narrowing irregular size breakdowns for scalars. llvm-svn: 353125
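	As a rough illustration of what an "irregular size breakdown" could look like, the sketch below splits a scalar whose width is not a multiple of the narrow width into full-width parts plus one leftover part. This is only a model on plain integers; the function names `breakdown` and `split_scalar` are hypothetical and are not the LLVM API, which operates on registers and LLT types.

```python
def breakdown(total_bits: int, narrow_bits: int) -> tuple[int, int]:
    """Return (number of full narrow parts, leftover bits)."""
    if total_bits <= 0 or narrow_bits <= 0:
        raise ValueError("bit widths must be positive")
    return divmod(total_bits, narrow_bits)


def split_scalar(value: int, total_bits: int, narrow_bits: int) -> list[tuple[int, int]]:
    """Split value into (piece, width) pairs, least significant piece first."""
    num_parts, leftover = breakdown(total_bits, narrow_bits)
    parts = []
    offset = 0
    for _ in range(num_parts):
        parts.append(((value >> offset) & ((1 << narrow_bits) - 1), narrow_bits))
        offset += narrow_bits
    if leftover:
        # The irregular case: one final piece narrower than narrow_bits.
        parts.append(((value >> offset) & ((1 << leftover) - 1), leftover))
    return parts


def merge_parts(parts: list[tuple[int, int]]) -> int:
    """Reassemble the original value from (piece, width) pairs."""
    value, offset = 0, 0
    for piece, width in parts:
        value |= piece << offset
        offset += width
    return value
```

	For example, a 96-bit scalar narrowed to 64 bits yields one 64-bit part and a 32-bit leftover, and `merge_parts` recovers the original value.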
*	GlobalISel: Implement narrowScalar for select	Matt Arsenault	2019-02-05	1	-2/+143
	Don't handle vector conditions. I think this can be merged in the future with fewerElementsVectorSelect, although this becomes slightly tricky with a vector condition. llvm-svn: 353122
*	GlobalISel: Combine g_extract with g_merge_values	Matt Arsenault	2019-02-04	1	-0/+470
	Try to use the underlying source registers. This enables legalization in more cases where some irregular operations are widened and others narrowed. This seems to make the test_combines_2 AArch64 test worse, since the MERGE_VALUES has multiple uses. Since this should be required for legalization, a hasOneUse check is probably inappropriate (or maybe should only be used if the merge is legal?). llvm-svn: 353121
*	GlobalISel: Enforce operand types for constants	Matt Arsenault	2019-02-04	13	-46/+47
	A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113
*	GlobalISel: Fix not calling observer when legalizing bitcount ops	Matt Arsenault	2019-02-04	5	-20/+41
	This was hiding bugs from never legalizing the source type. llvm-svn: 353102
*	AMDGPU: Don't rematerialize mov with implicit operands	Matt Arsenault	2019-02-04	1	-0/+112
	This was pulling the mov used for register indexing on gfx9 out of the loop. llvm-svn: 353101
*	[AMDGPU] Support emitting GOT relocations for function calls	Scott Linder	2019-02-04	1	-0/+51
	Differential Revision: https://reviews.llvm.org/D57416 llvm-svn: 353083
*	AMDGPU/GlobalISel: Legalize select for v4s16	Matt Arsenault	2019-02-04	1	-1/+206
	Also add some more select tests to help show future legalization changes. llvm-svn: 353045
*	GlobalISel: Implement widenScalar for G_UNMERGE_VALUES	Matt Arsenault	2019-02-03	1	-1/+232
	For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and clean up to make them look more similar. llvm-svn: 352979
*	GlobalISel: Implement widenScalar for G_EXTRACT vector sources	Matt Arsenault	2019-02-02	1	-0/+132
	Handle the basic element extract case. llvm-svn: 352978
*	AMDGPU/GlobalISel: Legalize icmp for pointer types	Matt Arsenault	2019-02-02	1	-0/+175
	llvm-svn: 352976
*	AMDGPU/GlobalISel: Legalize constant for pointer types	Matt Arsenault	2019-02-02	1	-0/+84
	llvm-svn: 352975
*	AMDGPU/GlobalISel: Legalize select for pointer types	Matt Arsenault	2019-02-02	1	-76/+514
	llvm-svn: 352974
*	GlobalISel: Legalization for inttoptr/ptrtoint	Matt Arsenault	2019-02-02	2	-28/+323
	llvm-svn: 352973
*	[AMDGPU] Mark test functions with hidden visibility	Scott Linder	2019-02-01	8	-70/+70
	Prepare for a future patch which affects codegen for calls to preemptible functions. Differential Revision: https://reviews.llvm.org/D57605 llvm-svn: 352920
*	[opaque pointer types] Pass value type to LoadInst creation.	James Y Knight	2019-02-01	1	-16/+12
	This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911
*	[AMDGPU] Fix for vector element insertion	Tim Corringham	2019-02-01	6	-34/+42
	Summary: Incorrect code was generated when lowering insertelement operations for vectors with 8- or 16-bit elements. The value being inserted was not adjusted for the position of the element within the 32-bit word, so only the low element within each 32-bit word could receive the intended value. Fixed by replicating the value to each element of a congruent vector before the mask-and-or operation used to update the intended element. A number of affected LIT tests have been updated accordingly. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Tags: #llvm Differential Revision: https://reviews.llvm.org/D57588 llvm-svn: 352885
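	The bug and the fix described above can be modelled on plain integers. The sketch below is an illustration only, not the code from the patch: `buggy_insert` and `fixed_insert` are hypothetical names, and a single 32-bit word stands in for the packed vector register.

```python
MASK32 = 0xFFFFFFFF


def lane_mask(index: int, elt_bits: int) -> int:
    """Mask selecting element `index` inside a 32-bit word."""
    return ((1 << elt_bits) - 1) << (index * elt_bits)


def buggy_insert(word: int, value: int, index: int, elt_bits: int = 16) -> int:
    """Models the bug: the value stays in the low bits, so masking with a
    higher lane's mask discards it."""
    mask = lane_mask(index, elt_bits)
    return ((word & ~mask) | (value & mask)) & MASK32


def fixed_insert(word: int, value: int, index: int, elt_bits: int = 16) -> int:
    """Models the fix: replicate the value into every lane, then mask and or."""
    elt = value & ((1 << elt_bits) - 1)
    splat = 0
    for lane in range(32 // elt_bits):
        splat |= elt << (lane * elt_bits)
    mask = lane_mask(index, elt_bits)
    return ((word & ~mask) | (splat & mask)) & MASK32
```

	With the replicated value, every lane position already holds the element, so the mask picks out the correct copy regardless of the index. In `buggy_insert` only index 0 works, which matches the symptom in the commit message.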
*	GlobalISel: Handle odd splits in fewerElementsVector for load/store	Matt Arsenault	2019-01-31	2	-123/+248
	llvm-svn: 352720
*	GlobalISel: Implement narrowScalar for bswap	Matt Arsenault	2019-01-31	1	-0/+125
	llvm-svn: 352719
*	GlobalISel: Allow bitcount ops to have different result type	Matt Arsenault	2019-01-31	5	-0/+440
	For AMDGPU the result is always 32-bit for 64-bit inputs. llvm-svn: 352717
*	Add a 'dynamic' parameter to the objectsize intrinsic	Erik Pilkington	2019-01-30	1	-3/+3
	This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged. rdar://32212419 Differential revision: https://reviews.llvm.org/D56761 llvm-svn: 352664
*	GlobalISel: Implement fewerElementsVector for select	Matt Arsenault	2019-01-30	1	-0/+209
	llvm-svn: 352601
*	AMDGPU/GlobalISel: Fix clamping shifts with 16-bit insts	Matt Arsenault	2019-01-30	3	-0/+126
	llvm-svn: 352599
*	GlobalISel: Support narrowScalar for uneven loads	Matt Arsenault	2019-01-30	1	-1/+120
	llvm-svn: 352594
*	GlobalISel: Handle some odd splits in fewerElementsVector	Matt Arsenault	2019-01-30	1	-3/+75
	Also add some quick hacks to AMDGPU legality for the tests. llvm-svn: 352591
*	GlobalISel: Handle more cases for widenScalar for G_STORE	Matt Arsenault	2019-01-30	1	-0/+99
	llvm-svn: 352585
*	GlobalISel: Verify pointer casts	Matt Arsenault	2019-01-29	1	-8/+8
	Not sure whether the old AArch64 tests should just be deleted. llvm-svn: 352562
*	GlobalISel: Partially implement widenScalar for MERGE_VALUES	Matt Arsenault	2019-01-29	1	-0/+156
	llvm-svn: 352560
*	GlobalISel: Fix narrowScalar for load/store with different mem size	Matt Arsenault	2019-01-29	2	-0/+132
	This was ignoring the memory size, and producing multiple loads/stores if the operand size was different from the memory size. I assume this is the intent of not having an explicit G_ANYEXTLOAD (although I think that would probably be better). llvm-svn: 352523
*	[AMDGPU] Fix a weird WWM intrinsic issue.	Neil Henning	2019-01-29	1	-0/+47
	I found a really strange WWM issue through a very convoluted shader that essentially boils down to a bug in SIInstrInfo where canReadVGPR did not correctly identify that WWM is like a copy and can have a VGPR as its source. Differential Revision: https://reviews.llvm.org/D56002 llvm-svn: 352500
*	AMDGPU: Add DS append/consume intrinsics	Matt Arsenault	2019-01-28	2	-0/+250
	Since these pass the pointer in m0 unlike other DS instructions, these need to worry about whether the address is uniform or not. This assumes the address is dynamically uniform, and just uses readfirstlane to get a copy into an SGPR. I don't know if these have the same 16-bit add for the addressing mode offset problem on SI or not, but I've just assumed they do. Also includes some misc. changes to avoid test differences between the LDS and GDS versions. llvm-svn: 352422
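	To show why the uniformity assumption matters, here is a small model of a readfirstlane-style operation on a list of per-lane values. It is a sketch under stated assumptions, not the hardware definition: the function name, the list-based lane model, and the fallback to lane 0 when no lane is active are all assumptions made for illustration.

```python
def readfirstlane(lane_values: list[int], exec_mask: int) -> int:
    """Return the value held by the lowest-numbered active lane.

    lane_values[i] is the per-lane (VGPR-like) value of lane i and bit i of
    exec_mask says whether lane i is active.
    """
    for lane, value in enumerate(lane_values):
        if exec_mask & (1 << lane):
            return value
    # Assumption for this model: with no active lane, fall back to lane 0.
    return lane_values[0]
```

	If every active lane holds the same address, the single scalar result is valid for all of them. If the lanes disagree, only the first active lane's address is used, which is why the commit has to assume the address is dynamically uniform.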
*	[AMDGPU] Add intrinsics for 16 bit interpolation	Tim Corringham	2019-01-28	1	-0/+187
	Summary: Added the intrinsics llvm.amdgcn.interp.p1.f16() and llvm.amdgcn.interp.p2.f16() and a related LIT test. The p1 intrinsic generates code appropriate for both 16- and 32-bank LDS. Reviewers: #amdgpu, dstuttard, arsenm, tpr Reviewed By: #amdgpu, arsenm Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46754 llvm-svn: 352357
*	GlobalISel: Don't reduce elements for atomic load/store	Matt Arsenault	2019-01-27	1	-0/+46
	This is invalid for the same reason as in the narrowScalar handling for load. llvm-svn: 352334
*	GlobalISel: Verify load/store has a pointer input	Matt Arsenault	2019-01-27	13	-38/+38
	I expected this to be automatically verified, but it seems nothing makes use of the fact that the type index was declared as a "ptype". llvm-svn: 352319
*	GlobalISel: Implement narrowScalar for mul	Matt Arsenault	2019-01-27	1	-0/+26
	llvm-svn: 352300
*	GlobalISel: fewerElementsVector for intrinsic_trunc/intrinsic_round	Matt Arsenault	2019-01-27	2	-8/+92
	llvm-svn: 352298
*	AMDGPU/GlobalISel: Legalize more bit ops	Matt Arsenault	2019-01-26	3	-24/+567
	llvm-svn: 352295
*	AMDGPU/GlobalISel: Widen small uaddo/usubo	Matt Arsenault	2019-01-26	2	-0/+194
	llvm-svn: 352294
*	[MBP] Don't move bottom block before header if it can't reduce taken branches	Guozhi Wei	2019-01-25	1	-9/+9
	If the bottom block BB has only one successor, OldTop, it is in most cases profitable to move BB before OldTop, except in the following case:
	    -->OldTop<-
	    |    .    |
	    |    .    |
	    |    .    |
	    ---Pred   |
	         |    |
	         BB-----
	Moving BB before OldTop cannot reduce the number of taken branches here, so this patch detects this case and prevents the move. Differential Revision: https://reviews.llvm.org/D57067 llvm-svn: 352236
*	AMDGPU/GlobalISel: Scalarize add/sub	Matt Arsenault	2019-01-25	2	-2/+64
	llvm-svn: 352167
*	GlobalISel: fewerElementsVector for more cast types	Matt Arsenault	2019-01-25	4	-0/+151
	llvm-svn: 352166
*	GlobalISel: fewerElementsVector for a few more trivial ops	Matt Arsenault	2019-01-25	6	-0/+332
	llvm-svn: 352165
*	AMDGPU/GlobalISel: Legalize smulh/umulh and scalarize mul	Matt Arsenault	2019-01-25	5	-2/+238
	llvm-svn: 352162
*	GlobalISel: Support fewerElementsVector for icmp/fcmp	Matt Arsenault	2019-01-25	2	-13/+285
	Also legalize 64-bit compares for AMDGPU. llvm-svn: 352157
*	GlobalISel: Implement fewerElementsVector for extensions	Matt Arsenault	2019-01-25	5	-7/+483
	llvm-svn: 352155
*	[GISel]: Change how CSE is enabled by default for each pass	Aditya Nandakumar	2019-01-24	1	-1/+1
	https://reviews.llvm.org/D57178 Add a hook in TargetPassConfig to query whether CSE needs to be enabled. By default this hook returns false only for the O0 opt level, but this can be overridden by the target. Because CSE is now enabled by default for non-O0, a few tests needed to be updated to not use CSE (by passing -O0 on the run line). Reviewed by: arsenm llvm-svn: 352126
*	RegBankSelect: Support some more complex part mappings	Matt Arsenault	2019-01-24	1	-0/+386
	llvm-svn: 352123
*	[AMDGPU] With XNACK, cannot clause a load with result coalesced with operand	Tim Renouf	2019-01-23	1	-0/+48
	Summary: With XNACK, an smem load whose result is coalesced with an operand (thus it overwrites its own operand) cannot appear in a clause, because some other instruction might XNACK and restart the whole clause. The clause breaker already realized that an smem that overwrites an operand cannot appear in a clause, and broke the clause. The problem that this commit fixes is that the SIFormMemoryClauses optimization formed a bundle with early clobber, which caused the earlier code that set up the coalesced operand to be removed as dead. Differential Revision: https://reviews.llvm.org/D57008 Change-Id: I703c4d5b0bf7d6060222bec491f45c18bb3c0016 llvm-svn: 351950