bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[NVPTX] Reduce amount of boilerplate code used to select load instruction opcode.	Artem Belevich	2017-03-02	1	-1781/+587
	Make opcode selection code for the load instruction a bit easier to read and maintain. This patch also catches a number of f16 load/store variants that were not handled before. Differential Revision: https://reviews.llvm.org/D30513 llvm-svn: 296785
*	[NVPTX] Added support for .f16x2 instructions.	Artem Belevich	2017-02-23	1	-26/+320
	This patch enables support for .f16x2 operations. Added new register type Float16x2. Added support for .f16x2 instructions. Added handling of vectorized loads/stores of v2f16 values. Differential Revision: https://reviews.llvm.org/D30057 Differential Revision: https://reviews.llvm.org/D30310 llvm-svn: 296032
*	[NVPTX] Move getDivF32Level, usePrecSqrtF32, and useF32FTZ out of DAGToDAG and into TargetLowering.	Justin Lebar	2017-01-21	1	-46/+5
	Summary: DAGToDAG has access to TargetLowering, but not vice versa, so this is the more general location for these functions. NFC Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28795 llvm-svn: 292693
*	[NVPTX] Added support for half-precision floating point.	Artem Belevich	2017-01-13	1	-3/+68
	Only scalar half-precision operations are supported at the moment.
	- Adds general support for 'half' type in NVPTX.
	- fp16 math operations are supported on sm_53+ GPUs only (can be disabled with --nvptx-no-f16-math).
	- Type conversions to/from fp16 are supported on all GPU variants.
	- On GPU variants that do not have full fp16 support (or if it's disabled), fp16 operations are promoted to fp32 and results are converted back to fp16 for storage.
	Differential Revision: https://reviews.llvm.org/D28540 llvm-svn: 291956
*	[NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed.	Artem Belevich	2017-01-13	1	-0/+5
	Previously we'd always lower @llvm.{sin,cos}.f32 to {sin,cos}.approx.f32 instruction even when unsafe FP math was not allowed. Clang-generated IR is not affected by this as it uses precise sin/cos from CUDA's libdevice when unsafe math is disabled. Differential Revision: https://reviews.llvm.org/D28619 llvm-svn: 291936
*	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-7/+6
	llvm-svn: 281493
*	[NVPTX] Use ldg for explicitly invariant loads.	Justin Lebar	2016-09-11	1	-13/+22
	Summary: With this change (plus some changes to prevent !invariant from being clobbered within llvm), clang will be able to model the __ldg CUDA builtin as an invariant load, rather than as a target-specific llvm intrinsic. This will let the optimizer play with these loads -- specifically, we should be able to vectorize them in the load-store vectorizer. Reviewers: tra Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc Differential Revision: https://reviews.llvm.org/D23477 llvm-svn: 281152
*	[NVPTX] Improve lowering of byval args of device functions.	Artem Belevich	2016-07-20	1	-5/+6
	Avoid unnecessary spills of byval arguments of device functions to local space on SASS level and subsequent pointer conversion to generic address space that follows. Instead, make a local copy in IR, provide a way to access arguments directly, and let LLVM optimize the copy away when possible. Differential Review: https://reviews.llvm.org/D21421 llvm-svn: 276153
*	Revert r273313 "[NVPTX] Improve lowering of byval args of device functions."	Artem Belevich	2016-06-29	1	-52/+0
	The change causes llvm crash in some unoptimized builds. llvm-svn: 274163
*	[NVPTX] Improve lowering of byval args of device functions.	Artem Belevich	2016-06-21	1	-0/+52
	Avoid unnecessary spills of such vars to local space on SASS level and pointer space conversion. Instead, make a local copy with appropriate addrspacecasts and let LLVM optimize them away when possible. This allows loading value of the argument using [symbol+offset] instead of converting argument to general space pointer and using it for indexing (which also implicitly converts param space pointer to local space one on SASS level and triggers copying of argument into local space in the process). This reduces call overhead, uses less registers and reduces overall SASS size by 2-4%. Differential Review: http://reviews.llvm.org/D21421 llvm-svn: 273313
*	SDAG: Implement Select instead of SelectImpl in NVPTXDAGToDAGISel	Justin Bogner	2016-05-13	1	-197/+220
	- Where we were returning a node before, call ReplaceNode instead.
	- Where we would return null to fall back to another selector, rename the method to try* and return a bool for success.
	Part of llvm.org/pr26808. llvm-svn: 269483
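The try* convention described in this entry can be sketched outside LLVM as follows. This is only a toy model under stated assumptions: `Node`, `Opcode`, `ToySelector`, `selectDefault` and `selectName` are hypothetical names made up for illustration and are not the SelectionDAG types or the actual NVPTX code. It shows a `Select` that returns void and rewrites the node in place, with helpers that report success as a bool so the caller can fall through to the next selector.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; these are not LLVM's SDNode types.
enum class Opcode { Load, Store, Add };

struct Node {
  Opcode Op;
  std::string Selected; // name of the chosen instruction, empty until selected
};

class ToySelector {
public:
  // Select returns void: the node is rewritten in place instead of a
  // replacement being returned to the caller.
  void Select(Node &N) {
    if (tryLoad(N))
      return;
    if (tryStore(N))
      return;
    selectDefault(N); // fallback, standing in for a generated matcher
  }

private:
  // Each try* helper returns true only if it handled the node.
  bool tryLoad(Node &N) {
    if (N.Op != Opcode::Load)
      return false;
    N.Selected = "LD";
    return true;
  }

  bool tryStore(Node &N) {
    if (N.Op != Opcode::Store)
      return false;
    N.Selected = "ST";
    return true;
  }

  void selectDefault(Node &N) { N.Selected = "GENERIC"; }
};

// Convenience wrapper: select a fresh node and report what was chosen.
std::string selectName(Opcode Op) {
  Node N{Op, ""};
  ToySelector S;
  S.Select(N);
  assert(!N.Selected.empty() && "every node must end up selected");
  return N.Selected;
}
```

With this shape, a helper that does not recognise a node simply returns false, so no null return value is needed to signal "try the next selector".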
*	SDAG: Rename Select->SelectImpl and repurpose Select as returning void	Justin Bogner	2016-05-05	1	-1/+1
	This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693
*	[NVPTX] Fix sign/zero-extending ldg/ldu instruction selection	Justin Holewinski	2016-05-02	1	-48/+74
	Summary: We don't have sign-/zero-extending ldg/ldu instructions defined, so we need to emulate them with explicit CVTs. We were originally handling the i8 case, but not any other cases. Fixes PR26185 Reviewers: jingyue, jlebar Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D19615 llvm-svn: 268272
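The idea of emulating an extending load with a plain load followed by an explicit conversion can be illustrated in ordinary C++. This is a minimal sketch, not the backend code: `loadRaw16`, `loadZExt16` and `loadSExt16` are hypothetical helper names, the raw load merely stands in for a non-extending ldg/ldu, and the casts stand in for the CVT step.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: a plain, non-extending 16-bit load.
inline uint16_t loadRaw16(const void *Ptr) {
  uint16_t Bits;
  std::memcpy(&Bits, Ptr, sizeof(Bits));
  return Bits;
}

// Zero-extending load = raw load followed by an unsigned widening conversion.
inline uint32_t loadZExt16(const void *Ptr) {
  return static_cast<uint32_t>(loadRaw16(Ptr));
}

// Sign-extending load = raw load followed by a signed widening conversion.
inline int32_t loadSExt16(const void *Ptr) {
  return static_cast<int32_t>(static_cast<int16_t>(loadRaw16(Ptr)));
}
```

The same two-step split applies to any narrow width, which is why handling only the i8 case left the other widths uncovered.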
*	[NVPTX] Handle ldg created from sign-/zero-extended load	Justin Holewinski	2016-04-05	1	-4/+81
	Reviewers: jingyue Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D18053 llvm-svn: 265389
*	[NVPTX] Use LDG for pointer induction variables.	Bjarke Hammersholt Roune	2015-08-05	1	-10/+29
	More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables. Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads. llvm-svn: 244166
*	De-constify pointers to Type since they can't be modified. NFC	Craig Topper	2015-08-01	1	-2/+2
	This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
*	[NVPTX] make load on global readonly memory to use ldg	Jingyue Wu	2015-07-20	1	-0/+36
	Summary: As described in [1], ld.global.nc may be used to load memory by nvcc when __restrict__ is used and compiler can detect whether read-only data cache is safe to use. This patch will try to check whether ldg is safe to use and use them to replace ld.global when possible. This change can improve the performance by 18~29% on affected kernels (ratt_kernel and rwdot_kernel) in S3D benchmark of shoc [2]. Patched by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 llvm-svn: 242713
*	[NVPTX] roll forward r239082	Jingyue Wu	2015-06-04	1	-0/+4
	NVPTXISelDAGToDAG translates "addrspacecast to param" to NVPTX::nvvm_ptr_gen_to_param. Added an llc test in bug21465. llvm-svn: 239100
*	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"	Sergey Dmitrouk	2015-04-28	1	-64/+66
	[DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989
*	Revert "[DebugInfo] Add debug locations to constant SD nodes"	Daniel Jasper	2015-04-28	1	-66/+64
	This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
*	[DebugInfo] Add debug locations to constant SD nodes	Sergey Dmitrouk	2015-04-28	1	-64/+66
	This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977
*	Simplify boolean expressions with true and false using clang-tidy	Eli Bendersky	2015-03-23	1	-4/+1
	Patch by Richard (legalize@xmission.com) Differential Revision: http://reviews.llvm.org/D8521 llvm-svn: 232961
*	Recommit r232027 with PR22883 fixed: Add infrastructure for support of multiple memory constraints.	Daniel Sanders	2015-03-13	1	-3/+3
	The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. PR22883 was caused by the matching operands copying the whole of the operand flags for the matched operand. This included the constraint id which needed to be replaced with the operand number. This has been fixed with a conversion function. Following on from this, matching operands also used the operand number as the constraint id. This has been fixed by looking up the matched operand and taking it from there. llvm-svn: 232165
*	Revert "r232027 - Add infrastructure for support of multiple memory constraints"	Hal Finkel	2015-03-12	1	-3/+3
	This (r232027) has caused PR22883; so it seems those bits might be used by something else after all. Reverting until we can figure out what else to do. Original commit message: The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. llvm-svn: 232093
*	Add infrastructure for support of multiple memory constraints.	Daniel Sanders	2015-03-12	1	-3/+3
	Summary: The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8171 llvm-svn: 232027
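Packing a small ID into otherwise unused bits of a flag word, as this entry describes, might look roughly like the sketch below. The bit positions and every name here (`KindMask`, `ConstraintShift`, `ConstraintMask`, `KindMem`, `ConstraintM`, `makeMemFlagWord`, and the two getters) are assumptions chosen for illustration; the commit message only states that the ID is 15 bits wide, and the real layout and constants live in LLVM's InlineAsm header.

```cpp
#include <cstdint>

// Illustrative layout only (an assumption, not copied from LLVM headers):
//   bits 0..2    operand kind
//   bits 16..30  15-bit memory constraint ID
constexpr unsigned KindMask = 0x7;
constexpr unsigned ConstraintShift = 16;
constexpr unsigned ConstraintMask = 0x7FFF; // 15 bits

// Hypothetical values standing in for Kind_Mem and Constraint_m.
constexpr unsigned KindMem = 4;
constexpr unsigned ConstraintM = 1;

// Build a flag word that carries both the kind and the constraint ID.
constexpr uint32_t makeMemFlagWord(unsigned Kind, unsigned ConstraintID) {
  return (Kind & KindMask) |
         ((ConstraintID & ConstraintMask) << ConstraintShift);
}

// Read the operand kind back out of the low bits.
constexpr unsigned getKind(uint32_t Flags) { return Flags & KindMask; }

// Read the 15-bit constraint ID back out of its field.
constexpr unsigned getConstraintID(uint32_t Flags) {
  return (Flags >> ConstraintShift) & ConstraintMask;
}
```

Because the two fields do not overlap, code that only inspects the kind keeps working unchanged, which matches the "no functional change" claim when every constraint maps to the same ID.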
*	Remove all use of is64bit off of NVPTXSubtarget and clean up code accordingly.	Eric Christopher	2015-02-19	1	-54/+41
	This changes the constructors of a number of classes that don't need to know the subtarget's 64-bitness. llvm-svn: 229787
*	NVPTX: Canonicalize access to function attributes, NFC	Duncan P. N. Exon Smith	2015-02-14	1	-3/+1
	Canonicalize access to function attributes to use the simpler API.
	getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind)
	getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind)
	llvm-svn: 229260
*	MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros	Benjamin Kramer	2015-02-12	1	-3/+3
	Update all callers. llvm-svn: 228930
*	Remove unused argument.	Eric Christopher	2015-01-30	1	-6/+5
	llvm-svn: 227539
*	Migrate NVPTXISelDAGToDAG's getSubtarget to a runOnMachineFunction version.	Eric Christopher	2015-01-30	1	-31/+35
	Update NVPTXInstrInfo accordingly. llvm-svn: 227538
*	[NVPTX] Remove MemIntrinsicSDNode/MemSDNode duplicate checking	Hal Finkel	2014-08-13	1	-7/+0
	As of r214452, isa<MemSDNode> will return true for nodes for which isa<MemIntrinsicSDNode> will return true (classof now respects the actual class hierarchy). So we no longer need to check for both MemIntrinsicSDNode and MemSDNode separately. No functionality change intended. llvm-svn: 215523
*	Fix typos:	Sylvestre Ledru	2014-08-11	1	-1/+1
	* libaries => libraries
	* avaiable => available
	llvm-svn: 215366
*	[NVPTX] Silence a GCC warning found by the buildbots	Justin Holewinski	2014-07-23	1	-1/+1
	The cast to NVPTXTargetLowering was missing a 'const', but let's just access the right pointer through the subtarget anyway. llvm-svn: 213793
*	[NVPTX] Improve handling of FP fusion	Justin Holewinski	2014-07-17	1	-19/+5
	We now consider the FPOpFusion flag when determining whether to fuse ops. We also explicitly emit add.rn when fusion is disabled to prevent ptxas from fusing the operations on its own. llvm-svn: 213287
*	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch	Justin Holewinski	2014-07-17	1	-82/+1574
	This also uses TSFlags to mark machine instructions that are surface/texture accesses, as well as the vector width for surface operations. This is used to simplify some of the switch statements that need to detect surface/texture instructions. llvm-svn: 213256
*	[NVPTX] Fix handling of ldg/ldu intrinsics.	Justin Holewinski	2014-06-27	1	-6/+299
	The address space of the pointer must be global (1) for these intrinsics. There must also be alignment metadata attached to the intrinsic calls, e.g.
	%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0
	!0 = metadata !{i32 4}
	llvm-svn: 211939
*	[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns	Justin Holewinski	2014-06-27	1	-5/+8
	This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer. llvm-svn: 211935
*	[NVPTX] Add isel patterns for bit-field extract (bfe)	Justin Holewinski	2014-06-27	1	-0/+214
	llvm-svn: 211932
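For readers unfamiliar with bit-field extract, the operation can be modelled in plain C++ as below. This is a reference sketch of the unsigned 32-bit case as I understand the PTX bfe.u32 semantics (start position and length taken from the low 8 bits of their operands, result zero-filled); `bfeU32` is a hypothetical name, and the sketch says nothing about which DAG shapes the isel patterns in this commit actually match.

```cpp
#include <cstdint>

// Reference model of an unsigned 32-bit bit-field extract:
// return Len bits of Value starting at bit Pos, zero-extended.
inline uint32_t bfeU32(uint32_t Value, uint32_t Pos, uint32_t Len) {
  Pos &= 0xFF; // only the low 8 bits of each operand are significant
  Len &= 0xFF;
  if (Len == 0 || Pos > 31)
    return 0; // nothing to extract, or the field starts past the top bit
  uint32_t Shifted = Value >> Pos;
  if (Len >= 32)
    return Shifted; // field covers everything that is left
  return Shifted & ((1u << Len) - 1u);
}
```

In source code the same computation usually appears as a shift followed by a mask, for example `(x >> 8) & 0xFF`.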
*	Use makeArrayRef instead of calling ArrayRef<T> constructor directly. I introduced most of these recently.	Craig Topper	2014-04-30	1	-6/+3
	llvm-svn: 207616
*	[C++] Use 'nullptr'. Target edition.	Craig Topper	2014-04-25	1	-109/+109
	llvm-svn: 207197
*	[Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, lib/Target/... edition.	Chandler Carruth	2014-04-22	1	-1/+2
	llvm-svn: 206842
*	[Modules] Consolidate the DEBUG_TYPE defines in NVPTX to the top of the cpp file rather than in the header and then again in the cpp file.	Chandler Carruth	2014-04-21	1	-3/+1
	llvm-svn: 206778
*	Convert SelectionDAG::getVTList to use ArrayRef	Craig Topper	2014-04-16	1	-1/+1
	llvm-svn: 206357
*	Break PseudoSourceValue out of the Value hierarchy.	Nick Lewycky	2014-04-15	1	-3/+7
	It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead. llvm-svn: 206255
*	[NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces	Justin Holewinski	2014-04-09	1	-0/+592
	This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later). llvm-svn: 205907
*	[NVPTX] Add isel patterns for addrspacecast	Justin Holewinski	2014-03-24	1	-0/+63
	llvm-svn: 204600
*	remove a bunch of unused private methods	Nuno Lopes	2014-03-23	1	-21/+0
	found with a smarter version of -Wunused-member-function that I'm playing with. Apologies in advance if I removed someone's WIP code.
	include/llvm/CodeGen/MachineSSAUpdater.h | 1
	include/llvm/IR/DebugInfo.h | 3
	lib/CodeGen/MachineSSAUpdater.cpp | 10 --
	lib/CodeGen/PostRASchedulerList.cpp | 1
	lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 10 --
	lib/IR/DebugInfo.cpp | 12 --
	lib/MC/MCAsmStreamer.cpp | 2
	lib/Support/YAMLParser.cpp | 39 ---------
	lib/TableGen/TGParser.cpp | 16 ---
	lib/TableGen/TGParser.h | 1
	lib/Target/AArch64/AArch64TargetTransformInfo.cpp | 9 --
	lib/Target/ARM/ARMCodeEmitter.cpp | 12 --
	lib/Target/ARM/ARMFastISel.cpp | 84 --------------------
	lib/Target/Mips/MipsCodeEmitter.cpp | 11 --
	lib/Target/Mips/MipsConstantIslandPass.cpp | 12 --
	lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp | 21 -----
	lib/Target/NVPTX/NVPTXISelDAGToDAG.h | 2
	lib/Target/PowerPC/PPCFastISel.cpp | 1
	lib/Transforms/Instrumentation/AddressSanitizer.cpp | 2
	lib/Transforms/Instrumentation/BoundsChecking.cpp | 2
	lib/Transforms/Instrumentation/MemorySanitizer.cpp | 1
	lib/Transforms/Scalar/LoopIdiomRecognize.cpp | 8 -
	lib/Transforms/Scalar/SCCP.cpp | 1
	utils/TableGen/CodeEmitterGen.cpp | 2
	24 files changed, 2 insertions(+), 261 deletions(-)
	llvm-svn: 204560
*	[NVPTX] Fix off-by-one error when creating the VT list for an SDNode	Justin Holewinski	2013-12-05	1	-1/+1
	llvm-svn: 196503
*	Mark some command line flags as hidden	Nadav Rotem	2013-10-18	1	-4/+4
	llvm-svn: 193013
*	ISelDAG: spot chain cycles involving MachineNodes	Tim Northover	2013-09-22	1	-1/+3
	Previously, the DAGISel function WalkChainUsers was spotting that it had entered already-selected territory by whether a node was a MachineNode (amongst other things). Since it's fairly common practice to insert MachineNodes during ISelLowering, this was not the correct check. Looking around, it seems that other nodes get their NodeId set to -1 upon selection, so this makes sure the same thing happens to all MachineNodes and uses that characteristic to determine whether we should stop looking for a loop during selection. This should fix PR15840. llvm-svn: 191165