bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Convert assert(false) into llvm_unreachable where it makes sense.	Benjamin Kramer	2015-10-25	1	-3/+3
\| \| \| \|	llvm-svn: 251266
*	NVPTX: Remove implicit ilist iterator conversions, NFC	Duncan P. N. Exon Smith	2015-10-20	7	-25/+20
\| \| \| \|	llvm-svn: 250779
*	Use std::find instead of manual loop.	Craig Topper	2015-10-17	1	-5/+2
\| \| \| \|	llvm-svn: 250624
*	[NVPTX] Remove dead code.	Benjamin Kramer	2015-10-15	9	-222/+0
\| \| \| \| \| \|	I left helpers that look useful for debugging alone. NFC. llvm-svn: 250410
*	git-clang-format r249548.	Rafael Espindola	2015-10-07	1	-19/+19
\| \| \| \| \| \|	Sorry for missing this the first time. llvm-svn: 249610
*	Use non virtual destructors for sections.	Rafael Espindola	2015-10-07	2	-21/+21
\| \| \| \|	llvm-svn: 249548
*	Don't repeat names in comments and don't indent in namespaces. NFC.	Rafael Espindola	2015-10-07	1	-3/+2
\| \| \| \|	llvm-svn: 249546
*	Fix pr24486.	Rafael Espindola	2015-10-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the work done in r233995 so that now getFragment (in addition to getSection) also works for variable symbols. With that the existing logic to decide if a-b can be computed works even if a or b are variables. Given that, the expression evaluation can avoid expanding variables as aggressively and that in turn lets the relocation code see the original variable. In order for this to work with the asm streamer, there is now a dummy fragment per section. It is used to assign a section to a symbol when no other fragment exists. This patch is a joint work by Maxim Ostapenko and myself. llvm-svn: 249303
*	MachineBasicBlock: Factor out common code into isReturnBlock()	Matthias Braun	2015-09-25	1	-1/+1
\| \| \| \|	llvm-svn: 248617
*	constify the Function parameter to the TTI creation callback and	Eric Christopher	2015-09-16	1	-1/+1
\| \| \| \| \| \|	propagate to all callers/users/etc. llvm-svn: 247864
*	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and ↵	Daniel Sanders	2015-09-15	4	-12/+11
\| \| \| \| \| \| \| \|	related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702
*	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* ↵	Daniel Sanders	2015-09-15	4	-11/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692
*	Revert r247684 - Replace Triple with a new TargetTuple ...	Daniel Sanders	2015-09-15	4	-12/+11
\| \| \| \| \| \|	LLDB needs to be updated in the same commit. llvm-svn: 247686
*	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.	Daniel Sanders	2015-09-15	4	-11/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683
*	Fix namespace indentation and missing blank lines before 'public:' in ↵	Daniel Sanders	2015-09-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	*MCAsmInfo.h. NFC. This is to reduce noise in a following commit. Also fixes a couple missing spaces before the reference operator. llvm-svn: 247679
*	Fix typos.	Bruce Mitchener	2015-09-12	1	-6/+6
\| \| \| \| \| \| \| \| \| \|	Summary: This fixes a variety of typos in docs, code and headers. Subscribers: jholewinski, sanjoy, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12626 llvm-svn: 247495
*	[ADT] Switch a bunch of places in LLVM that were doing single-character	Chandler Carruth	2015-09-10	1	-2/+2
\| \| \| \| \| \| \|	splits to actually use the single character split routine which does less work, and in a debug build is substantially faster. llvm-svn: 247245
*	[NVPTX] Added run NVVMReflect pass to NVPTX back-end.	Artem Belevich	2015-09-08	1	-0/+1
\| \| \| \| \| \| \| \| \|	The pass is needed to remove __nvvm_reflect calls when we link in libdevice bitcode that comes with CUDA. Differential Revision: http://reviews.llvm.org/D11663 llvm-svn: 247072
*	[NVPTX] Let NVPTX backend detect integer min and max patterns.	Bjarke Hammersholt Roune	2015-08-26	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support. Reviewers: jholewinski, meheff, jingyue Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D12377 llvm-svn: 246107
*	[NVPTX] Allow undef value as global initializer	Jingyue Wu	2015-08-22	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: __shared__ variable may now emit undef value as initializer, do not throw error on that. Test Plan: test/CodeGen/NVPTX/global-addrspace.ll Patch by Xuetian Weng Reviewers: jholewinski, tra, jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D12242 llvm-svn: 245785
*	[NVPTX] truncating 64-bit to 32-bit is free	Jingyue Wu	2015-08-20	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add an LSR test that exercises isTruncateFree. Without this change, LSR creates another indvar representing the truncated value. Reviewers: jholewinski, eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D12058 llvm-svn: 245611
*	Use 32-bit divides instead of 64-bit divides where possible.	Mark Heffernan	2015-08-11	1	-0/+4
\| \| \| \| \| \| \| \| \|	For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason is that many index computations are carried out in 64-bits but never actually exceed the capacity of a 32-bit word. llvm-svn: 244684
*	Fix some comment typos.	Benjamin Kramer	2015-08-08	3	-3/+3
\| \| \| \|	llvm-svn: 244402
*	[NVPTX] Use LDG for pointer induction variables.	Bjarke Hammersholt Roune	2015-08-05	1	-10/+29
\| \| \| \| \| \| \| \|	More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables. Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads. llvm-svn: 244166
*	[TTI] Make the cost APIs in TargetTransformInfo consistently use 'int'	Chandler Carruth	2015-08-05	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rather than 'unsigned' for their costs. For something like costs in particular there is a natural "negative" value, that of savings or saved cost. As a consequence, there is a lot of code that subtracts or creates negative values based on cost, all of which is prone to awkwardness or bugs when dealing with an unsigned type. Similarly, we never want these values to wrap, as that would cause Very Bad code generation (likely perceived as an infinite loop as we try to emit over 2^32 instructions or some such insanity). All around 'int' seems a much better fit for these basic metrics. I've added asserts to ensure that at least the TTI interface never returns negative numbers here. If we ever have a use case for negative numbers, we can remove this, but this way a bug where someone used '-1' to produce a 'very large' cost will be caught by the assert. This passes all tests, and is also UBSan clean. No functional change intended. Differential Revision: http://reviews.llvm.org/D11741 llvm-svn: 244080
*	De-constify pointers to Type since they can't be modified. NFC	Craig Topper	2015-08-01	4	-25/+25
\| \| \| \| \| \|	This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
*	[NVPTX] allow register copy between float and int	Jingyue Wu	2015-08-01	1	-22/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class register copying (e.g. i32 to f32) becomes possible. Enhance NVPTXInstrInfo::copyPhysReg to handle these cases. Reviewers: jholewinski Subscribers: eliben, jholewinski, llvm-commits, bruno Differential Revision: http://reviews.llvm.org/D11622 llvm-svn: 243839
*	[NVPTX] convert pointers in byval kernel arguments to global	Jingyue Wu	2015-07-31	1	-20/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For example, in struct S { int *x; int *y; }; __global__ void foo(S s) { int *b = s.y; // use b } "b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for accessing "b". Reviewers: jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11505 llvm-svn: 243790
*	Refactor: Simplify boolean conditional return statements in lib/Target/NVPTX	Jingyue Wu	2015-07-31	3	-37/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Use clang-tidy to simplify boolean conditional return statements Reviewers: rafael, echristo, chandlerc, bkramer, craig.topper, dexonsmith, chapuni, eliben, jingyue, jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D9983 llvm-svn: 243734
*	Roll forward r242871	Jingyue Wu	2015-07-29	1	-1/+0
\| \| \| \| \| \| \|	r242871 missed one place that should be guarded with isPhysicalReg. This patch fixes that. llvm-svn: 243555
*	Temporarily revert r242871	Jingyue Wu	2015-07-29	1	-0/+1
\| \| \| \| \| \|	PR24299 llvm-svn: 243522
*	[NVPTX] run LSR before straight-line optimizations	Jingyue Wu	2015-07-23	1	-5/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Straight-line optimizations can simplify the loop body and make LSR's cost analysis more precise. This significantly improves several Eigen3 CUDA benchmarks. With this change, EigenContractionKernel runs up to 40% faster (https://bitbucket.org/eigen/eigen/src/753ceee5f206ff7dde9f6a41a5a420749fc9406f/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h?at=default#cl-502). EigenConvolutionKernel2D runs up to 10% faster (https://bitbucket.org/eigen/eigen/src/753ceee5f206ff7dde9f6a41a5a420749fc9406f/unsupported/Eigen/CXX11/src/Tensor/TensorConvolution.h?at=default#cl-605). I have some difficulty writing small tests that benefit from this reordering due to an apparent issue with LSR (being discussed at http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088244.html). See the review thread for the compilation time impact of GVN. Reviewers: eliben, jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11304 llvm-svn: 242982
*	[BranchFolding] do not iterate the aliases of virtual registers	Jingyue Wu	2015-07-22	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: MCRegAliasIterator only works for physical registers. So, do not run it on virtual registers. With this issue fixed, we can resurrect the BranchFolding pass in NVPTX backend. Reviewers: jholewinski, bkramer Subscribers: henryhu, meheff, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11174 llvm-svn: 242871
*	[NVPTX] make load on global readonly memory to use ldg	Jingyue Wu	2015-07-20	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As described in [1], ld.global.nc may be used to load memory by nvcc when __restrict__ is used and the compiler can detect whether the read-only data cache is safe to use. This patch checks whether ldg is safe to use and, when it is, uses it to replace ld.global. This change can improve performance by 18-29% on affected kernels (ratt_kernel and rwdot_kernel) in the S3D benchmark of shoc [2]. Patch by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 llvm-svn: 242713
*	Use inbounds GEPs for memcpy and memset lowering	Eli Bendersky	2015-07-17	1	-8/+10
\| \| \| \| \| \|	Follow-up on discussion in http://reviews.llvm.org/D11220 llvm-svn: 242542
*	Streamline the coding style in NVPTXLowerAggrCopies	Eli Bendersky	2015-07-16	1	-111/+127
\| \| \| \| \| \|	Make the style consistent with LLVM style throughout and clang-format. llvm-svn: 242439
*	[NVPTX] enable SpeculativeExecution in NVPTX	Jingyue Wu	2015-07-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: SpeculativeExecution enables a series of straight-line optimizations (such as SLSR and NaryReassociate) on conditional code. For example, if (...) ... b * s ... if (...) ... (b + 1) * s ... speculative execution can hoist b * s and (b + 1) * s from then-blocks, so that we have ... b * s ... if (...) ... ... (b + 1) * s ... if (...) ... Then, SLSR can rewrite (b + 1) * s to (b * s + s) because after speculative execution b * s dominates (b + 1) * s. The performance impact of this change is significant. It speeds up the benchmarks running EigenFloatContractionKernelInternal16x16 (https://bitbucket.org/eigen/eigen/src/ba68f42fa69e4f43417fe1e52669d4dd5d2b3bee/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h?at=default#cl-526) by roughly 2%. Some internal benchmarks that have the above code pattern are improved by up to 40%. No significant slowdowns are observed on Eigen CUDA microbenchmarks. Reviewers: jholewinski, broune, eliben Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11201 llvm-svn: 242437
*	[NVPTX] Don't leak dead instructions after unlinking them from the BasicBlock	Benjamin Kramer	2015-07-16	1	-2/+2
\| \| \| \|	llvm-svn: 242417
*	Correct lowering of memmove in NVPTX	Eli Bendersky	2015-07-16	2	-61/+168
\| \| \| \| \| \| \| \| \| \|	This fixes https://llvm.org/bugs/show_bug.cgi?id=24056 Also a bit of refactoring along the way. Differential Revision: http://reviews.llvm.org/D11220 llvm-svn: 242413
*	Move most users of TargetMachine::getDataLayout to the Module one	Mehdi Amini	2015-07-16	1	-36/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. This patch is quite boring overall, except for some ugliness in ASMPrinter which has a getDataLayout function but has some clients that use it without a Module (llvm-dsymutil, llvm-dwarfdump), so some methods are taking a DataLayout as parameter. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11090 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 242386
*	Remove DataLayout from TargetLoweringObjectFile, redirect to Module	Mehdi Amini	2015-07-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11079 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 242385
*	Enable partial and runtime loop unrolling for NVPTX.	Mark Heffernan	2015-07-13	2	-0/+14
\| \| \| \| \| \| \| \| \|	Enable partial and runtime loop unrolling for NVPTX backend via TTI::UnrollingPreferences with a small threshold. This partially unrolls small loops which are often unrolled by the PTX to SASS compiler and unrolling earlier can be beneficial. llvm-svn: 242049
*	MC: Remove MCSubtargetInfo() default constructor	Duncan P. N. Exon Smith	2015-07-10	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Force all creators of `MCSubtargetInfo` to immediately initialize it, merging the default constructor and the initializer into an initializing constructor. Besides cleaning up the code a little, this makes it clear that the initializer is never called again later. Out-of-tree backends need a trivial change: instead of calling: auto *X = new MCSubtargetInfo(); InitXYZMCSubtargetInfo(X, ...); return X; they should call: return createXYZMCSubtargetInfoImpl(...); There's no real functionality change here. llvm-svn: 241957
*	[TTI] BasicTTIImpl assumes no vector registers	Jingyue Wu	2015-07-10	2	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Following the discussion on r241884, it's more reasonable to assume that a target has no vector registers by default instead of letting every such target override getNumberOfRegisters. Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to return 0 when Vector is true, and partially reverts r241884 which modifies NVPTXTTIImpl::getNumberOfRegisters. It also fixes a performance bug in LoopVectorizer. Even if a target has no vector registers, vectorization may still help ILP. So, we need both checks to be false before disabling loop vectorization altogether. Reviewers: hfinkel Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11108 llvm-svn: 241942
*	Actually support volatile memcpys in NVPTX lowering	Eli Bendersky	2015-07-10	1	-8/+10
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D11091 llvm-svn: 241914
*	[NVPTX] declare no vector registers	Jingyue Wu	2015-07-10	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Without this patch, LoopVectorizer in certain cases (see loop-vectorize.ll) produces code with complex control flow which hurts later optimizations. Since NVPTX doesn't have vector registers in LLVM's sense (NVPTXTTI::getRegisterBitWidth(true) == 32), we for now declare no vector registers to effectively disable loop vectorization. Reviewers: jholewinski Subscribers: jingyue, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11089 llvm-svn: 241884
*	Replace index-loops by range-based loops	Eli Bendersky	2015-07-09	1	-6/+3
\| \| \| \| \| \|	NFC llvm-svn: 241875
*	Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user	Mehdi Amini	2015-07-09	1	-1/+1
\| \| \| \| \| \| \|	Documentation for this function would be nice, by the way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241807
*	Remove getDataLayout() from TargetSelectionDAGInfo (had no users)	Mehdi Amini	2015-07-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780
*	Remove getDataLayout() from TargetLowering	Mehdi Amini	2015-07-09	2	-35/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779
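
Note on the 2015-08-11 entry (llvm-svn: 244684): the message describes using a 32-bit divide in place of a 64-bit one when both the dividend and the divisor fit in 32 bits. The following is only a source-level sketch of that idea, not the backend's implementation. The function name `divide_narrowing` is hypothetical and does not appear in the LLVM sources, and unsigned division is assumed for simplicity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper illustrating the narrowing idea; not LLVM code.
// Precondition: divisor != 0, as for any integer division.
constexpr std::uint64_t divide_narrowing(std::uint64_t dividend,
                                         std::uint64_t divisor) {
  // If no bit above bit 31 is set in either operand, both values fit in
  // 32 bits, so a 32-bit divide yields the same quotient.
  if (((dividend | divisor) >> 32) == 0) {
    return static_cast<std::uint32_t>(dividend) /
           static_cast<std::uint32_t>(divisor);
  }
  // Otherwise fall back to the full 64-bit divide.
  return dividend / divisor;
}
```

The check costs one OR, one shift, and one branch, which pays off when the 64-bit divide is much slower than the 32-bit one and most operands are small, as the commit message reports for index computations.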