bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU/SI: Fix s_waitcnt insertion for flat instructions	Tom Stellard	2016-02-19	1	-2/+4
	Summary: This was broken in r260694 which swapped the address and data operands for flat store instructions. The code in SIInsertWaits assumes that the data operand always comes before the address operand, so we need to add a special case for flat. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17366 llvm-svn: 261330
*	Add support for merging strings with alignment larger than one char.	Rafael Espindola	2016-02-19	1	-8/+16
	This will be used in a lld patch. llvm-svn: 261326
*	[SystemZ] Fix ABI for i128 argument and return types	Ulrich Weigand	2016-02-19	4	-10/+95
	According to the SystemZ ABI, 128-bit integer types should be passed and returned via implicit reference. However, this is not currently implemented at the LLVM IR level for the i128 type. This does not matter when compiling C/C++ code, since clang will implement the implicit reference itself. However, it turns out that when calling libgcc helper routines operating on 128-bit integers, LLVM will use i128 argument and return value types; the resulting code is not compatible with the ABI used in libgcc, leading to crashes (see PR26559). This should be simple to fix, except that i128 currently is not even a legal type for the SystemZ back end. Therefore, common code will already split arguments and return values into multiple parts. The bulk of this patch therefore consists of detecting such parts, and correctly handling passing via implicit reference of a value split into multiple parts. If at some time in the future, i128 becomes a legal type, this code can be removed again. This fixes PR26559. llvm-svn: 261325
*	[LPM] Factor all of the loop analysis usage updates into a common helper	Chandler Carruth	2016-02-19	12	-171/+95
	routine. We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board. The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant. With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the seven loop pass managers is down to three. I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes. LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output.
	Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably. Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well! Differential Revision: http://reviews.llvm.org/D17435 llvm-svn: 261316
*	[X86] Remove unused entries from the disassembler type enum.	Craig Topper	2016-02-19	3	-6/+0
	llvm-svn: 261311
*	Shuffle header file as per the Coding Standards	David Majnemer	2016-02-19	1	-1/+1
	llvm-svn: 261308
*	[SjLjEHPrepare] Simplify/cleanup code	David Majnemer	2016-02-19	1	-64/+50
	No functional change is intended. llvm-svn: 261307
*	LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs	Matthias Braun	2016-02-19	1	-5/+31
	llvm-svn: 261306
*	Add profile summary support for sample profile.	Easwaran Raman	2016-02-19	5	-9/+128
	Differential Revision: http://reviews.llvm.org/D17178 llvm-svn: 261304
*	[SjLjEHPrepare] Don't grab pointers to functions in doInitialization	David Majnemer	2016-02-19	1	-18/+17
	Certain optimization passes (like globaldce) can prune function declarations that SjLjEHPrepare assumed would exist when it ran runOnFunction. This fixes PR26669. llvm-svn: 261303
*	[AA] Preserve the AA results wrapper pass as well as BasicAA in a few	Chandler Carruth	2016-02-19	2	-0/+6
	more places to prevent gratuitous re-"runs" of these passes. The passes themselves don't do any work when run, but we keep spending time scheduling and running these needlessly when we really don't need to do so. This is the first patch towards fixing the really horrible loop pass pipeline fragmentation pointed out by Sanjoy in PR24804. llvm-svn: 261302
*	Bug fix: use dyn_cast_or_null instead of dyn_cast	Lawrence Hu	2016-02-19	1	-2/+2
	Differential Revision: http://reviews.llvm.org/D17154 llvm-svn: 261299
*	Minor code cleanups. NFC.	Junmo Park	2016-02-19	1	-3/+3
	llvm-svn: 261294
*	When printing MIR, output to errs() rather than outs().	Justin Lebar	2016-02-19	2	-4/+5
	Summary: Without this, this command
	  $ llvm-run llc -stop-after machine-cp -o - <( echo '' )
	outputs an error, because we close stdout twice -- once when closing the file opened for "-o", and again when closing outs(). Also clarify in the outs() definition that you can't ever call it if you want to open your own raw_fd_ostream on stdout. Reviewers: jroelofs, tstellarAMD Subscribers: jholewinski, qcolombet, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17422 llvm-svn: 261286
*	[IR] Extend cmpxchg to allow pointer type operands	Philip Reames	2016-02-19	3	-13/+69
	Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives:
	1) It makes the IR harder to analyse (for instance, it makes capture tracking overly conservative)
	2) It pushes work onto the frontend authors for no real gain
	This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes. Differential Revision: http://reviews.llvm.org/D17413 llvm-svn: 261281
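	As a rough source-level illustration of a compare-and-swap performed directly on a pointer (with no cast to an integer), the sketch below uses `std::atomic<T *>::compare_exchange_weak` from the C++ standard library. `Node` and `push` are hypothetical names chosen for this example; this is not code from the patch, and it makes no claim about the IR any particular frontend emits for it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type, for illustration only.
struct Node {
  int value;
  Node *next;
};

// Push a node onto a lock-free stack. The compare-and-swap operates on a
// pointer-typed atomic, so the source never casts the pointer to an integer.
inline void push(std::atomic<Node *> &head, Node *n) {
  Node *expected = head.load(std::memory_order_relaxed);
  do {
    // On failure, compare_exchange_weak reloads `expected` with the
    // current head, so the link is refreshed on every retry.
    n->next = expected;
  } while (!head.compare_exchange_weak(expected, n,
                                       std::memory_order_release,
                                       std::memory_order_relaxed));
}
```

	Pushing two nodes in turn leaves the second node at the head with its `next` pointing at the first.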
*	[x86] fix initialization of PredictableSelectIsExpensive	Sanjay Patel	2016-02-18	1	-3/+3
	This is effectively NFC because Atom is the only in-order x86 subtarget currently, but the predicate would have become wrong if any other in-order CPU came along. See related discussion in: http://reviews.llvm.org/D16836 llvm-svn: 261275
*	Remove uses of builtin comma operator.	Richard Trieu	2016-02-18	33	-173/+311
	Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270
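	The log does not show which patterns this commit rewrote, so the following is only a generic sketch of the kind of change such a cleanup involves: a statement that chains two expressions with the built-in comma operator is split into separate statements. Both function names are hypothetical.

```cpp
#include <cassert>

// Before: the built-in comma operator sequences two expressions
// inside a single statement.
inline int sumAndCountComma(const int *data, int n, int &count) {
  int sum = 0;
  count = 0;
  for (int i = 0; i < n; ++i)
    sum += data[i], ++count; // comma operator joins the two updates
  return sum;
}

// After: the same work written as two separate statements.
inline int sumAndCount(const int *data, int n, int &count) {
  int sum = 0;
  count = 0;
  for (int i = 0; i < n; ++i) {
    sum += data[i];
    ++count;
  }
  return sum;
}
```

	The two versions compute the same result; the second simply avoids relying on the comma operator for sequencing.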
*	[libFuzzer] only read MaxLen bytes from every file in the corpus to speedup ↵	Kostya Serebryany	2016-02-18	4	-12/+18
	loading the corpus llvm-svn: 261267
*	[PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFC	Adam Nemet	2016-02-18	5	-6/+4
	This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). Obviously the pass is still only used from PPC at this point. Subsequent patches will start driving this from ARM64 as well. Due to the previous patch most lines should show up as moved lines. llvm-svn: 261265
*	[PPCLoopDataPrefetch] Remove PPC from some of the names. NFC	Adam Nemet	2016-02-18	1	-14/+14
	This is done only to make the next patch, which moves the pass out of PPC to Transforms, easier to read. After this, most lines should show up as moved lines in that patch. This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). llvm-svn: 261264
*	[WinEH] Hoist state stores from successors	David Majnemer	2016-02-18	1	-1/+54
	If we know that all of our successors want to be in the exact same state, it makes sense to hoist the state transition into their common predecessor. Differential Revision: http://reviews.llvm.org/D17391 llvm-svn: 261262
*	[X86ISelLowering] Use isPowerof2 instead of rewriting it. NFC.	Davide Italiano	2016-02-18	1	-1/+1
	llvm-svn: 261255
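	For context, the classic hand-written power-of-two test that a shared helper typically encapsulates can be sketched as below. `isPowerOfTwo` is a hypothetical stand-alone name for this example, not LLVM's own helper, and the log does not show the expression the commit actually replaced.

```cpp
#include <cassert>
#include <cstdint>

// A value is a power of two when it is non-zero and has exactly one bit
// set; clearing the lowest set bit with v & (v - 1) must then yield zero.
constexpr bool isPowerOfTwo(std::uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}
```

	Centralizing the check in one named helper keeps the zero case from being forgotten at individual call sites.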
*	Add support for invoke/landingpad/resume in C API test	Amaury Sechet	2016-02-18	1	-0/+34
	Summary: As per title. There were a lot of parts missing in the C API, so I had to extend the invoke and landingpad API. Reviewers: echristo, joker.eph, Wallbraker Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17359 llvm-svn: 261254
*	Restrict scope of variables [NFC]	Philip Reames	2016-02-18	1	-2/+2
	llvm-svn: 261250
*	[CaptureTracking] Support atomicrmw and cmpxchg	Philip Reames	2016-02-18	1	-0/+11
	These atomic operations are conceptually both a load and store from the same location. As such, we can treat them as the most conservative of those two components which, in practice, means we can treat them like stores. A cmpxchg or atomicrmw captures the values, but not the locations accessed. Note: We can probably be more aggressive about the comparison value in a cmpxchg since to have it be in memory, it must already be captured, but I figured it was better to avoid that for the moment. Note 2: It turns out that since we don't actually support cmpxchg of pointer type, writing a negative test is impossible. Differential Revision: http://reviews.llvm.org/D17400 llvm-svn: 261245
*	[DebugInfoPDB] Add source / line number accessors for PDB.	Zachary Turner	2016-02-18	3	-3/+93
	This patch adds a variety of different methods to query source and line number information from PDB files. llvm-svn: 261239
*	[AArch64] Reduce vector insert/extract cost for Kryo	Matthew Simpson	2016-02-18	1	-0/+2
	Differential Revision: http://reviews.llvm.org/D17379 llvm-svn: 261237
*	Revert to extend i8/i16 return values on Darwin (PR26665)	Hans Wennborg	2016-02-18	1	-1/+6
	In r260133, LLVM was changed to no longer extend i8/i16 return values, as it's not required by the ABI. However, code was found in the wild that relies on the old behaviour on Darwin, so this commit reverts back to that old behaviour for Darwin. On other platforms, it's less likely that code would be depending on the old behaviour, as GCC and MSVC haven't been extending such return values. llvm-svn: 261235
*	Make header self-contained. NFC.	Benjamin Kramer	2016-02-18	1	-0/+1
	llvm-svn: 261234
*	[Hexagon] Remove redundant check.	Chad Rosier	2016-02-18	1	-2/+2
	llvm-svn: 261232
*	Stop creating covmap as note section on ELF	Xinliang David Li	2016-02-18	1	-3/+0
	covmap needs to be created as non-allocatable, but not with SHT_NOTE. The latter was needed to work around a problem of BFD linker with gc, which is no longer needed. (A more proper longer term fix requires changing FE driver to force referencing the section using linker script). Differential Revision: http://reviews.llvm.org/D17309 llvm-svn: 261228
*	AMDGPU/SI: add llvm.amdgcn.image.load/store[.mip] intrinsics	Nicolai Haehnle	2016-02-18	3	-30/+75
	Summary: These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has a legacy name and pretends not to read memory. Differential Revision: http://reviews.llvm.org/D17276 llvm-svn: 261224
*	[Hexagon] Fix compilation error with GCC 6	Krzysztof Parzyszek	2016-02-18	1	-66/+68
	Compiling Hexagon target with GCC 6 produces "error: should have been declared inside" due to GCC PR c++/69657 which was merged. Properly wrapping operator<<() definitions within the namespace llvm fixes the issue. Author: domagoj.stolfa Differential Revision: http://reviews.llvm.org/D17281 llvm-svn: 261220
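	The shape of the fix described above, defining a stream operator inside the enclosing namespace block, can be sketched with standard streams as follows. The namespace `demo`, the type `RegRef`, and the helper `formatReg` are hypothetical names for this example, which uses `std::ostream` rather than LLVM's own stream class.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace demo {

// Hypothetical printable type, for illustration only.
struct RegRef {
  unsigned Reg;
};

// The operator is defined inside the namespace block, so it is declared
// in the same namespace as its operand type and is found by
// argument-dependent lookup.
inline std::ostream &operator<<(std::ostream &OS, const RegRef &R) {
  return OS << "%r" << R.Reg;
}

} // namespace demo

// Caller outside the namespace: the unqualified << resolves via ADL.
inline std::string formatReg(const demo::RegRef &R) {
  std::ostringstream OS;
  OS << R;
  return OS.str();
}
```

	Keeping the definition inside the namespace braces avoids depending on how a given compiler treats a qualified out-of-namespace definition of the operator.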
*	[Hexagon] Implement TLS support	Krzysztof Parzyszek	2016-02-18	5	-2/+202
	Patch by Anand Kodnani. llvm-svn: 261218
*	Reapply commit r259357 with a fix for PR26629	Matthew Simpson	2016-02-18	1	-12/+237
	Commit r259357 was reverted because it caused PR26629. We were assuming all roots of a vectorizable tree could be truncated to the same width, which is not the case in general. This commit reapplies the patch along with a fix and a new test case to ensure we don't regress because of this issue again. This should fix PR26629. llvm-svn: 261212
*	[mips][microMIPS] Implement TLBINV and TLBINVF instructions	Zlatko Buljan	2016-02-18	3	-2/+30
	Differential Revision: http://reviews.llvm.org/D16849 llvm-svn: 261211
*	[Hexagon] Add support for __builtin_prefetch	Krzysztof Parzyszek	2016-02-18	3	-0/+38
	llvm-svn: 261210
*	[Hexagon] Update the callee-saved register set for EH-aware functions	Krzysztof Parzyszek	2016-02-18	1	-3/+15
	llvm-svn: 261208
*	[PM] Port the PostOrderFunctionAttrs pass to the new pass manager and	Chandler Carruth	2016-02-18	6	-40/+107
	convert one test to use this. This is a particularly significant milestone because it required a working per-function AA framework which can be queried over each function from within a CGSCC transform pass (and additionally a module analysis to be accessible). This is essentially the point of the entire pass manager rewrite. A CGSCC transform is able to query for multiple different functions' analysis results. It works. The whole thing appears to actually work and accomplish the original goal. While we were able to hack function attrs and basic-aa to "work" in the old pass manager, this port doesn't use any of that, it directly leverages the new fundamental functionality. For this to work, the CGSCC framework also has to support SCC-based behavior analysis, etc. The only part of the CGSCC pass infrastructure not sorted out at this point are the updates in the face of inlining and running function passes that mutate the call graph. The changes are pretty boring and boiler-plate. Most of the work was factored into more focused preparatory patches. But this is what wires it all together. llvm-svn: 261203
*	[X86][SSE] Improve PSHUFB shuffle mask decoding.	Simon Pilgrim	2016-02-18	1	-16/+36
	In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool. The test case in question makes use of this to recognise the shuffle mask is a unary UNPCKL pattern and simplifies accordingly. llvm-svn: 261201
*	Minor code cleanup. NFC.	Junmo Park	2016-02-18	1	-1/+1
	llvm-svn: 261200
*	Test commit access.	Nikolay Haustov	2016-02-18	1	-1/+0
	llvm-svn: 261199
*	[AVX512][PRORQ][PRORD] Change imm8 to int	Michael Zuckerman	2016-02-18	1	-6/+6
	Differential Revision: http://reviews.llvm.org/D17024 llvm-svn: 261198
*	[PM/AA] Teach the new pass manager to use pass-by-lambda for registering	Chandler Carruth	2016-02-18	2	-4/+32
	analysis passes, support pre-registering analyses, and use that to implement parsing and pre-registering a custom alias analysis pipeline. With this it's possible to configure the particular alias analysis pipeline used by the AAManager from the commandline of opt. I've updated the test to show this effectively in use to build a pipeline including basic-aa as part of it. My big questions for reviewers are around the APIs that are used to expose this functionality. Are folks happy with pass-by-lambda to do pass registration? Are folks happy with pre-registering analyses as a way to inject customized instances of an analysis while still using the registry for the general case? Other thoughts of course welcome. The next round of patches will be to add the rest of the alias analyses into the new pass manager and wire them up here so that they can be used from opt. This will require extending the (somewhat limited) functionality of AAManager w.r.t. module passes. Differential Revision: http://reviews.llvm.org/D17259 llvm-svn: 261197
*	[WebAssembly] Don't use setRequiresStructuredCFG(true).	Dan Gohman	2016-02-18	1	-3/+4
	While we still do want reducible control flow, the RequiresStructuredCFG flag imposes more strict structure constraints than WebAssembly wants. Unsetting this flag enables critical edge splitting and tail merging. Also, disable TailDuplication explicitly, as it doesn't support virtual registers, and was previously only disabled by the RequiresStructuredCFG flag. llvm-svn: 261190
*	Revert "LiveIntervalAnalysis: Remove LiveVariables requirement" and ↵	Matthias Braun	2016-02-18	3	-3/+7
	LiveIntervalTest
	The commit breaks stage2 compilation on PowerPC. Reverting for now while this is analyzed. I also have to revert the LiveIntervalTest for now as that depends on this commit.
	Revert "LiveIntervalAnalysis: Remove LiveVariables requirement" This reverts commit r260806.
	Revert "Remove an unnecessary std::move to fix -Wpessimizing-move warning." This reverts commit r260931.
	Revert "Fix typo in LiveIntervalTest" This reverts commit r260907.
	Revert "Add unittest for LiveIntervalAnalysis::handleMove()" This reverts commit r260905.
	llvm-svn: 261189
*	[AMDGPU] Disassembler: Added basic disassembler for AMDGPU target	Tom Stellard	2016-02-18	12	-49/+551
	Changes:
	- Added disassembler project
	- Fixed all decoding conflicts in .td files
	- Added DecoderMethod="NONE" option to Target.td that allows disabling decoder generation for an instruction.
	- Created decoding functions for VS_32 and VReg_32 register classes.
	- Added stubs for decoding all register classes.
	- Added several tests for disassembler
	Disassembler only supports:
	- VI subtarget
	- VOP1 instruction encoding
	- 32-bit register operands and inline constants
	[Valery]
	One point that requires attention is how decoder conflicts were resolved:
	- Groups of target instructions were separated by using different DecoderNamespace (SICI, VI, CI), using an approach similar to AssemblerPredicate.
	- There were conflicts in IMAGE_<> instructions caused by two different reasons:
	1. dmask wasn’t specified for the output (fixed)
	2. There are image instructions that differ only by the number of the address components but have the same encoding by the HW spec. The actual number of address components is determined by the HW at runtime using image resource descriptor starting from the VGPR encoded in an IMAGE instruction. This means that we should choose only one instruction from conflicting group to be the rule for decoder. I didn’t find the way to disable decoder generation for an arbitrary instruction and therefore made a one-line fix to tablegen generator that would suppress decoder generation when DecoderMethod is set to "NONE". This is a change that should be reviewed and submitted first. Otherwise I would need to specify different DecoderNamespace for every instruction in the conflicting group. I haven’t checked yet if DecoderMethod="NONE" is not used in other targets.
	3. IMAGE_GATHER decoder generation is for now disabled and to be done later.
	[/Valery]
	Patch By: Sam Kolton Differential Revision: http://reviews.llvm.org/D16723 llvm-svn: 261185
*	[libFuzzer] fix the libFuzzer bot	Kostya Serebryany	2016-02-18	2	-2/+2
	llvm-svn: 261184
*	[WebAssembly] Disable register stackification and coloring when not optimizing	Derek Schuff	2016-02-17	3	-11/+23
	These passes are optimizations, and should be disabled when not optimizing. Also create an MCCodeGenInfo so the opt level is correctly plumbed to the backend pass manager. Also remove the command line flag for disabling register coloring; running llc with -O0 should now be useful for debugging, so it's not necessary. Differential Revision: http://reviews.llvm.org/D17327 llvm-svn: 261176
*	AArch64: always clear kill flags up to last eliminated copy	Tim Northover	2016-02-17	1	-7/+7
	After r261154, we were only clearing flags if the known-zero register was originally live-in to the basic block, but we have to do it even if not when more than one COPY has been eliminated, otherwise the user of the first COPY may still have <kill> marked. E.g.
	  BB#N:
	    %X0 = COPY %XZR
	    STRXui %X0<kill>, <fi#0>
	    %X0 = COPY %XZR
	    STRXui %X0<kill>, <fi#1>
	We can eliminate both copies, X0 is not live-in, but we must clear the kill on the first store. Unfortunately, I've been unable to come up with a non-fragile test for this. I've only seen it in the wild with regalloc-created spills, and attempts to reproduce that in a reasonable way run afoul of COPY coalescing. Even volatile asm clobbers were moved around. Should fix the aarch64 bot though. llvm-svn: 261175