bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86][SSE] Added target shuffle combine binary compute matching function. NFCI.	Simon Pilgrim	2016-08-05	1	-72/+80
	Added matchBinaryPermuteVectorShuffle and moved the blend+zero and insertps matching code into it. llvm-svn: 277808
*	Reapply r276973 "Adjust Registry interface to not require plugins to export ↵	John Brawn	2016-08-05	2	-0/+4
	a registry"
	This differs from the previous version by being more careful about template instantiation/specialization in order to prevent errors when building with clang -Werror. Specifically:
	* begin is not defined in the template and is instead instantiated when Head is. I think the warning when we don't do that is wrong (PR28815) but for now at least do it this way to avoid the warning.
	* Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY, provide a template definition and then do explicit instantiation. No compiler I've tried has problems with doing it the other way, but strictly speaking it's not permitted by the C++ standard, so better safe than sorry.
	Original commit message:
	Currently the Registry class contains the vestiges of a previous attempt to allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a plugin would have its own copy of a registry and export it to be imported by the tool that's loading the plugin. This only works if the plugin is entirely self-contained with the only interface between the plugin and tool being the registry, and in particular this conflicts with how IR pass plugins work.
	This patch changes things so that the add_node function of the registry is exported by the tool and then imported by the plugin, which solves this problem and also means that, instead of every plugin having to export every registry it uses, LLVM only has to export the add_node functions. This allows plugins that use a registry to work on Windows if LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.
	llvm-svn: 277806
*	[PowerPC] fix passing long double arguments to function (soft-float)	Strahinja Petrovic	2016-08-05	3	-0/+39
	This patch fixes passing long double arguments to functions in soft-float mode. If fewer than 4 argument registers are free (a long double is mapped to 4 GPRs in soft-float mode), the long double argument must be passed on the stack. Differential Revision: https://reviews.llvm.org/D20114. llvm-svn: 277804
*	[InstCombine] try to fold (select C, (sext A), B) into logical ops	Nicolai Haehnle	2016-08-05	2	-1/+57
	Summary: Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and B is a compatible constant, also for zext instead of sext. This will then be further folded into logical operations. The transformation would be valid for non-i1 types as well, but other parts of InstCombine prefer to have sext from non-i1 as an operand of select. Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32 for boolean operations. With this change, the boolean logic is fully recovered. Reviewers: majnemer, spatel, tstellarAMD Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22747 llvm-svn: 277801
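The fold in the summary above can be checked exhaustively for the i1 case. What follows is a small Python model, not LLVM code: the helper names (`sext_i1`, `zext_i1`, `select`, `check_fold`) are hypothetical, and the 32-bit result width is assumed purely for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumed result width (i32), for illustration only


def sext_i1(a):
    """Sign-extend an i1 (0 or 1) to i32: 1 becomes all ones."""
    return MASK32 if a else 0


def zext_i1(a):
    """Zero-extend an i1 (0 or 1) to i32."""
    return 1 if a else 0


def select(c, t, f):
    """Model of the IR select instruction."""
    return t if c else f


def check_fold():
    """Check select C, (ext A), B == ext (select C, A, B') for every i1 input."""
    for c in (0, 1):
        for a in (0, 1):
            for b_narrow in (0, 1):  # B' is the constant B truncated to i1
                for ext in (sext_i1, zext_i1):
                    b_wide = ext(b_narrow)  # the "compatible constant" B
                    if select(c, ext(a), b_wide) != ext(select(c, a, b_narrow)):
                        return False
                # The narrow select is itself a logical operation:
                # select C, A, false == C & A; select C, A, true == !C | A
                logical = (c & a) if b_narrow == 0 else ((c ^ 1) | a)
                if select(c, a, b_narrow) != logical:
                    return False
    return True
```

Calling `check_fold()` walks all eight input combinations for both extension kinds, which is enough to cover the i1 case the commit targets.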
*	InstCombine: Clean up some trailing whitespace. NFC	Justin Bogner	2016-08-05	4	-13/+13
	llvm-svn: 277793
*	InstCombine: Replace some never-null pointers with references. NFC	Justin Bogner	2016-08-05	13	-104/+102
	llvm-svn: 277792
*	GVN-hoist: enable by default	Sebastian Pop	2016-08-04	1	-2/+2
	llvm-svn: 277786
*	GVN-hoist: fix early exit logic	Sebastian Pop	2016-08-04	1	-4/+11
	The patch splits a complex && if condition into easier to read and understand logic. That wrong early exit condition was letting some instructions with not all operands available pass through when HoistingGeps was true. Differential Revision: https://reviews.llvm.org/D23174 llvm-svn: 277785
*	IR: Provide an IRBuilder Inserter that calls a callback after insertion	Justin Bogner	2016-08-04	2	-25/+10
	Add a generalized IRBuilderCallbackInserter, which is just given a callback to execute after insertion. This can be used to get rid of the custom inserter in InstCombine, which will in turn allow me to add target specific InstCombineCalls API for intrinsics without horrible layering violations. llvm-svn: 277784
*	[LV, X86] Be more optimistic about vectorizing shifts.	Michael Kuperstein	2016-08-04	4	-24/+41
	Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to take this into account by making shifts by a uniform count cheap again. Differential Revision: https://reviews.llvm.org/D23049 llvm-svn: 277782
*	[InstCombine] use m_APInt to allow icmp eq (mul X, C1), C2 folds for splat ↵	Sanjay Patel	2016-08-04	1	-6/+5
	constant vectors. This concludes the splat vector enhancements for foldICmpEqualityWithConstant(). Other commits in this series: https://reviews.llvm.org/rL277762 https://reviews.llvm.org/rL277752 https://reviews.llvm.org/rL277738 https://reviews.llvm.org/rL277731 https://reviews.llvm.org/rL277659 https://reviews.llvm.org/rL277638 https://reviews.llvm.org/rL277629 llvm-svn: 277779
*	Clean up the logic of the Archive::Child::Child() with an assert to know Err ↵	Kevin Enderby	2016-08-04	1	-21/+23
	is not a nullptr when we are pointed at real data. David Blaikie pointed out some odd logic in the case the Err value was a nullptr and Lang Hames suggested it could be cleaned up with an assert to know that Err is not a nullptr when we are pointed at real data. Only when constructing the sentinel value by pointing it at null data is Err permitted to be a nullptr, since no error could occur in that case. With this change the testing for “if (Err)” is removed from the constructor’s logic and *Err is used directly without any check after the assert(). llvm-svn: 277776
*	GlobalISel: extend add widening to SUB, MUL, OR, AND and XOR.	Tim Northover	2016-08-04	2	-3/+9
	These are the operations that are trivially identical. Division is omitted for now because you need to use the correct sign/zero extension. llvm-svn: 277775
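Why these operations widen trivially while division does not can be illustrated with a brute-force model. This is a Python sketch under stated assumptions, not GlobalISel code: the 4-bit and 8-bit widths, the `trunc` and `widening_is_safe` helpers, and the choice to model an unspecified extension as arbitrary high bits are all hypothetical.

```python
import operator

NARROW, WIDE = 4, 8  # assumed widths, kept small so the search is exhaustive


def trunc(x, bits):
    """Keep only the low `bits` bits (also correct for negative Python ints)."""
    return x & ((1 << bits) - 1)


OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "or": operator.or_,
    "and": operator.and_,
    "xor": operator.xor,
}


def udiv(a, b):
    """Unsigned division; a zero divisor is mapped to 0 just to keep the model total."""
    return a // b if b else 0


def widening_is_safe(op):
    """True if the low NARROW bits of the wide result never depend on the
    high bits of the widened inputs."""
    extra = 1 << (WIDE - NARROW)
    for a in range(1 << NARROW):
        for b in range(1 << NARROW):
            expected = trunc(op(a, b), NARROW)
            for hi_a in range(extra):
                for hi_b in range(extra):
                    wide_a = a | (hi_a << NARROW)  # arbitrary high bits
                    wide_b = b | (hi_b << NARROW)
                    wide_result = trunc(op(wide_a, wide_b), WIDE)
                    if trunc(wide_result, NARROW) != expected:
                        return False
    return True
```

In this model every entry of `OPS` passes, while `udiv` fails as soon as the high bits are not zero, which matches the commit's remark that division needs the correct extension.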
*	GlobalISel: add support for G_MUL	Tim Northover	2016-08-04	1	-0/+2
	llvm-svn: 277774
*	GlobalISel: implement narrowing for G_ADD.	Tim Northover	2016-08-04	3	-2/+57
	llvm-svn: 277769
*	GVNHoist: Don't hoist convergent calls	Matt Arsenault	2016-08-04	1	-0/+4
	llvm-svn: 277767
*	[ExecutionEngine] Refactor - Roll JITSymbolFlags functionality into JITSymbol.h	Lang Hames	2016-08-04	4	-4/+49
	and remove the JITSymbolFlags header. llvm-svn: 277766
*	[coroutines] Part 4[ab]: Coroutine Devirtualization: Lower coro.resume and ↵	David Majnemer	2016-08-04	5	-6/+225
	coro.destroy. This is the fourth patch in the coroutine series. CoroEarly pass now lowers coro.resume and coro.destroy intrinsics by replacing them with an indirect call to an address returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate function address. Patch by Gor Nishanov! Differential Revision: https://reviews.llvm.org/D22998 llvm-svn: 277765
*	[InstCombine] use m_APInt to allow icmp eq (and X, C1), C2 folds for splat ↵	Sanjay Patel	2016-08-04	1	-14/+9
	constant vectors llvm-svn: 277762
*	[OpenCL] Add missing tests for getOCLTypeName	Yaxun Liu	2016-08-04	1	-1/+1
	Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown. Patch by Aaron En Ye Shi. Differential Revision: https://reviews.llvm.org/D22964 llvm-svn: 277759
*	[CodeView] Use llvm::Error instead of std::error_code.	Zachary Turner	2016-08-04	3	-103/+126
	This eliminates the remnants of std::error_code from the DebugInfo libraries. llvm-svn: 277758
*	AArch64: don't assume all i128s are BUILD_PAIRs	Tim Northover	2016-08-04	1	-6/+13
	It leads to a crash when they're not. I'm sure I've made this mistake before, at least once. llvm-svn: 277755
*	[InstCombine] use m_APInt to allow icmp eq (or X, C1), C2 folds for splat ↵	Sanjay Patel	2016-08-04	1	-9/+7
	constant vectors llvm-svn: 277752
*	GlobalISel: also add G_TRUNC to IRTranslator.	Tim Northover	2016-08-04	1	-1/+3
	llvm-svn: 277749
*	GlobalISel: add code to widen scalar G_ADD	Tim Northover	2016-08-04	3	-1/+39
	llvm-svn: 277747
*	[WebAssembly] Check return value of getRegForValue in FastISel	Derek Schuff	2016-08-04	1	-0/+13
	Previously, FastISel for WebAssembly wasn't checking the return value of `getRegForValue` in certain cases, which would generate instructions referencing NoReg. This patch fixes this behavior. Patch by Dominic Chen Differential Revision: https://reviews.llvm.org/D23100 llvm-svn: 277742
*	[Hexagon] Validate register class when doing bit simplification	Krzysztof Parzyszek	2016-08-04	1	-10/+33
	llvm-svn: 277740
*	[InstCombine] use m_APInt to allow icmp eq (op X, Y), C folds for splat ↵	Sanjay Patel	2016-08-04	2	-18/+6
	constant vectors. I'm removing a misplaced pair of more specific folds from InstCombine in this patch as well, so we know where those folds are happening in InstSimplify. llvm-svn: 277738
*	[X86][SSE] Rename target shuffle unary permute matching function. NFCI.	Simon Pilgrim	2016-08-04	1	-6/+6
	In preparation for adding a binary permute matching function. llvm-svn: 277737
*	LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack ↵	Alina Sbirlea	2016-08-04	3	-14/+21
	adjustments.
	Summary: TargetBaseAlign is no longer required since LSV checks whether the target allows misaligned accesses. A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.
	Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
	- x86 failing tests either did not have the right target, or the right alignment.
	- NVPTX failing tests did not have the right alignment.
	- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks for a maximum size allowed and relies on the next condition checking for %4 for correctness. This should be revisited to include 3xi32 as an MVT type (on arsenm's non-immediate todo list).
	Note that checking the sizeInBits for an MVT is undefined (leads to an assertion failure), so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.
	Reviewers: arsenm, jlebar, tstellarAMD
	Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits
	Differential Revision: https://reviews.llvm.org/D23068
	llvm-svn: 277735
*	[mips] Set Personality and LSDA encoding for FreeBSD	Daniel Sanders	2016-08-04	1	-0/+8
	Reviewers: seanbruno, sdardis Subscribers: tberghammer, danalbert, srhines, dsanders, sdardis, llvm-commits, seanbruno Differential Revision: https://reviews.llvm.org/D23113 llvm-svn: 277732
*	[InstCombine] use m_APInt to allow icmp eq (sub C1, X), C2 folds for splat ↵	Sanjay Patel	2016-08-04	1	-4/+4
	constant vectors llvm-svn: 277731
*	[X86][SSE] Split off shuffle mask canonicalization from lowerVectorShuffle. ↵	Simon Pilgrim	2016-08-04	1	-52/+67
	NFCI. The new function now returns true if the shuffle should be commuted. This will allow target shuffle combines to share the code. llvm-svn: 277728
*	[Hexagon] Clear kill flags from modified registers in peephole optimizer	Krzysztof Parzyszek	2016-08-04	1	-1/+4
	llvm-svn: 277727
*	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation	Nikolai Bozhenov	2016-08-04	9	-7/+54
	On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation. The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized. Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR. Differential Revision: https://reviews.llvm.org/D21379 llvm-svn: 277725
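For context, the "RSQRT followed by Newton-Raphson refinement" sequence that the heuristic weighs against hardware SQRT can be sketched numerically. This is the textbook Newton-Raphson step for 1/sqrt(x) written in Python, not the code the X86 backend emits; the function names are hypothetical, and the starting relative error of 2**-12 is only an assumed stand-in for a hardware reciprocal square root estimate.

```python
import math


def refine_rsqrt(x, y):
    """One Newton-Raphson step for y ~ 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y)."""
    return y * (1.5 - 0.5 * x * y * y)


def sqrt_via_rsqrt(x, estimate, steps=1):
    """Approximate sqrt(x) as x * rsqrt(x), refining the estimate `steps` times."""
    y = estimate
    for _ in range(steps):
        y = refine_rsqrt(x, y)
    return x * y


def demo(x=2.0, rel_error=2.0 ** -12):
    """Return (error before refinement, error after one refinement step)."""
    exact = math.sqrt(x)
    coarse = (1.0 / exact) * (1.0 + rel_error)  # assumed coarse estimate
    before = abs(x * coarse - exact)
    after = abs(sqrt_via_rsqrt(x, coarse) - exact)
    return before, after
```

Each refinement step costs several multiplies and a subtract on top of the estimate itself, which is the latency and throughput cost the commit compares against a single hardware SQRT.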
*	[mips][microMIPS] Implement CFC1, CFC2, CTC1 and CTC2 instructions	Hrvoje Varga	2016-08-04	5	-7/+35
	Differential Revision: https://reviews.llvm.org/D22347 llvm-svn: 277719
*	[X86][SSE] Add initial costs for vector CTTZ/CTLZ	Simon Pilgrim	2016-08-04	1	-4/+41
	llvm-svn: 277716
*	[X86][SSE] Don't decide when to scalarize CTTZ/CTLZ for performance at ↵	Simon Pilgrim	2016-08-04	1	-12/+4
	lowering - this is what cost models are for. Improved CTTZ/CTLZ costings will be added shortly. llvm-svn: 277713
*	[mips] Enable tail calls by default	Simon Dardis	2016-08-04	8	-10/+61
	Enable tail calls by default for (micro)MIPS(64). microMIPS is slightly more tricky than doing it for MIPS(R6) or microMIPSR6. microMIPS has two instruction encodings: 16bit and 32bit along with some restrictions on the size of the instruction that can fill the delay slot. For safe tail calls for microMIPS, the delay slot filler attempts to find a correct size instruction for the delay slot of TAILCALL pseudos. Reviewers: dsanders, vkalintris Subscribers: jfb, dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D21138 llvm-svn: 277708
*	Typo fix in comment. NFC	Diana Picus	2016-08-04	1	-1/+1
	llvm-svn: 277704
*	[XRay] Align entry and return sleds to 2 byte boundaries	Dean Michael Berris	2016-08-04	1	-2/+4
	This should ensure that we can atomically write two bytes (on top of the retq and the one past it) and have those two bytes not straddle cache lines. We also move the label past the alignment instruction so that we can refer to the actual first instruction, as opposed to potential padding before the aligned instruction. Update the tests to allow us to reflect the new order of assembly. Reviewers: rSerge, echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23101 llvm-svn: 277701
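The alignment argument above reduces to simple address arithmetic, sketched here in Python. The 64-byte cache line size is an assumption (the commit does not state one), and `straddles` and `aligned_two_byte_writes_are_safe` are hypothetical helper names.

```python
def straddles(addr, size, line=64):
    """True if the bytes [addr, addr + size) fall in more than one cache line."""
    return addr // line != (addr + size - 1) // line


def aligned_two_byte_writes_are_safe(line=64, span=4096):
    """Check every 2-byte-aligned address in [0, span): none may straddle a line."""
    return not any(straddles(addr, 2, line) for addr in range(0, span, 2))
```

An unaligned write shows the failure the alignment avoids: `straddles(63, 2)` is true because bytes 63 and 64 sit in different 64-byte lines, whereas a write at any even address keeps both bytes in one line for any even line size.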
*	Add popcount(n) == bitsize(n) -> n == -1 transformation.	Amaury Sechet	2016-08-04	1	-4/+10
	Summary: As per title. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23139 llvm-svn: 277694
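The equivalence named in the title can be confirmed by brute force for small bit widths. The Python sketch below is a model only; `popcount`, `all_ones`, and `fold_holds` are hypothetical helpers, and -1 is represented as the all-ones bit pattern of the chosen width.

```python
def popcount(n, bits):
    """Number of set bits in the low `bits` bits of n."""
    return bin(n & ((1 << bits) - 1)).count("1")


def all_ones(bits):
    """The bit pattern of -1 at the given width."""
    return (1 << bits) - 1


def fold_holds(bits):
    """True if popcount(n) == bits exactly when n == -1, for every n of that width."""
    return all(
        (popcount(n, bits) == bits) == (n == all_ones(bits))
        for n in range(1 << bits)
    )
```

The check is exhaustive, so it is only practical for narrow widths such as `fold_holds(8)`; the argument itself (all bits set means every bit is one) does not depend on the width.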
*	Forgot the dyn_cast_or_null intended for r277691.	David Majnemer	2016-08-04	1	-1/+1
	llvm-svn: 277693
*	Reinstate "[CloneFunction] Don't remove side effecting calls"	David Majnemer	2016-08-04	2	-4/+37
	This reinstates r277611 + r277614 and reverts r277642. A cast_or_null should have been a dyn_cast_or_null. llvm-svn: 277691
*	Revert "GVN-hoist: enable by default" & "Make GVN Hoisting obey optnone/bisect."	Bruno Cardoso Lopes	2016-08-04	1	-2/+2
	This reverts commits r277685 & r277688. r277685 broke compiler-rt compilation http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/23335 and r277688 is a followup to it. llvm-svn: 277690
*	[PM] Change the name of the repeating utility to something less	Chandler Carruth	2016-08-04	1	-4/+4
	overloaded (and simpler). Sean rightly pointed out in code review that we've started using "wrapper pass" as a specific part of the old pass manager, and in fact it is more applicable there. Here, we really have a pass template to build a repeated pass, so call it that. llvm-svn: 277689
*	GVN-hoist: enable by default	Sebastian Pop	2016-08-04	1	-2/+2
	As we addressed all compilation time problems with GVN-hoist https://llvm.org/bugs/show_bug.cgi?id=28670 this patch turns GVN-hoist back on by default. Differential Revision: https://reviews.llvm.org/D23136 llvm-svn: 277685
*	pdbdump: Fix crash bug.	Rui Ueyama	2016-08-03	1	-1/+4
	pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit without calling DbiStreamBuilder::finalize. Because `finalize` initializes `Header` member, `Header` remained nullptr which caused a crash bug. Differential Revision: https://reviews.llvm.org/D23143 llvm-svn: 277681
*	RenameIndependentSubregs: Fix liveness query in rewriteOperands()	Matthias Braun	2016-08-03	1	-7/+6
	rewriteOperands() always performed liveness queries at the base index rather than the RegSlot/Base as appropriate for the machine operand. This could lead to illegal rewriting in some cases. llvm-svn: 277661
*	[InstCombine] use m_APInt to allow icmp eq (add X, C1), C2 folds for splat ↵	Sanjay Patel	2016-08-03	1	-6/+8
	constant vectors llvm-svn: 277659