bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86, AVX2] Replace inserti128 and extracti128 intrinsics with generic shuffles	Sanjay Patel	2015-03-12	2	-18/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This should complete the job started in r231794 and continued in r232045: We want to replace as much custom x86 shuffling via intrinsics as possible because pushing the code down the generic shuffle optimization path allows for better codegen and less complexity in LLVM. AVX2 introduced proper integer variants of the hacked integer insert/extract C intrinsics that were created for this same functionality with AVX1. This should complete the removal of insert/extract128 intrinsics. The Clang precursor patch for this change was checked in at r232109. llvm-svn: 232120
*	Removed useless palignr test - we don't actually provide a ↵	Simon Pilgrim	2015-03-12	1	-28/+0
\| \| \| \| \| \| \| \|	llvm.x86.ssse3.palign.r.128 intrinsic Differential Revision: http://reviews.llvm.org/D8302 llvm-svn: 232108
*	R600/SI: Remove _e32 and _e64 suffixes from mnemonics	Tom Stellard	2015-03-12	3	-4/+4
\| \| \| \| \| \| \| \|	Instead print them as part of the $dst operand. The AsmMatcher requires the 32-bit and 64-bit encodings have the same mnemonic in order to parse them correctly. llvm-svn: 232105
*	Adding WinEHPrepare tests (currently XFAILs)	Andrew Kaylor	2015-03-12	2	-0/+472
\| \| \| \|	llvm-svn: 232104
*	Unxfail passing test on Hexagon	Krzysztof Parzyszek	2015-03-12	1	-1/+0
\| \| \| \| \| \|	test/CodeGen/Generic/2008-02-20-MatchingMem.ll llvm-svn: 232098
*	[X86] Fix a regression introduced by r223641.	Quentin Colombet	2015-03-12	2	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	The permps and permd instructions have their operands swapped compared to the intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP category. I did not create a new category for those two, as they are the only ones AFAICT in that case. <rdar://problem/20108262> llvm-svn: 232085
*	Remove unused complex patterns for addressing modes on Hexagon.	Krzysztof Parzyszek	2015-03-12	1	-0/+2
\| \| \| \|	llvm-svn: 232057
*	[X86] Fix wrong target specific combine on SETCC nodes.	Andrea Di Biagio	2015-03-12	1	-0/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Part of the folding logic implemented by function 'PerformISDSETCCCombine' only worked under the assumption that the condition code in input could have been either SETNE or SETEQ. Unfortunately that assumption was incorrect, and in some cases the algorithm ended up incorrectly folding SETCC nodes. The incorrect folding only affected SETCC dag nodes where: - one of the operands was a build_vector of all zeroes; - the other operand was a SIGN_EXTEND from a vector of MVT:i1 elements; - the condition code was neither SETNE nor SETEQ. Example: (setcc (v4i32 (sign_extend v4i1:%A)), (v4i32 VectorOfAllZeroes), setge) Before this patch, the entire dag node sequence from the example was incorrectly folded to node %A. With this patch, the dag node sequence is folded to a (xor %A, (v4i1 VectorOfAllOnes)). Added test setcc-combine.ll. Thanks to Greg Bedwell for spotting this issue. llvm-svn: 232046
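The fold described above can be checked by brute force. The following is a minimal Python sketch, not LLVM code: it models v4i1 vectors as lists of 0/1, and the helper names (sign_extend_i1, setge_zero, xor_all_ones, check_all_v4i1) are hypothetical. It confirms that, for setge against all zeroes, the result equals (xor %A, VectorOfAllOnes) rather than %A.

```python
from itertools import product

def sign_extend_i1(bit):
    # An i1 "true" sign-extends to all ones (-1); "false" extends to 0.
    return -1 if bit else 0

def setge_zero(a):
    # Lane-wise model of (setcc (sign_extend A), VectorOfAllZeroes, setge),
    # with the result expressed as i1 lanes.
    return [1 if sign_extend_i1(x) >= 0 else 0 for x in a]

def xor_all_ones(a):
    # Lane-wise model of (xor A, VectorOfAllOnes) on i1 lanes.
    return [x ^ 1 for x in a]

def check_all_v4i1():
    # Exhaustively compare both forms over every possible v4i1 input.
    return all(setge_zero(list(a)) == xor_all_ones(list(a))
               for a in product([0, 1], repeat=4))
```

For the input [1, 0, 1, 0], setge_zero returns [0, 1, 0, 1], which is the complement of the input, so folding the sequence to %A itself would be wrong.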
*	[X86, AVX] replace vextractf128 intrinsics with generic shuffles	Sanjay Patel	2015-03-12	2	-24/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we've replaced the vinsertf128 intrinsics, do the same for their extract twins. This is very much like D8086 (checked in at r231794): We want to replace as much custom x86 shuffling via intrinsics as possible because pushing the code down the generic shuffle optimization path allows for better codegen and less complexity in LLVM. This is also the LLVM sibling to the cfe D8275 patch. Differential Revision: http://reviews.llvm.org/D8276 llvm-svn: 232045
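As an illustration of why a generic shuffle can stand in for a 128-bit extract, here is a small Python model of a shufflevector-style operation. It is a sketch under assumptions: vectors are plain lists, UNDEF is modelled as None, and the helper names are hypothetical. The exact masks emitted by the real patch are not taken from the commit message.

```python
UNDEF = None  # stand-in for an undefined lane

def shufflevector(v1, v2, mask):
    # Select lanes from the concatenation of v1 and v2 by index;
    # an UNDEF mask entry yields an UNDEF lane.
    assert len(v1) == len(v2)
    joined = list(v1) + list(v2)
    return [UNDEF if i is UNDEF else joined[i] for i in mask]

def extract_low_128(v8):
    # Lower half of an 8-lane (256-bit) vector of 32-bit elements.
    return shufflevector(v8, [UNDEF] * len(v8), [0, 1, 2, 3])

def extract_high_128(v8):
    # Upper half of the same vector: the lanes a 128-bit extract
    # with index 1 would return.
    return shufflevector(v8, [UNDEF] * len(v8), [4, 5, 6, 7])
```

Because the extract is now an ordinary lane selection, it can in principle be combined with neighbouring shuffles by generic optimizations instead of being treated as an opaque target intrinsic.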
*	[X86][AVX2] Added missing palignr stack folding test	Simon Pilgrim	2015-03-12	1	-2/+7
\| \| \| \|	llvm-svn: 232033
*	AVX-512: Added encoding tests for VPROR, VPROL instructions,	Elena Demikhovsky	2015-03-12	1	-0/+928
\| \| \| \| \| \|	fixed opcode. llvm-svn: 232018
*	Reapply 'Run LICM pass after loop unrolling pass.'	Kevin Qin	2015-03-12	1	-0/+43
\| \| \| \| \| \| \| \| \|	It was first committed at r231630 and reverted at r231635. The function pass InstructionSimplifier is inserted as a barrier to make sure the loop unroll pass won't affect the LICM pass. llvm-svn: 232011
*	[NVPTXAsmPrinter] do not print .align on function headers	Jingyue Wu	2015-03-12	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: PTX does not allow .align directives on function headers. Fixes PR21551. Test Plan: test/Codegen/NVPTX/function-align.ll Reviewers: eliben, jholewinski Reviewed By: eliben, jholewinski Subscribers: llvm-commits, eliben, jpienaar, jholewinski Differential Revision: http://reviews.llvm.org/D8274 llvm-svn: 232004
*	Remove some CHECK-NOT lines in favor of CHECK-NEXT	Reid Kleckner	2015-03-12	7	-61/+32
\| \| \| \| \| \|	NFC, this is just shorter. llvm-svn: 232000
*	Stop calling DwarfEHPrepare from WinEHPrepare	Reid Kleckner	2015-03-12	1	-1/+1
\| \| \| \| \| \| \| \|	Instead, run both EH preparation passes, and have them both ignore functions with unrecognized EH personalities. Pass delegation involved some hacky code for creating an AnalysisResolver that we don't need now. llvm-svn: 231995
*	Handle big index in getelementptr instruction	Reid Kleckner	2015-03-11	1	-0/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CodeGen incorrectly ignores (assert from APInt) constant index bigger than 2^64 in getelementptr instruction. This is a test and fix for that. Patch by Paweł Bylica! Reviewed By: rnk Subscribers: majnemer, rnk, mcrosier, resistor, llvm-commits Differential Revision: http://reviews.llvm.org/D8219 llvm-svn: 231984
*	Extended support for native Windows C++ EH outlining	Andrew Kaylor	2015-03-11	7	-170/+611
\| \| \| \| \| \|	Differential Review: http://reviews.llvm.org/D7886 llvm-svn: 231981
*	Add the option, -info-plist to llvm-objdump used with -macho to print the	Kevin Enderby	2015-03-11	1	-0/+7
\| \| \| \| \| \|	Mach-O info plist section as strings. llvm-svn: 231974
*	[mips][microMIPS] Make usage of NOT16 by code generator	Jozef Kolek	2015-03-11	1	-0/+26
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D7748 llvm-svn: 231963
*	add CHECK-LABELs for better reliability	Sanjay Patel	2015-03-11	1	-16/+24
\| \| \| \|	llvm-svn: 231962
*	Put jump tables in unique sections on COFF.	Rafael Espindola	2015-03-11	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \|	If a function is going in a unique section (because of -ffunction-sections for example), putting a jump table in .rodata will keep .rodata alive and that will keep alive any other function that also has a jump table. Instead, put the jump table in a unique section that is associated with the function. llvm-svn: 231961
*	ARM: simplify and extend byval handling	Tim Northover	2015-03-11	11	-55/+142
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main issue being fixed here is that APCS targets handling a "byval align N" parameter with N > 4 were miscounting what objects were where on the stack, leading to FrameLowering setting the frame pointer incorrectly and clobbering the stack. But byval handling had grown over many years, and had multiple layers of cruft trying to compensate for each other and calculate padding correctly. This only really needs to be done once, in the HandleByVal function. Elsewhere should just do what it's told by that call. I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits byvals with the correct C ABI alignment), which simplified HandleByVal. rdar://20095672 llvm-svn: 231959
*	[dsymutil] Correctly clone address attributes.	Frederic Riss	2015-03-11	2	-0/+12
\| \| \| \| \| \| \|	DW_AT_low_pc on functions is taken care of by the relocation processing, but DW_AT_high_pc and DW_AT_low_pc on other lexical scopes need special handling. llvm-svn: 231955
*	InstCombine: Don't fold call bitcast into args if callee is byval	David Majnemer	2015-03-11	1	-0/+15
\| \| \| \| \| \| \|	This fixes a bug reported here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150309/265341.html llvm-svn: 231948
*	Add the "vbroadcasti128" instruction back.	Juergen Ributzka	2015-03-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	This is a follow-up to r231182. This adds the "vbroadcasti128" instruction back, but without the intrinsic mapping. Also add a test to check the instruction encoding. This is related to rdar://problem/18742778. llvm-svn: 231945
*	Make NaCl's use of .init_array for static constructors match Linux	Derek Schuff	2015-03-11	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The generic ELF TargetObjectFile defaults to .ctors, but Linux's defaults to .init_array by calling InitializeELF with the value of UseInitArray from TargetMachine. Make NaCl's behavior match. Reviewers: jvoung Differential Revision: http://reviews.llvm.org/D8240 llvm-svn: 231934
*	Inliner should not add callgraph edges for intrinsic calls (PR22857)	Sanjay Patel	2015-03-11	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \|	The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics. This patch makes sure that the Inliner does not add such an edge to the callgraph. Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857 Differential Revision: http://reviews.llvm.org/D8231 llvm-svn: 231927
*	Prefer pipes over temporary files in a feeble attempt to stabilize this test ↵	Benjamin Kramer	2015-03-11	1	-3/+2
\| \| \| \| \| \|	on windows. llvm-svn: 231923
*	Relax CHECK to match mips syntax.	Rafael Espindola	2015-03-11	2	-2/+59
\| \| \| \|	llvm-svn: 231919
*	AVX-512: Added SKX forms of shift instructions.	Elena Demikhovsky	2015-03-11	3	-0/+2124
\| \| \| \| \| \| \|	Added rotation instructions, encoding only. Added encoding tests for all these forms. llvm-svn: 231916
*	Now that r231902's test is executed, make it actually pass	Justin Bogner	2015-03-11	2	-2/+2
\| \| \| \| \| \| \| \|	As of r231908, the test I added in r231902 actually gets run - but I'd checked in a stale version of the input so it didn't pass. Fix the input and un-xfail the test. llvm-svn: 231911
*	Fix another verifier crash where a GC intrinsic would look at the internals ↵	Owen Anderson	2015-03-11	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \|	of another intrinsic in order to verify itself. This causes a crash if the referenced intrinsic was malformed. In this case, we would already have reported an error on the referenced intrinsic, but then crashed on the second one when it tried to introspect the first without error checking. llvm-svn: 231910
*	Make test added in r231902 actually be executed.	Daniel Jasper	2015-03-11	2	-3/+3
\| \| \| \| \| \| \| \| \|	There were also errors in the CHECK line which I fixed and the test doesn't actually pass as the "100" is in the wrong line. Not sure whether this is a test failure or a coverage failure so making the test XFAIL for now. llvm-svn: 231908
*	Don't print labels that on ELF are never used.	Rafael Espindola	2015-03-11	2	-2/+0
\| \| \| \|	llvm-svn: 231904
*	InstrProf: Teach llvm-cov to handle universal binaries when given -arch	Justin Bogner	2015-03-11	3	-0/+17
\| \| \| \|	llvm-svn: 231902
*	Relax label CHECK to match COFF syntax.	Rafael Espindola	2015-03-11	2	-1/+58
\| \| \| \| \| \| \| \|	Should fix the cygwin bots. I added a cygwin specific test that would have caught this on Linux. llvm-svn: 231899
*	Print section start labels when first switching to the section.	Rafael Espindola	2015-03-11	9	-2/+40
\| \| \| \| \| \| \|	This is less brittle and avoids polluting the start of the file with every debug section. llvm-svn: 231898
*	Split test in two to handle building without x86.	Rafael Espindola	2015-03-10	2	-3/+26
\| \| \| \|	llvm-svn: 231886
*	Add missing section symbol to COFF's .debug_types.dwo.	Rafael Espindola	2015-03-10	1	-0/+3
\| \| \| \| \| \| \| \| \|	Should bring the cygwin bots back. I added a triple to the test that was failing so that it would have failed on Linux. llvm-svn: 231882
*	If a conditional branch jumps to the same target, remove the condition	Philip Reames	2015-03-10	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \|	Given that large parts of inst combine are restricted to instructions which have one use, getting rid of a use on the condition can help the effectiveness of the optimizer. Also, it allows the condition to potentially be deleted by instcombine rather than waiting for another pass. I noticed this completely by accident in another test case. It's not anything that actually came from a real workload. p.s. We should probably do the same thing for switch instructions. Differential Revision: http://reviews.llvm.org/D8220 llvm-svn: 231881
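The idea can be sketched on a toy model of a conditional branch. This Python snippet is an illustration only: the CondBranch type and drop_irrelevant_condition function are hypothetical, and replacing the condition with a constant is an assumption about one way to remove the use, not a description of the actual patch.

```python
from dataclasses import dataclass

@dataclass
class CondBranch:
    condition: str     # name of the i1 value, or a constant "true"/"false"
    true_target: str
    false_target: str

def drop_irrelevant_condition(br):
    # If both edges lead to the same block, the condition cannot affect
    # control flow, so swap it for a constant. The original value then
    # loses one use and may become single-use or dead.
    if br.true_target == br.false_target and br.condition not in ("true", "false"):
        return CondBranch("false", br.true_target, br.false_target)
    return br
```

A branch such as CondBranch("%c", "bb", "bb") no longer references %c after the rewrite, while a branch with two distinct targets is returned unchanged.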
*	Emit correct linkage-name attribute based on DWARF version.	Paul Robinson	2015-03-10	13	-19/+19
\| \| \| \| \| \| \| \| \| \|	There are still 4 tests that check for DW_AT_MIPS_linkage_name, because they specify DWARF 2 or 3 in the module metadata. So, I didn't create an explicit version-based test for the attribute. Differential Revision: http://reviews.llvm.org/D8227 llvm-svn: 231880
*	Infer known bits from dominating conditions	Philip Reames	2015-03-10	1	-0/+152
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds limited support in ValueTracking for inferring known bits of a value from conditional expressions which must be true to reach the instruction we're trying to optimize. At this time, the feature is off by default. Once landed, I'm hoping for feedback from others on both profitability and compile time impact. Forms of conditional value propagation have been tried in LLVM before and have failed due to compile time problems. In an attempt to side step that, this patch only considers conditions where the edge leaving the branch dominates the context instruction. It does not attempt full dataflow. Even with that restriction, it handles many interesting cases: * Early exits from functions * Early exits from loops (for context instructions in the loop and after the check) * Conditions which control entry into loops, including multi-version loops (such as those produced during vectorization, IRCE, loop unswitch, etc..) Possible applications include optimizing using information provided by constructs such as: preconditions, assumptions, null checks, & range checks. This patch implements two approaches to the problem that need further benchmarking. Approach 1 is to directly walk the dominator tree looking for interesting conditions. Approach 2 is to inspect other uses of the value being queried for interesting comparisons. From initial benchmarking, it appears that Approach 2 is faster than Approach 1, but this needs to be further validated. Differential Revision: http://reviews.llvm.org/D7708 llvm-svn: 231879
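To make "inferring known bits from a dominating condition" concrete, here is a minimal Python sketch on 8-bit values. Everything in it is an assumption for illustration: the KnownBits record, the two condition shapes ((x & mask) == 0 and x == C), and the function names are hypothetical, and the commit message does not say which condition forms the patch actually recognizes.

```python
from dataclasses import dataclass

WIDTH = 8
ALL = (1 << WIDTH) - 1

@dataclass(frozen=True)
class KnownBits:
    zero: int = 0  # bits known to be 0
    one: int = 0   # bits known to be 1

def known_from_masked_eq_zero(known, mask):
    # On the edge where "(x & mask) == 0" held, every bit in mask is 0.
    return KnownBits(zero=known.zero | (mask & ALL), one=known.one)

def known_from_eq_const(c):
    # On the edge where "x == c" held, every bit of x is known.
    c &= ALL
    return KnownBits(zero=~c & ALL, one=c)

def consistent(known, x):
    # True if the concrete value x agrees with the known-bits record.
    return (x & known.zero) == 0 and (x & known.one) == known.one
```

If the check (x & 3) == 0 guards the only path to an instruction, then at that instruction the two low bits of x can be treated as zero without any further dataflow analysis.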
*	[CodeGenPrepare] Refine the cost model provided by the promotion helper.	Quentin Colombet	2015-03-10	2	-3/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Use TargetLowering to check for the actual cost of each extension. - Provide a factorized method to check for the cost of an extension: TargetLowering::isExtFree. - Provide a virtual method TargetLowering::isExtFreeImpl for targets to be able to tune the cost of non-free extensions. This refactoring offers a better granularity to model what really happens on different targets. No performance changes and very few code differences. Part of <rdar://problem/19267165> llvm-svn: 231855
*	Add support for part-word atomics for PPC	Nemanja Ivanovic	2015-03-10	3	-9/+99
\| \| \| \| \| \|	http://reviews.llvm.org/D8090#inline-67337 llvm-svn: 231843
*	[AArch64] Avoid going through GPRs for across-vector instructions.	Ahmed Bougacha	2015-03-10	5	-5/+450
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds new node types for each intrinsic. For instance, for addv, we have AArch64ISD::UADDV, such that: (v4i32 (uaddv ...)) is the same as (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...)))) that is, (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), (i32 (int_aarch64_neon_uaddv ...)), ssub) In a combine, we transform all such across-vector-lanes intrinsics to: (i32 (extract_vector_elt (uaddv ...), 0)) This has one big advantage: by making the extract_element explicit, we enable the existing patterns for lane-aware instructions to fire. This lets us avoid needlessly going through the GPRs. Consider: uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) { return vmulq_n_u32(a, vaddvq_u32(b)); } We now generate: addv.4s s1, v1 mul.4s v0, v0, v1[0] instead of the previous: addv.4s s1, v1 fmov w8, s1 dup.4s v1, w8 mul.4s v0, v1, v0 rdar://20044838 llvm-svn: 231840
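The two instruction sequences quoted above compute the same result. The following Python sketch models both on lists of four unsigned 32-bit lanes; the function names are hypothetical, and zero-filling the upper lanes after addv is a modelling simplification rather than something stated in the commit message.

```python
MASK32 = 0xFFFFFFFF

def addv(v):
    # Across-lanes add with 32-bit wraparound.
    return sum(v) & MASK32

def old_sequence(a, b):
    s1 = addv(b)        # addv.4s s1, v1
    w8 = s1             # fmov    w8, s1    (round trip through a GPR)
    v1 = [w8] * 4       # dup.4s  v1, w8
    return [(x * y) & MASK32 for x, y in zip(v1, a)]   # mul.4s v0, v1, v0

def new_sequence(a, b):
    v1 = [addv(b), 0, 0, 0]   # addv.4s s1, v1   (result lives in lane 0)
    return [(x * v1[0]) & MASK32 for x in a]           # mul.4s v0, v0, v1[0]
```

With a = [1, 2, 3, 4] and b = [10, 20, 30, 40], both functions return [100, 200, 300, 400]; the new form simply skips the fmov and dup steps.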
*	[AsmPrinter][TLOF] Reintroduce AArch64 test	Bruno Cardoso Lopes	2015-03-10	1	-0/+92
\| \| \| \| \| \| \| \| \| \| \| \|	Follow up from r231505. Fix the non-determinism by using a MapVector and reintroduce the AArch64 testcase. Defer deleting the got candidates up to the end and remove them in a bulk, avoiding linear time removal of each element. Thanks to Renato Golin for trying it out on other platforms. llvm-svn: 231830
*	Change the generation of the vmuluwm instruction to be based on the MUL opcode.	Kit Barton	2015-03-10	2	-5/+6
\| \| \| \| \| \|	Phabricator review: http://reviews.llvm.org/D8185 llvm-svn: 231827
*	[LoopAccesses 3/3] Print the dependences with -analyze	Adam Nemet	2015-03-10	2	-66/+4
\| \| \| \| \| \| \| \| \| \|	The dependences are now exposed through the new getInterestingDependences API so we can use that with -analyze too and fix the FIXME. This lets us remove the test that relied on -debug to check the dependences. llvm-svn: 231807
*	Teach lowering to correctly handle invoke statepoint and gc results tied to ↵	Igor Laevsky	2015-03-10	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \|	them. Note that we still cannot lower gc.relocates for invoke statepoints. Also, it extracts the getCopyFromRegs helper function in SelectionDAGBuilder, as we need to be able to customize the type of the register exported from the basic block during lowering of the gc.result. (Resubmitting this change after not being able to reproduce the buildbot failure) Differential Revision: http://reviews.llvm.org/D7760 llvm-svn: 231800
*	[X86, AVX] replace vinsertf128 intrinsics with generic shuffles	Sanjay Patel	2015-03-10	4	-107/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want to replace as much custom x86 shuffling via intrinsics as possible because pushing the code down the generic shuffle optimization path allows for better codegen and less complexity in LLVM. This is the sibling patch for the Clang half of this change: http://reviews.llvm.org/D8088 Differential Revision: http://reviews.llvm.org/D8086 llvm-svn: 231794