bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Reimplement heuristic for estimating complete-unroll optimization effects.	Michael Zolotukhin	2015-05-12	2	-2/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch reimplements the heuristic that tries to estimate optimization benefits from complete loop unrolling. In this patch I kept the minimal changes - e.g. I removed code handling branches and folding compares. That's a promising area, but now there are too many questions to discuss before we can enable it. Test Plan: Tests are included in the patch. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8816 llvm-svn: 237156
*	[Mips] Return false for isFPCloseToIncomingSP()	Petar Jovanovic	2015-05-12	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \|	On Mips, frame pointer points to the same side of the frame as the stack pointer. This function is used to decide where to put register scavenging spill slot. So far, it was put on the wrong side of the frame, and thus it was too far away from $fp when frame was larger than 2^15 bytes. Patch by Vladimir Radosavljevic. http://reviews.llvm.org/D8895 llvm-svn: 237153
*	R600/SI: add pass to mark CF live ranges as non-spillable	Tom Stellard	2015-05-12	1	-0/+501
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Spilling can insert instructions almost anywhere, and this can mess up control flow lowering in a multitude of ways, due to instruction reordering. Let's sort this out the easy way: never spill registers involved with control flow, i.e. saved EXEC masks. Unfortunately, this does not work at all with optimizations disabled, as the register allocator ignores spill weights. This should be addressed in a future commit. The test was reduced from the "stacks" shader of [1]. Some issues trigger the machine verifier while another one is checked manually. [1] http://madebyevan.com/webgl-path-tracing/ v2: only insert pass with optimizations enabled, merge test runs. Patch by: Grigori Goronzy llvm-svn: 237152
*	Changed renaming of local symbols by inserting a dot before the numeric suffix.	Sunil Srivastava	2015-05-12	28	-120/+120
\| \| \| \| \| \| \|	One code change and several test changes to match that. Details in http://reviews.llvm.org/D9481 llvm-svn: 237150
*	[DWARF] Add CIE header fields address_size and segment_size when generating ↵	Keith Walker	2015-05-12	3	-6/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	dwarf-4 The DWARF-4 specification added 2 new fields in the CIE header called address_size and segment_size. Create these 2 new fields when generating dwarf-4 CIE entries, print out the new fields when dumping the CIE and update tests Differential Revision: http://reviews.llvm.org/D9558 llvm-svn: 237145
*	R600/SI: Remove M0Reg register class	Tom Stellard	2015-05-12	1	-1/+1
\| \| \| \| \| \|	It is no longer used. llvm-svn: 237142
*	R600/SI: Remove explicit m0 operand from DS instructions	Tom Stellard	2015-05-12	2	-2/+2
\| \| \| \| \| \| \|	Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237141
*	R600/SI: Make sendmsg test more strict	Tom Stellard	2015-05-12	1	-0/+2
\| \| \| \| \| \|	We want to make sure that the m0 copies are being cse'd. llvm-svn: 237134
*	AVX-512, X86: Added lowering for shift operations for SKX.	Elena Demikhovsky	2015-05-12	1	-0/+29
\| \| \| \| \| \| \| \|	The other changes in LowerShift() are not functional; they just make the code more convenient. So the functional changes are for SKX only. llvm-svn: 237129
*	[ARM] Use AEABI aligned function variants	John Brawn	2015-05-12	1	-31/+82
\| \| \| \| \| \| \| \| \| \| \|	AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 llvm-svn: 237127
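The selection described in this entry can be pictured with a small Python sketch. This is not the code from the patch: the helper name `aeabi_variant` and its parameters are hypothetical, and the sketch only assumes the `__aeabi_*`, `__aeabi_*4` and `__aeabi_*8` naming scheme for the aligned variants, plus the memset-to-memclr conversion for a zero fill value.

```python
def aeabi_variant(func, align, value=None):
    """Pick an AEABI helper name (illustrative only, not LLVM's logic).

    func  -- "memcpy", "memmove" or "memset"
    align -- alignment in bytes known to hold for the pointer arguments
    value -- constant fill value for memset, or None if unknown
    """
    if func not in ("memcpy", "memmove", "memset"):
        raise ValueError("unsupported function: " + func)

    # A memset with a known zero fill value can become memclr.
    base = "memclr" if (func == "memset" and value == 0) else func

    # Prefer the 8-byte variant, then the 4-byte one, else the plain one.
    if align >= 8 and align % 8 == 0:
        suffix = "8"
    elif align >= 4 and align % 4 == 0:
        suffix = "4"
    else:
        suffix = ""
    return "__aeabi_" + base + suffix


print(aeabi_variant("memcpy", 8))
print(aeabi_variant("memset", 4, 0))
```

The real lowering also has to respect the AEABI argument order for the set and clear helpers, which this sketch leaves out.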
*	Reverse ordering of base and derived pointer during safepoint lowering.	Igor Laevsky	2015-05-12	2	-21/+136
\| \| \| \| \| \| \| \|	According to the documentation in StackMap section for the safepoint we should have: "The first Location in each pair describes the base pointer for the object. The second is the derived pointer actually being relocated." But before this change we emitted them in reverse order - derived pointer first, base pointer second. llvm-svn: 237126
*	[mips][FastISel] Handle calls with non legal types i8 and i16.	Vasileios Kalintiris	2015-05-12	1	-0/+184
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel. Based on a patch by Reed Kotler. Test Plan: "Make check" test forthcoming. Test-suite passes at O0/O2 and with mips32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6770 llvm-svn: 237121
*	[mips][FastISel] Simplify callabi.ll by using multiple check prefixes.	Vasileios Kalintiris	2015-05-12	1	-397/+274
\| \| \| \| \| \| \| \| \| \|	Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9635 llvm-svn: 237119
*	[mips][FastISel] Allow computation of addresses from constant expressions.	Vasileios Kalintiris	2015-05-12	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Try to compute addresses when the offset from a memory location is a constant expression. Based on a patch by Reed Kotler. Test Plan: Passes test-suite for -O0/O2 and mips 32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, aemerson, rfuhler Differential Revision: http://reviews.llvm.org/D6767 llvm-svn: 237117
*	AVX-512: asm parser errors check	Elena Demikhovsky	2015-05-12	1	-0/+6
\| \| \| \| \| \| \|	I reverted the error check that was removed in 236416. I put it in a separate file. llvm-svn: 237107
*	AVX-512: select operation for i1 vectors	Elena Demikhovsky	2015-05-12	1	-0/+75
\| \| \| \| \| \| \| \|	like: select i1 %cond, <16 x i1> %a, <16 x i1> %b. I added pseudo-CMOV patterns to resolve the "select". Added tests for KNL and SKX. llvm-svn: 237106
*	[X86] DAGCombine should not assume arbitrary vector types are simple	Michael Kuperstein	2015-05-12	1	-0/+11
\| \| \| \| \| \| \| \| \|	The X86-specific DAGCombine for stores should not assume vector types are always simple. This fixes PR23476. Differential Revision: http://reviews.llvm.org/D9659 llvm-svn: 237097
*	Migrate existing backends that care about software floating point	Eric Christopher	2015-05-12	5	-10/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079
*	[MemCpyOpt] Look at any dependency -not just source- for memset+memcpy.	Ahmed Bougacha	2015-05-11	1	-0/+18
\| \| \| \| \| \| \| \| \| \|	This fixes another miscompile introduced by r235232: when there was a dependency on the memcpy destination other than the memset, we would ignore it, because we only looked at the source dependency. It was a mistake to use SrcDepInfo. Instead, just use DepInfo. llvm-svn: 237066
*	[WinEH] Handle nested landing pads that return directly to the parent function.	Andrew Kaylor	2015-05-11	5	-5/+219
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D9684 llvm-svn: 237063
*	[WinEH] Update exception numbering to give handlers their own base state.	Andrew Kaylor	2015-05-11	4	-3/+300
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D9512 llvm-svn: 237014
*	[RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to ↵	Sanjoy Das	2015-05-11	8	-28/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector of pointers Summary: In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for each relocated pointer, and the gc_relocate has the same type as the pointer. During the creation of the gc_relocate intrinsic, llvm needs to mangle its type. However, llvm does not support mangling of all possible types. RewriteStatepointsForGC will hit an assertion failure when it tries to create a gc_relocate for pointer to vector of pointers because mangling for vector of pointers is not supported. This patch changes the way RewriteStatepointsForGC pass creates gc_relocate. For each relocated pointer, we erase the type of pointers and create a unified gc_relocate of type i8 addrspace(1)*. Then a bitcast is inserted to convert the gc_relocate to the correct type. In this way, gc_relocate does not need to deal with different types of pointers and the unsupported type mangling is no longer a problem. This change would also ease further merge when LLVM erases types of pointers and introduces a unified pointer type. Some minor changes are also introduced to gc_relocate related part in InstCombineCalls, CodeGenPrepare, and Verifier accordingly. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9592 llvm-svn: 237009
*	[X86] Updates to X86 backend for f16 promotion	Pirama Arumuga Nainar	2015-05-11	1	-7/+201
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: r235215 adds support for f16 to be considered as a load/store type and promote f16 operations to f32. This patch has miscellaneous fixes for the X86 backend so all f16 operations are handled: 1. Set loadextaction for f16 vectors to expand. 2. Handle FP_EXTEND in a switch statement when handling v2f32 3. Do not fold (FP_TO_SINT (load f16)) into FP_TO_INT*_IN_MEM or (store (SINT_TO_FP )) to a FILD. Tests included. Reviewers: ab, srhines, delena Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9092 llvm-svn: 237004
*	[Testsuite] Renumber metadata in ScopedNoAliasAA test to match CHECK lines	Adam Nemet	2015-05-11	1	-13/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Now it's much easier to follow what's happening in this test. Also removed some unused metadata entries. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9601 llvm-svn: 236981
*	AVX-512: Changed CC parameter in "cmp" intrinsic	Elena Demikhovsky	2015-05-11	4	-408/+408
\| \| \| \| \| \| \| \|	from i8 to i32 according to the Intel Spec by Igor Breger (igor.breger@intel.com) llvm-svn: 236979
*	[InstCombine/PowerPC] Fix single-precision QPX load/store replacement	Hal Finkel	2015-05-11	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	The QPX single-precision load/store intrinsics have implied truncation/extension from/to the declared value type of <4 x double> to the memory type of <4 x float>. When we can prove the alignment of the pointer argument, and thus replace the intrinsic with a regular load or store, we need to load or store the correct data type (<4 x float>) instead of (<4 x double>). llvm-svn: 236973
*	AVX-512: Added SKX instructions and intrinsics:	Elena Demikhovsky	2015-05-11	4	-30/+1218
\| \| \| \| \| \| \| \|	{add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae By Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236971
*	Make buildbots happy	David Majnemer	2015-05-11	1	-1/+1
\| \| \| \|	llvm-svn: 236970
*	[InstCombine] Canonicalize single element array store	David Majnemer	2015-05-11	1	-0/+20
\| \| \| \| \| \| \| \|	Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9591 llvm-svn: 236969
*	[InstCombine] Canonicalize single element array load	David Majnemer	2015-05-11	1	-0/+25
\| \| \| \| \| \| \| \|	Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9596 llvm-svn: 236968
*	AVX-512: fixed UINT_TO_FP operation for 512-bit types.	Elena Demikhovsky	2015-05-10	1	-0/+17
\| \| \| \|	llvm-svn: 236955
*	AVX-512: fixed a bug in i1 vectors lowering	Elena Demikhovsky	2015-05-10	1	-1/+50
\| \| \| \|	llvm-svn: 236947
*	llvm/test/CodeGen/AArch64/tailcall_misched_graph.ll: s/REQUIRE/REQUIRES/	NAKAMURA Takumi	2015-05-09	1	-1/+1
\| \| \| \|	llvm-svn: 236928
*	Fix MergeConsecutiveStore for non-byte-sized memory accesses.	James Y Knight	2015-05-09	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bug showed up as a compile-time assertion failure: Assertion `NumBits >= MIN_INT_BITS && "bitwidth too small"' failed when building msan tests on x86-64. Prior to r236850, this bug was masked due to a bogus alignment check, which also accidentally rejected non-byte-sized accesses. Afterwards, an invalid ElementSizeBytes == 0 got further into the function, and triggered the assertion failure. It would probably be a good idea to allow it to handle merging stores of unusual widths as well, but for now, to un-break it, I'm just making the minimal fix. Differential Revision: http://reviews.llvm.org/D9626 llvm-svn: 236927
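A minimal Python sketch of how a byte size of zero can arise for a non-byte-sized access, and of the kind of guard that avoids it. The function names are hypothetical and the guard is an assumption about the shape of a minimal fix, not a copy of the patch.

```python
def element_size_bytes(bit_width):
    # Integer division: any width below 8 bits yields 0 bytes.
    return bit_width // 8


def is_mergeable_width(bit_width):
    # Assumed guard: only whole-byte, non-empty widths are candidates
    # for merging; everything else is rejected up front.
    return bit_width > 0 and bit_width % 8 == 0


for width in (1, 17, 32):
    print(width, element_size_bytes(width), is_mergeable_width(width))
```

A store of an i1 value gives a size of 0 here, which is the invalid `ElementSizeBytes == 0` case the message describes.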
*	[Fast-ISel] Don't mark the first use of a remat constant as killed.	Pete Cooper	2015-05-09	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to mark the vreg containing 1000 as killed. Given that we go bottom up in fast-isel, a later use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill. However, rematerialised constant expressions aren't generated bottom up. The local value save area grows downwards. This means that if you remat 2 constant expressions which both use 1000 then the first will kill it, then the second, which is lower in the BB will read a killed register. This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset. Note that this commit only makes kill flag generation conservative. There's nothing else obviously wrong with the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions. However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely. llvm-svn: 236922
*	ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not ↵	Arnold Schwaighofer	2015-05-08	4	-13/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916
*	Switch lowering: cluster adjacent fall-through cases even at -O0	Hans Wennborg	2015-05-08	1	-3/+12
\| \| \| \| \| \| \|	It's cheap to do, and codegen is much faster if cases can be merged into clusters. llvm-svn: 236905
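To illustrate what clustering adjacent cases means, here is a small Python sketch. It is not the SelectionDAG implementation; `cluster_cases` and the `(value, destination)` representation are assumptions made for the example.

```python
def cluster_cases(cases):
    """Merge adjacent case values that branch to the same destination.

    cases -- iterable of (value, destination) pairs with distinct values
    Returns a list of (low, high, destination) ranges.
    """
    clusters = []
    for value, dest in sorted(cases):
        if clusters and clusters[-1][2] == dest and clusters[-1][1] + 1 == value:
            # Extend the previous range by one value.
            low = clusters[-1][0]
            clusters[-1] = (low, value, dest)
        else:
            clusters.append((value, value, dest))
    return clusters


print(cluster_cases([(1, "a"), (2, "a"), (3, "a"), (5, "a"), (6, "b")]))
```

Each resulting range needs a single range check instead of one comparison per case value, which is why merging is worthwhile even without optimizations.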
*	[Fast-ISel] Clear kill flags on registers replaced by updateValueMap.	Pete Cooper	2015-05-08	1	-0/+24
\| \| \| \| \| \| \| \| \| \|	When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register. This is done via updateValueMap. However, it's possible that the source register we are rewriting to also has uses. If those uses are after a kill of the value we are rewriting from then we have uses after a kill and the verifier fails. This code checks for the case where the to register is also used, and if so it clears all kill flags on the from register. This is conservative, but better than always clearing kills on the from register. llvm-svn: 236897
*	[Hexagon] Generate more hardware loops	Brendon Cahoon	2015-05-08	7	-89/+450
\| \| \| \| \| \| \| \| \|	Refactored parts of the hardware loop pass to generate more. Also, added more tests. Differential Revision: http://reviews.llvm.org/D9568 llvm-svn: 236896
*	[BasicAA] Fix zext & sext handling	Sanjoy Das	2015-05-08	1	-0/+180
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There are several unhandled edge cases in BasicAA's GetLinearExpression method. This change fixes outstanding issues, including zext / sext of a constant with the sign bit set, and the refusal to decompose zexts or sexts of wrapping arithmetic. Test Plan: Unit tests added in //q.ext.ll//. Patch by Nick White. Reviewers: hfinkel, sanjoy Reviewed By: hfinkel, sanjoy Subscribers: sanjoy, llvm-commits, hfinkel Differential Revision: http://reviews.llvm.org/D6682 llvm-svn: 236894
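The two edge cases named in this entry can be shown with plain integer arithmetic. The Python helpers below (`zext`, `sext`, `add_wrap`) are hypothetical stand-ins for the IR operations, not BasicAA code; they only show why an extension cannot be pushed blindly through a constant or an addition.

```python
def zext(value, from_bits):
    # Zero extension: keep the low bits, treat them as unsigned.
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits):
    # Sign extension: reinterpret the low bits as two's complement.
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    return value - (1 << from_bits) if value & sign_bit else value


def add_wrap(a, b, bits):
    # Addition in a fixed-width integer type, wrapping on overflow.
    return (a + b) & ((1 << bits) - 1)


# A constant with the sign bit set extends to different values.
print(zext(0x80, 8), sext(0x80, 8))

# zext(x + 1) and zext(x) + 1 disagree when the i8 addition wraps.
x = 255
print(zext(add_wrap(x, 1, 8), 8), zext(x, 8) + 1)
```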
*	[X86] Fast-ISel was incorrectly always killing the source of a truncate.	Pete Cooper	2015-05-08	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \|	A trunc from i32 to i1 on x86_64 generates an instruction such as %vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9 However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one. Otherwise, we are killing the source of the truncate which could be used later in the program. llvm-svn: 236890
*	Extend the statepoint intrinsic to allow statepoints to be marked as ↵	Pat Gavlin	2015-05-08	34	-149/+282
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 llvm-svn: 236888
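The new operand layout quoted above can be read as a sequence of length-prefixed groups. The following Python sketch splits a flat operand list along those lines. The function name, the dictionary keys and the sample operands are made up for the example; only the ordering of the groups is taken from the commit message.

```python
def split_statepoint_operands(ops):
    """Split a flat operand list following the new statepoint shape:
    target, #call args, flags, call args..., #transition args,
    transition args..., #deopt args, deopt args..., gc args...
    """
    target, num_call, flags = ops[0], ops[1], ops[2]
    pos = 3
    call_args = ops[pos:pos + num_call]
    pos += num_call

    num_transition = ops[pos]
    pos += 1
    transition_args = ops[pos:pos + num_transition]
    pos += num_transition

    num_deopt = ops[pos]
    pos += 1
    deopt_args = ops[pos:pos + num_deopt]
    pos += num_deopt

    return {
        "target": target,
        "flags": flags,
        "call_args": call_args,
        "transition_args": transition_args,
        "deopt_args": deopt_args,
        "gc_args": ops[pos:],  # everything left over
    }


parts = split_statepoint_operands(["@f", 2, 0, "a", "b", 1, "t", 0, "g1", "g2"])
print(parts)
```

Because the transition group sits between the call arguments and the deopt arguments, any consumer written against the old shape would misread the transition count as the deopt count.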
*	[NoTTI] reject negative scale in addressing mode	Jingyue Wu	2015-05-08	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I noticed this bug when debugging a WIP on LSR. I wonder whether and how we should add a regression test for this. Test Plan: no tests failed. Reviewers: atrick Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D9536 llvm-svn: 236887
*	Clear kill flags on all used registers when sinking instructions.	Pete Cooper	2015-05-08	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \|	The test here was sinking the AND here to a lower BB: %vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8 TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8 which meant that vreg8 was read after it was killed. This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND. llvm-svn: 236886
*	Remove duplicate cmake target I added in r236792.	Pete Cooper	2015-05-08	1	-1/+0
\| \| \| \| \| \|	Thanks to Daniel Jasper for pointing out the mistake. llvm-svn: 236881
*	[Hexagon] Update AnalyzeBranch, etc target hooks	Brendon Cahoon	2015-05-08	2	-0/+655
\| \| \| \| \| \| \| \| \| \| \| \| \|	Improved the AnalyzeBranch, InsertBranch, and RemoveBranch functions in order to handle more of our branch instructions. This requires changes to analyzeCompare and PredicateInstructions. Specifically, we've added support for new value compare jumps, improved handling of endloop, added more compare instructions, and improved support for predicate instructions. Differential Revision: http://reviews.llvm.org/D9559 llvm-svn: 236876
*	[X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when ↵	Andrea Di Biagio	2015-05-08	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	decoding a PSHUFB mask. The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes where the mask node is a load from constant pool, and the constant pool node is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching it how to also look through X86ISD::WrapperRIP. This helps function combineX86ShufflesRecusively to combine more shuffle sequences containing PSHUFB nodes if we are in RIPRel PIC mode. Before this change, llc (with -relocation-model=pic -march=x86-64) was unable to decode a pshufb where the mask was loaded from a constant pool. For example, the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its operand, so instead of generating a single 'movaps' the backend always generated a sub-optimal 'movdqa + pshufb' sequence. Added test x86-fold-pshufb.ll. llvm-svn: 236863
*	[mips][microMIPSr6] Implement ALUIPC and AUIPC instructions	Jozef Kolek	2015-05-08	2	-0/+6
\| \| \| \| \| \| \| \|	This patch implements ALUIPC and AUIPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8441 llvm-svn: 236858
*	Fix test added in r236850 for OSX builders.	James Y Knight	2015-05-08	1	-1/+1
\| \| \| \| \| \| \|	Need to specify triple so that llvm emits the asm syntax that the test expected. llvm-svn: 236855
*	[mips][microMIPSr6] Implement ADDIUPC and LWPC instructions	Jozef Kolek	2015-05-08	2	-0/+6
\| \| \| \| \| \| \| \|	This patch implements ADDIUPC and LWPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8415 llvm-svn: 236852