summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* R600/SI: add pass to mark CF live ranges as non-spillableTom Stellard2015-05-124-0/+110
| | | | | | | | | | | | | | | | | | | | | | Spilling can insert instructions almost anywhere, and this can mess up control flow lowering in a multitude of ways, due to instruction reordering. Let's sort this out the easy way: never spill registers involved with control flow, i.e. saved EXEC masks. Unfortunately, this does not work at all with optimizations disabled, as the register allocator ignores spill weights. This should be addressed in a future commit. The test was reduced from the "stacks" shader of [1]. Some issues trigger the machine verifier while another one is checked manually. [1] http://madebyevan.com/webgl-path-tracing/ v2: only insert pass with optimizations enabled, merge test runs. Patch by: Grigori Goronzy llvm-svn: 237152
* use 'auto' to improve readability; NFCSanjay Patel2015-05-121-2/+1
| | | | llvm-svn: 237144
* R600/SI: Update tablegen defs to avoid restoring spilled sgprs to m0Tom Stellard2015-05-122-9/+4
| | | | | | | | We had code to do this in SIRegisterInfo::eliminateFrameIndex(), but it is easier to just change the definition of SI_SPILL_S32_RESTORE to only allow numbered sgprs. llvm-svn: 237143
* R600/SI: Remove M0Reg register classTom Stellard2015-05-123-4/+1
| | | | | | It is no longer used. llvm-svn: 237142
* R600/SI: Remove explicit m0 operand from DS instructionsTom Stellard2015-05-126-118/+259
| | | | | | | Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237141
* R600/SI: Remove explicit m0 operand from v_interp instructionsTom Stellard2015-05-126-33/+59
| | | | | | | Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237140
* R600/SI: Remove explicit m0 operand from s_sendmsgTom Stellard2015-05-126-8/+36
| | | | | | | | | | | | | | | Instead add m0 as an implicit operand. This allows us to avoid using the M0Reg register class and eliminates a number of unnecessary spills when using s_sendmsg instructions. This impacts one shader in the shader-db: SGPRS: 48 -> 40 (-16.67 %) VGPRS: 112 -> 108 (-3.57 %) Code Size: 40132 -> 38796 (-3.33 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 2048 -> 0 (-100.00 %) bytes per wave llvm-svn: 237133
* R600/SI: Replace TRI->getRegClass(Reg) with TRI->getPhysRegClass(Reg)Tom Stellard2015-05-123-7/+11
| | | | | | | TRI->getRegClass() takes a register class ID, not a register. We were using this incorrectly in a few places. llvm-svn: 237132
* AVX-512, X86: Added lowering for shift operations for SKX.Elena Demikhovsky2015-05-121-101/+65
| | | | | | | | The other changes in the LowerShift() are not functional, just to make the code more convenient. So, the functional changes for SKX only. llvm-svn: 237129
* [ARM] Use AEABI aligned function variantsJohn Brawn2015-05-122-51/+141
| | | | | | | | | | | AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 llvm-svn: 237127
* [X86] Remove useless target specific combine on TRUNCATE dag nodes.Andrea Di Biagio2015-05-121-11/+0
| | | | | | | | | | | | | Before revision 171146, function 'PerformTruncateCombine' used to perform a premature lowering of TRUNCATE dag nodes. Revision 171146 then moved all the logic implemented by PerformTruncateCombine to a custom lowering hook. However, that revision forgot to delete function PerformTruncateCombine from the code. This patch removes function 'PerformTruncateCombine' since it has no effect on the SelectionDAG. No functional change intended. llvm-svn: 237122
* [mips][FastISel] Handle calls with non legal types i8 and i16.Vasileios Kalintiris2015-05-121-1/+3
| | | | | | | | | | | | | | | | | | Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel. Based on a patch by Reed Kotler. Test Plan: "Make check" test forthcoming. Test-suite passes at O0/O2 and with mips32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6770 llvm-svn: 237121
* [mips][FastISel] Allow computation of addresses from constant expressions.Vasileios Kalintiris2015-05-121-2/+4
| | | | | | | | | | | | | | | | | | | Summary: Try to compute addresses when the offset from a memory location is a constant expression. Based on a patch by Reed Kotler. Test Plan: Passes test-suite for -O0/O2 and mips 32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, aemerson, rfuhler Differential Revision: http://reviews.llvm.org/D6767 llvm-svn: 237117
* Change TargetParser enum names to avoid macro conflicts (llvm)Renato Golin2015-05-123-68/+68
| | | | | | | | | sys/time.h on Solaris (and possibly other systems) defines "SEC" as "1" using a cpp macro. The result is that this fails to compile. Fixes https://llvm.org/PR23482 llvm-svn: 237112
* AVX-512: select operation for i1 vectorsElena Demikhovsky2015-05-122-0/+25
| | | | | | | | like: select i1 %cond, <16 x i1> %a, <16 x i1> %b. I added pseudo-CMOV patterns to resolve the "select". Added tests for KNL and SKX. llvm-svn: 237106
* [X86] DAGCombine should not assume arbitrary vector types are simpleMichael Kuperstein2015-05-121-1/+1
| | | | | | | | | The X86-specific DAGCombine for stores should not assume vector types are always simple. This fixes PR23476. Differential Revision: http://reviews.llvm.org/D9659 llvm-svn: 237097
* Migrate existing backends that care about software floating pointEric Christopher2015-05-1217-47/+80
| | | | | | | | | | | | | | | | | | | | to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079
* Readdress r236990, use of static members on a non-static variable.David Blaikie2015-05-111-9/+6
| | | | | | | | | | | | The TargetRegistry is just a namespace-like class, instantiated in one place to use a range-based for loop. Instead, expose access to the registry via a range-based 'targets()' function instead. This makes most uses a bit awkward/more verbose - but eventually we should just add a range-based find_if function which will streamline these functions. I'm happy to mkae them a bit awkward in the interim as encouragement to improve the algorithms in time. llvm-svn: 237059
* [X86] Updates to X86 backend for f16 promotionPirama Arumuga Nainar2015-05-111-0/+22
| | | | | | | | | | | | | | | | | | | | | | | Summary: r235215 adds support for f16 to be considered as a load/store type and promote f16 operations to f32. This patch has miscellaneous fixes for the X86 backend so all f16 operations are handled: 1. Set loadextaction for f16 vectors to expand. 2. Handle FP_EXTEND in a switch statement when handling v2f32 3. Do not fold (FP_TO_SINT (load f16)) into FP_TO_INT*_IN_MEM or (store (SINT_TO_FP )) to a FILD. Tests included. Reviewers: ab, srhines, delena Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9092 llvm-svn: 237004
* Silencing an MSVC warning: '<<' : result of 32-bit shift implicitly ↵Aaron Ballman2015-05-111-1/+1
| | | | | | converted to 64 bits (was 64-bit shift intended?); NFC llvm-svn: 236987
* Fixed compilation warning, NFC.Elena Demikhovsky2015-05-111-0/+2
| | | | llvm-svn: 236972
* AVX-512: Added SKX instructions and intrinsics:Elena Demikhovsky2015-05-114-70/+101
| | | | | | | | {add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae By Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236971
* AVX-512: fixed UINT_TO_FP operation for 512-bit types.Elena Demikhovsky2015-05-101-0/+7
| | | | llvm-svn: 236955
* AVX-512: fixed a bug in i1 vectors loweringElena Demikhovsky2015-05-102-2/+12
| | | | llvm-svn: 236947
* SystemZ: silence a GCC warningSaleem Abdulrasool2015-05-101-2/+2
| | | | | | | | warning: enumeral and non-enumeral type in conditional expression Cast the 0 to the appropriate type. NFC. Identified by GCC 4.9.2 llvm-svn: 236942
* MachineCSE: Add a target query for the LookAheadLimit heurisiticTom Stellard2015-05-091-0/+2
| | | | | | | | | This is used to determine whether or not to CSE physical register defs. Differential Revision: http://reviews.llvm.org/D9472 llvm-svn: 236923
* Fix compile errorArnold Schwaighofer2015-05-091-1/+1
| | | | llvm-svn: 236921
* [Target/ARM] Remove unused 'private' from class.Davide Italiano2015-05-081-2/+0
| | | | | | | Differential Revision: http://reviews.llvm.org/D9611 Reviewed by: rengolin llvm-svn: 236918
* ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not ↵Arnold Schwaighofer2015-05-085-3/+11
| | | | | | | | | | | | | | | | | | | | non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916
* TargetParser: FPU/ARCH/EXT parsing refactory - NFCRenato Golin2015-05-089-293/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new class in a global context contain arch-specific knowledge in order to provide LLVM libraries, tools and projects with the ability to understand the architectures. For now, only FPU, ARCH and ARCH extensions on ARM are supported. Current behaviour it to parse from free-text to enum values and back, so that all users can share the same parser and codes. This simplifies a lot both the ASM/Obj streamers in the back-end (where this came from), and the front-end parsers for command line arguments (where this is going to be used next). The previous implementation, using .def/.h includes is deprecated due to its inflexibility to be built without the backend support and for being too cumbersome. As more architectures join this scheme, and as more features of such architectures are added (such as hardware features, type sizes, etc) into a full blown TargetDescription class, having a set of classes is the most sane implementation. The ultimate goal of this refactor both LLVM's and Clang's target description classes into one unique interface, so that we can de-duplicate and standardise the descriptions, as well as make it available for other front-ends, tools, etc. The FPU parsing for command line options in Clang has been converted to use this new library and a number of aliases were added for compatibility: * A bogus neon-vfpv3 alias (neon defaults to vfp3) * armv5/v6 * {fp4/fp5}-{sp/dp}-d16 Next steps: * Port Clang's ARCH/EXT parsing to use this library. * Create a TableGen back-end to generate this information. * Run this TableGen process regardless of which back-ends are built. * Expose more information and rename it to TargetDescription. * Continue re-factoring Clang to use as much of it as possible. llvm-svn: 236900
* [Hexagon] Generate more hardware loopsBrendon Cahoon2015-05-081-133/+206
| | | | | | | | | Refactored parts of the hardware loop pass to generate more. Also, added more tests. Differential Revision: http://reviews.llvm.org/D9568 llvm-svn: 236896
* [X86] Fast-ISel was incorrectly always killing the source of a truncate.Pete Cooper2015-05-081-1/+3
| | | | | | | | | | | A trunc from i32 to i1 on x86_64 generates an instruction such as %vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9 However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one. Otherwise, we are killing the source of the truncate which could be used later in the program. llvm-svn: 236890
* Extend the statepoint intrinsic to allow statepoints to be marked as ↵Pat Gavlin2015-05-082-0/+51
| | | | | | | | | | | | | | | | | | | | | | transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 llvm-svn: 236888
* [Hexagon] Update AnalyzeBranch, etc target hooksBrendon Cahoon2015-05-083-274/+350
| | | | | | | | | | | | | Improved the AnalyzeBranch, InsertBranch, and RemoveBranch functions in order to handle more of our branch instructions. This requires changes to analyzeCompare and PredicateInstructions. Specifically, we've added support for new value compare jumps, improved handling of endloop, added more compare instructions, and improved support for predicate instructions. Differential Revision: http://reviews.llvm.org/D9559 llvm-svn: 236876
* [X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when ↵Andrea Di Biagio2015-05-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | decoding a PSHUFB mask. The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes where the mask node is a load from constant pool, and the constant pool node is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching it how to also look through X86ISD::WrapperRIP. This helps function combineX86ShufflesRecusively to combine more shuffle sequences containing PSHUFB nodes if we are in RIPRel PIC mode. Before this change, llc (with -relocation-model=pic -march=x86-64) was unable to decode a pshufb where the mask was loaded from a constant pool. For example, the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its operand, so instead of generating a single 'movaps' the backend always generated a sub-optimal 'movdqa + pshufb' sequence. Added test x86-fold-pshufb.ll. llvm-svn: 236863
* [mips][microMIPSr6] Implement ALUIPC and AUIPC instructionsJozef Kolek2015-05-083-3/+32
| | | | | | | | This patch implements ALUIPC and AUIPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8441 llvm-svn: 236858
* [mips][microMIPSr6] Implement ADDIUPC and LWPC instructionsJozef Kolek2015-05-083-3/+31
| | | | | | | | This patch implements ADDIUPC and LWPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8415 llvm-svn: 236852
* Fix gcc warning of different enum and non-enum types in ternaryDenis Protivensky2015-05-081-8/+8
| | | | | | | Make '0' literal explicitly unsigned with '0u'. This appeared after r236775. llvm-svn: 236838
* [mips] Only use FGR_{32,64} in TableGen descriptions. NFC.Toma Tabacu2015-05-082-33/+23
| | | | | | | | | | | | | | Summary: Instead of explicitly adding the IsFP64bit and NotFP64bit predicates through AdditionalRequires. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9566 llvm-svn: 236835
* [mips] Emit the .insn directive for empty basic blocks.Vasileios Kalintiris2015-05-082-0/+7
| | | | | | | | | | | | | | | | | | | Summary: In microMIPS, labels need to know whether they are on code or data. This is indicated with STO_MIPS_MICROMIPS and can be inferred by being followed by instructions. For empty basic blocks, we can ensure this by emitting the .insn directive after the label. Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the exception handling regression tests under: SingleSource/Regression/C++/EH Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9530 llvm-svn: 236815
* InMips16HardFloat was only being set conditional on whether orEric Christopher2015-05-071-1/+1
| | | | | | | not IsSoftFloat was set so remove it from here simplifying the accessor. llvm-svn: 236795
* Rename the MIPS routine abiUsesSoftFloat -> useSoftFloat to matchEric Christopher2015-05-078-15/+13
| | | | | | some incoming changes and the general scheme used by features (use/has). llvm-svn: 236794
* Fix typo.Matthias Braun2015-05-071-1/+1
| | | | llvm-svn: 236785
* Change getTargetNodeName() to produce compiler warnings for missing cases, ↵Matthias Braun2015-05-0726-41/+125
| | | | | | fix them llvm-svn: 236775
* [AArch64] Fix sext/zext folding in address arithmetic.Pete Cooper2015-05-071-29/+32
| | | | | | | | We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there. Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use. llvm-svn: 236764
* Add VSX Scalar loads and stores to the PPC back endNemanja Ivanovic2015-05-078-8/+150
| | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D9440 It adds a new register class to the PPC back end to contain single precision values in VSX registers. Additionally, it adds scalar loads and stores for VSX registers. llvm-svn: 236755
* [mips][microMIPSr6] Implement JIALC and JIC instructionsJozef Kolek2015-05-072-3/+30
| | | | | | | | This patch implements JIALC and JIC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8389 llvm-svn: 236748
* R600: Fix comment that mentions AMDILMatt Arsenault2015-05-071-2/+2
| | | | llvm-svn: 236745
* Use intrinsic pattern to make a simpler matchSanjay Patel2015-05-071-3/+2
| | | | | | | | | | | | This is a follow-on to r236740 where I took Andrea's advice in D9504 to remove a redundant pattern...except that I removed the wrong pattern! AFAICT, there is no change in the final code produced because subsequent passes would clean up the extra instructions created by the more complicated pattern. llvm-svn: 236743
* [x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507)Sanjay Patel2015-05-072-10/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Finish the job that was abandoned in D6958 following the refactoring in http://reviews.llvm.org/rL230221: 1. Uncomment the intrinsic def for the AVX r_Int instruction. 2. Add missing r_Int entries to the load folding tables; there are already tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I haven't added any more in this patch. 3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ). So instead of this: movaps %xmm0, %xmm1 rcpss %xmm1, %xmm1 movss %xmm1, %xmm0 We should now get: rcpss %xmm0, %xmm0 And instead of this: vsqrtss %xmm0, %xmm0, %xmm1 vblendps $1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3] We should now get: vsqrtss %xmm0, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D9504 llvm-svn: 236740
OpenPOWER on IntegriCloud