path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [SPARC] Add support for llvm.thread.pointer. | Marcin Koscielnicki | 2016-04-26 | 2 | -0/+18
  Differential Revision: http://reviews.llvm.org/D19387
  llvm-svn: 267544
* [ppc64] Reenable sibling call optimization on ppc64 since fixed tsan library tail-call issue | Chuang-Yu Cheng | 2016-04-26 | 1 | -1/+1
  The print-stack-trace.cc test failure in compiler-rt has been fixed by r266869 (http://reviews.llvm.org/D19148), so reenable sibling call optimization on ppc64.
  Reviewers: nemanjai kbarton
  llvm-svn: 267527
* [AArch64] Expand v1i64 and v2i64 ctlz. | Craig Topper | 2016-04-26 | 1 | -0/+3
  The default is Legal, which results in 'Cannot select' errors.
  llvm-svn: 267522
* [ARM] Expand vector ctlz_zero_undef so it becomes ctlz. | Craig Topper | 2016-04-26 | 1 | -0/+10
  The default is Legal, which results in 'Cannot select' errors.
  llvm-svn: 267521
* [ARM] Expand v1i64 and v2i64 ctlz. | Craig Topper | 2016-04-26 | 1 | -0/+3
  The default is Legal, which results in 'Cannot select' errors.
  llvm-svn: 267520
* [WebAssembly] Account for implicit operands when computing operand indices. | Dan Gohman | 2016-04-26 | 1 | -1/+1
  llvm-svn: 267511
* Reverting Thumb2SizeReduction opt bisect change to fix failing buildbots. | Andrew Kaylor | 2016-04-26 | 1 | -2/+1
  llvm-svn: 267506
* Remove MinLatency in SchedMachineModel. NFC. | Junmo Park | 2016-04-26 | 11 | -14/+0
  Summary: We don't use MinLatency any more since r184032.
  Reviewers: atrick, hfinkel, mcrosier
  Differential Revision: http://reviews.llvm.org/D19474
  llvm-svn: 267502
* [X86] Use LivePhysRegs in X86FixupBWInsts. | Ahmed Bougacha | 2016-04-26 | 1 | -13/+19
  Kill-flags, which computeRegisterLiveness uses, are not reliable. LivePhysRegs is.
  Differential Revision: http://reviews.llvm.org/D19472
  llvm-svn: 267495
* [Sparc] Fix double-float fabs and fneg on little endian CPUs. | James Y Knight | 2016-04-25 | 1 | -12/+28
  The SparcV8 fneg and fabs instructions interestingly come only in a single-float variant. Since the sign bit is always the topmost bit no matter what size float it is, you simply operate on the high subregister, as if it were a single float.
  However, the layout of double-floats in the float registers is reversed on little-endian CPUs, so that the high bits are in the second subregister, rather than the first. Thus, this expansion must check the endianness to use the correct subregister.
  llvm-svn: 267489
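The subregister choice described in this commit can be illustrated with a small host-side model. This is only a sketch under stated assumptions: `DoubleRegPair`, `highSubIndex`, `pack`, `unpack`, `fnegDouble`, and `fabsDouble` are hypothetical names invented for illustration, not code from the Sparc backend, and the model assumes IEEE-754 binary64 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of a double held in two 32-bit float subregisters.
struct DoubleRegPair {
  uint32_t Sub[2]; // Sub[0] is the first subregister, Sub[1] the second
};

// Index of the subregister that holds the high 32 bits (sign and exponent).
// Per the commit message, the layout is reversed on little-endian CPUs.
inline int highSubIndex(bool IsLittleEndian) { return IsLittleEndian ? 1 : 0; }

inline DoubleRegPair pack(double Value, bool IsLittleEndian) {
  uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof Bits);
  DoubleRegPair R;
  const int Hi = highSubIndex(IsLittleEndian);
  R.Sub[Hi] = static_cast<uint32_t>(Bits >> 32);
  R.Sub[1 - Hi] = static_cast<uint32_t>(Bits);
  return R;
}

inline double unpack(DoubleRegPair R, bool IsLittleEndian) {
  const int Hi = highSubIndex(IsLittleEndian);
  const uint64_t Bits = (static_cast<uint64_t>(R.Sub[Hi]) << 32) | R.Sub[1 - Hi];
  double Value;
  std::memcpy(&Value, &Bits, sizeof Value);
  return Value;
}

// fneg and fabs only touch the sign bit, so a 32-bit operation on the
// high subregister is enough, provided the right subregister is chosen.
inline DoubleRegPair fnegDouble(DoubleRegPair R, bool IsLittleEndian) {
  R.Sub[highSubIndex(IsLittleEndian)] ^= 0x80000000u;
  return R;
}

inline DoubleRegPair fabsDouble(DoubleRegPair R, bool IsLittleEndian) {
  R.Sub[highSubIndex(IsLittleEndian)] &= 0x7FFFFFFFu;
  return R;
}
```

Applying `fnegDouble` with the wrong endianness flag flips a mantissa bit in the low word instead of the sign, which is the kind of miscompile the commit guards against.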
* Fix build warning | Andrew Kaylor | 2016-04-25 | 1 | -1/+1
  llvm-svn: 267487
* Add optimization bisect opt-in calls for AMDGPU passes | Andrew Kaylor | 2016-04-25 | 7 | -1/+19
  Differential Revision: http://reviews.llvm.org/D19450
  llvm-svn: 267485
* Add optimization bisect opt-in calls for ARM passes | Andrew Kaylor | 2016-04-25 | 5 | -2/+15
  Differential Revision: http://reviews.llvm.org/D19449
  llvm-svn: 267480
* Add optimization bisect opt-in calls for AArch64 passes | Andrew Kaylor | 2016-04-25 | 12 | -0/+34
  Differential Revision: http://reviews.llvm.org/D19394
  llvm-svn: 267479
* ARM: put extern __thread stubs in a special section. | Tim Northover | 2016-04-25 | 1 | -2/+18
  The linker needs to know that the symbols are thread-local to do its job properly.
  llvm-svn: 267473
* [Hexagon] Few fixes for exception handling | Krzysztof Parzyszek | 2016-04-25 | 2 | -1/+2
  llvm-svn: 267469
* Re-apply r267206 with a fix for the encoding problem: when the immediate of | Quentin Colombet | 2016-04-25 | 1 | -3/+14
  log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit variant cannot encode it. Therefore, set the subreg part accordingly.
  [AArch64] Fix optimizeCondBranch logic.
  The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but on the AND opcode itself. Indeed, we need to use an X or W variant based on the AND variant, not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs!
  This fixes the last make check verifier issues for AArch64: PR27479.
  llvm-svn: 267465
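The selection rule in this commit message can be sketched as a small decision function. This is a minimal illustration, not the actual optimizeCondBranch code: `AndOpcode`, `TestBranch`, `BranchChoice`, `log2OfPowerOfTwo`, and `chooseTestBranch` are hypothetical names, and the sketch assumes the AND mask has exactly one bit set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the 32-bit (W) and 64-bit (X) AND opcodes
// and for the two widths of the test-bit-and-branch instruction.
enum class AndOpcode { W, X };
enum class TestBranch { TBZW, TBZX };

struct BranchChoice {
  TestBranch Opcode;
  unsigned BitIndex; // log2(Mask)
  bool UseSub32;     // read the low 32-bit subregister of a 64-bit source
};

inline unsigned log2OfPowerOfTwo(uint64_t Mask) {
  assert(Mask != 0 && (Mask & (Mask - 1)) == 0 && "expected a single-bit mask");
  unsigned Index = 0;
  while ((Mask >>= 1) != 0)
    ++Index;
  return Index;
}

// The register width comes from the AND opcode, not from whether the mask
// happens to fit in 32 bits. A 64-bit source whose tested bit is below 32
// must still use the 32-bit branch, reading the 32-bit subregister.
inline BranchChoice chooseTestBranch(AndOpcode And, uint64_t Mask) {
  const unsigned Bit = log2OfPowerOfTwo(Mask);
  const bool Is64Bit = (And == AndOpcode::X);
  if (Is64Bit && Bit >= 32)
    return {TestBranch::TBZX, Bit, false};
  return {TestBranch::TBZW, Bit, Is64Bit};
}
```

The `UseSub32` flag corresponds to the "set the subreg part accordingly" step: without it, a 32-bit branch would be handed a 64-bit register operand.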
* AMDGPU/SI: Optimize adjacent s_nop instructions | Matt Arsenault | 2016-04-25 | 1 | -0/+27
  Use the operand for how long to wait.
  This is somewhat distasteful, since it would be better to just emit s_nop with the right argument in the first place. This would require changing TII::insertNoop to emit N operands, which would be easy. Slightly more problematic is that the post-RA scheduler and hazard recognizer represent nops as a single null node, so they would require inventing another way of representing N nops.
  llvm-svn: 267456
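The merging idea can be sketched on plain operand values. The sketch makes two assumptions that are not stated in the commit message: an s_nop operand of N stands for N + 1 wait states, and the largest usable operand is 7 (exposed here as the `MaxOperand` parameter so it is easy to change). `mergeAdjacentNops` is a hypothetical helper, not the function added by the commit.

```cpp
#include <cassert>
#include <vector>

// Collapse a run of adjacent s_nop operands into the shortest equivalent run.
// Assumption: operand N means N + 1 wait states, with N capped at MaxOperand.
inline std::vector<unsigned>
mergeAdjacentNops(const std::vector<unsigned> &Operands, unsigned MaxOperand = 7) {
  unsigned long long TotalStates = 0;
  for (unsigned Operand : Operands)
    TotalStates += static_cast<unsigned long long>(Operand) + 1;

  const unsigned MaxStates = MaxOperand + 1;
  std::vector<unsigned> Merged;
  while (TotalStates > 0) {
    const unsigned Chunk =
        TotalStates < MaxStates ? static_cast<unsigned>(TotalStates) : MaxStates;
    Merged.push_back(Chunk - 1); // convert wait states back to an operand
    TotalStates -= Chunk;
  }
  return Merged;
}
```

Under these assumptions, three back-to-back `s_nop 0` instructions become a single `s_nop 2`, while a run that already needs more than eight wait states still requires more than one instruction.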
* AMDGPU: Implement addrspacecast | Matt Arsenault | 2016-04-25 | 5 | -71/+124
  llvm-svn: 267452
* AMDGPU: Add queue ptr intrinsic | Matt Arsenault | 2016-04-25 | 4 | -3/+18
  llvm-svn: 267451
* AMDGPU: Add DAG to debug dump | Matt Arsenault | 2016-04-25 | 1 | -2/+2
  Also reorder case to match enum order
  llvm-svn: 267449
* [Hexagon] Register save/restore functions do not follow regular conventions | Krzysztof Parzyszek | 2016-04-25 | 4 | -45/+51
  Do not mark them as modifying any of the volatile registers by default.
  llvm-svn: 267433
* [lanai] Expand findClosestSuitableAluInstr check to consider offset register. | Jacques Pienaar | 2016-04-25 | 1 | -3/+6
  Previously findClosestSuitableAluInstr only considered the base register when checking the current instruction for suitability. Expand the check to also consider the offset if the offset is a register.
  llvm-svn: 267424
* [mips][microMIPS] Revert commit r267137 | Hrvoje Varga | 2016-04-25 | 4 | -14/+3
  Commit r267137 was the reason for failing tests in the LLVM test suite.
  llvm-svn: 267419
* [mips][microMIPS] Revert commit r266977 | Zlatko Buljan | 2016-04-25 | 4 | -26/+9
  Commit r266977 was the reason the LLVM test suite failed with the error message:
  fatal error: error in backend: Cannot select: t17: i32 = rotr t2, t11 ...
  llvm-svn: 267418
* Fix incorrect redundant expression in target AMDGPU. | Etienne Bergeron | 2016-04-25 | 1 | -1/+1
  Summary: The expression is detected as a redundant expression. It turns out this is probably a bug.
  ```
  /home/etienneb/llvm/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:306:26: warning: both side of operator are equivalent [misc-redundant-expression]
  if (isSMRD(*FirstLdSt) && isSMRD(*FirstLdSt)) {
  ```
  Reviewers: rnk, tstellarAMD
  Subscribers: arsenm, cfe-commits
  Differential Revision: http://reviews.llvm.org/D19460
  llvm-svn: 267415
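Why the flagged expression is likely a bug can be shown with a reduced model. This is a sketch only: `MemInstr`, `bothSMRDAsFlagged`, and `bothSMRDChecked` are hypothetical, and the "checked" form assumes the intent was to test the second instruction, which the commit message does not spell out.

```cpp
#include <cassert>

// Hypothetical reduced stand-in for a machine instruction.
struct MemInstr {
  bool IsSMRD;
};

inline bool isSMRD(const MemInstr &MI) { return MI.IsSMRD; }

// Shape of the expression the warning flagged: both operands of && test
// the first instruction, so the second instruction is never examined.
inline bool bothSMRDAsFlagged(const MemInstr &First, const MemInstr &Second) {
  (void)Second; // unused, which is exactly the problem
  return isSMRD(First) && isSMRD(First);
}

// Assumed intent: the pair qualifies only if both instructions are SMRD.
inline bool bothSMRDChecked(const MemInstr &First, const MemInstr &Second) {
  return isSMRD(First) && isSMRD(Second);
}
```

With an SMRD first instruction and a non-SMRD second one, the flagged form accepts the pair while the checked form rejects it.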
* [ARM] Add support for the X asm constraint | Silviu Baranga | 2016-04-25 | 2 | -0/+22
  Summary: This patch adds support for the X asm constraint. To do this, we lower the constraint to either a "w" or "r" constraint depending on the operand type (both constraints are supported on ARM).
  Fixes PR26493
  Reviewers: t.p.northover, echristo, rengolin
  Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19061
  llvm-svn: 267411
* [AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax. | Artem Tamazov | 2016-04-25 | 5 | -3/+127
  Added hwreg(reg[,offset,width]) syntax. Default offset = 0, default width = 32.
  Possibility to specify 16-bit immediate kept.
  Added out-of-range checks.
  Disassembling is always to hwreg(...) format.
  Tests updated/added.
  Differential Revision: http://reviews.llvm.org/D19329
  llvm-svn: 267410
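How a hwreg(reg, offset, width) triple could map onto the 16-bit immediate is sketched below. The field layout is an assumption on my part and is not given in the commit message: register id in bits [5:0], offset in bits [10:6], and width minus one in bits [15:11]. `encodeHwreg` is a hypothetical helper, not the assembler's parser code.

```cpp
#include <cassert>
#include <cstdint>

// Pack a hwreg(id, offset, width) operand into a 16-bit immediate.
// Assumed layout: id [5:0], offset [10:6], width - 1 [15:11].
// Returns false, leaving Encoded untouched, when a field is out of range.
inline bool encodeHwreg(uint16_t &Encoded, unsigned Id, unsigned Offset = 0,
                        unsigned Width = 32) {
  if (Id > 63 || Offset > 31 || Width < 1 || Width > 32)
    return false;
  Encoded = static_cast<uint16_t>(Id | (Offset << 6) | ((Width - 1) << 11));
  return true;
}
```

The default arguments mirror the defaults named in the commit message (offset 0, width 32), and the range checks correspond to the "out-of-range checks" item.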
* [Hexagon] Correctly set "Flags" in ELF header | Krzysztof Parzyszek | 2016-04-25 | 1 | -3/+7
  llvm-svn: 267397
* [PowerPC] [PR27387] Disallow r0 for ADD8TLS. | Marcin Koscielnicki | 2016-04-25 | 1 | -2/+4
  ADD8TLS, a variant of the add instruction used for initial-exec TLS, currently accepts r0 as a source register. While add itself supports r0 just fine, the linker can relax it to a local-exec sequence, converting it to addi, which doesn't support r0.
  Differential Revision: http://reviews.llvm.org/D19193
  llvm-svn: 267388
* [X86] Replace a SmallVector used to pass 2 values to an ArrayRef parameter with a fixed size array. NFC | Craig Topper | 2016-04-25 | 1 | -3/+1
  llvm-svn: 267377
* Minor code cleanups. NFC. | Junmo Park | 2016-04-25 | 4 | -23/+23
  llvm-svn: 267375
* ARM: fix __chkstk Frame Setup on WoA | Saleem Abdulrasool | 2016-04-24 | 2 | -2/+4
  This corrects the MI annotations for the stack adjustment following the __chkstk invocation. We were marking the original SP usage as a Def rather than Kill. The (new) assigned value is the definition, the original reference is killed.
  Adjust the ISelLowering to mark Kills and FrameSetup as well.
  This partially resolves PR27480.
  llvm-svn: 267361
* Fix typo in comment. NFC | Nick Lewycky | 2016-04-24 | 1 | -1/+1
  llvm-svn: 267354
* [X86][SSE] getTargetShuffleMaskIndices - dropped (unused) UNDEF handling | Simon Pilgrim | 2016-04-24 | 1 | -5/+0
  We aren't currently making use of this in any successful mask decode and it's actually incorrect as it inserts the wrong number of SM_SentinelUndef mask elements.
  llvm-svn: 267350
* [X86][SSE] Use range loop. NFCI. | Simon Pilgrim | 2016-04-24 | 1 | -3/+2
  llvm-svn: 267349
* [Lanai] Use EVT::getEVTString() to print a type as a string instead of an enum encoding value. | Craig Topper | 2016-04-24 | 1 | -1/+1
  llvm-svn: 267348
* [X86][XOP] Fixed VPPERM permute op decoding (PR27472). | Simon Pilgrim | 2016-04-24 | 1 | -1/+1
  Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.
  llvm-svn: 267346
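The effect of masking a 3-bit field with a 2-bit mask can be shown on a single selector byte. This sketch assumes, beyond what the commit message says, that the permute op sits in the top three bits of each selector byte and the source byte index in the low five bits. `permuteOpTruncated`, `permuteOp`, and `sourceByteIndex` are hypothetical helpers, not the decoder's actual functions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed selector layout: op in bits [7:5], source byte index in bits [4:0].

// Shape of the bug: a 2-bit mask drops the top bit of the 3-bit op,
// so ops 4..7 are misread as ops 0..3.
inline unsigned permuteOpTruncated(uint8_t Selector) {
  return (Selector >> 5) & 0x3;
}

// Corrected extraction with a full 3-bit mask.
inline unsigned permuteOp(uint8_t Selector) { return (Selector >> 5) & 0x7; }

inline unsigned sourceByteIndex(uint8_t Selector) { return Selector & 0x1F; }
```

For a selector of 0x80 the truncated version reports op 0 while the corrected one reports op 4, so any op with its top bit set would have been decoded as a different operation.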
* [X86][SSE] Improved support for decoding target shuffle masks through bitcasts | Simon Pilgrim | 2016-04-24 | 1 | -20/+26
  Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transferred to xmm.
  This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472.
  llvm-svn: 267343
* [SystemZ] [SSP] Add support for LOAD_STACK_GUARD. | Marcin Koscielnicki | 2016-04-24 | 4 | -0/+52
  This fixes PR22248 on s390x. The previous attempt at this was D19101, which was before LOAD_STACK_GUARD existed. Compared to the previous version, this always emits a rather ugly block of 4 instructions, involving a thread pointer load that can't be shared with other potential users. However, this is necessary for SSP - spilling the guard value (or thread pointer used to load it) is counter to the goal, since it could be overwritten along with the frame it protects.
  Differential Revision: http://reviews.llvm.org/D19363
  llvm-svn: 267340
* [X86] Merge LowerCTLZ and LowerCTLZ_ZERO_UNDEF into a single function that branches internally for the one difference, allowing the rest of the code to be common. NFC | Craig Topper | 2016-04-24 | 1 | -38/+16
  llvm-svn: 267331
* [X86] No need to check if AVX512 is supported when lowering vector CTLZ. The CTLZ operation is only Custom for vectors if AVX512 is enabled so if a vector gets here AVX512 is implied. NFC | Craig Topper | 2016-04-24 | 1 | -7/+5
  llvm-svn: 267330
* [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) | Gerolf Hoflehner | 2016-04-24 | 4 | -35/+557
  The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo for targets that do not define it.
  Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are:
  - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64.
  - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency.
  - Support for fmadd, f(n)msub, fmla, fmls patterns
  - On by default at O3 ffast-math
  llvm-svn: 267328
* [X86] Remove isel patterns for selecting tzcnt/lzcnt from cmove/ne+cttz/ctlz. These are folded by DAG combine now. | Craig Topper | 2016-04-24 | 1 | -80/+0
  llvm-svn: 267326
* Fix an assertion that can never fire because the condition ANDed with the string is just true or 1. | Craig Topper | 2016-04-24 | 1 | -1/+1
  llvm-svn: 267324
* Fix a couple assertions that can never fire because they just contained the text string which always evaluates to true. Add a ! so they'll evaluate to false. | Craig Topper | 2016-04-24 | 2 | -2/+2
  llvm-svn: 267312
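The two assertion fixes above rest on how a string literal converts to bool. The sketch below evaluates the conditions that would be passed to assert without actually asserting on them; the three helper functions are hypothetical and only model the condition expressions.

```cpp
#include <cassert>

// Condition of assert("message"): a string literal decays to a non-null
// pointer, so the condition is always true and the assertion never fires.
inline bool conditionOfBareMessage(const char *Message) { return Message; }

// Condition of assert(!"message"): always false, so the assertion fires
// whenever it is reached. This is the "add a !" fix.
inline bool conditionOfNegatedMessage(const char *Message) { return !Message; }

// Condition of assert(Check && "message"): the message never changes the
// result, so the assertion is only as useful as Check. If Check is a
// constant true or 1, it can never fire either.
inline bool conditionWithCheck(bool Check, const char *Message) {
  return Check && Message;
}
```

The third form is the usual idiom for attaching a message to a real check; the first is the mistake the commits correct.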
* [X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly. | Craig Topper | 2016-04-24 | 1 | -30/+24
  llvm-svn: 267311
* [MC/ELF] Implement support for GOTPCRELX/REX_GOTPCRELX. | Davide Italiano | 2016-04-24 | 2 | -5/+25
  The option to control the emission of the new relocations is -relax-relocations (blatantly copied from GNU as). It can't be enabled by default because it breaks relatively recent versions of ld.bfd/ld.gold (late 2015).
  llvm-svn: 267307
* [MC/ELF] Pass Fixup to getRelocType64. | Davide Italiano | 2016-04-23 | 1 | -2/+3
  In preparation for other changes.
  llvm-svn: 267300
* Revert "[AArch64] Fix optimizeCondBranch logic." | Renato Golin | 2016-04-23 | 1 | -5/+5
  This reverts commit r267206, as it broke self-hosting on AArch64.
  llvm-svn: 267294