summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* This reverts commit r265505.Kit Barton2016-04-287-268/+0
| | | | | | | Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance". This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe. llvm-svn: 267927
* [Hexagon] Add instruction aliases for vector unsigned compare-equalKrzysztof Parzyszek2016-04-281-0/+65
| | | | | | Unsigned compare-equal instructions are mapped to signed compare-equal. llvm-svn: 267925
* AMDGPU: Emit error if too much LDS is usedMatt Arsenault2016-04-281-0/+5
| | | | llvm-svn: 267922
* AMDGPU: Fix mishandling array allocations when promoting allocaMatt Arsenault2016-04-281-1/+3
| | | | | | | | The canonical form for allocas is a single allocation of the array type. In case we see a non-canonical array alloca, make sure we aren't replacing this with an array N times smaller. llvm-svn: 267916
* [Hexagon] Define certain aliases for vector instructionsKrzysztof Parzyszek2016-04-283-0/+43
| | | | | | | | | Specifically: Vd = #0 -> Vd = vxor(Vd, Vd) Vdd = #0 -> Vdd.w = vsub(Vdd.w, Vdd.w) Vdd = Vss -> Vdd = vcombine(Vss.H, Vss.L) llvm-svn: 267901
* [mips][atomics] Fix partword atomic binary operation implementationSimon Dardis2016-04-283-6/+15
| | | | | | | | | | | | | | | Currently Mips::emitAtomicBinaryPartword() does not properly respect the width of pointers. For MIPS64 this causes the memory address that the ll/sc sequence uses to be truncated. At runtime this causes a segmentation fault. This can be fixed by applying similar changes as r266204, so that a full 64bit pointer is loaded. Reviewers: dsanders Differential Review: http://reviews.llvm.org/D19651 llvm-svn: 267900
* [Hexagon] Handle double-vector registers as new-value producersKrzysztof Parzyszek2016-04-283-3/+42
| | | | | | Patch by Colin LeMahieu. llvm-svn: 267897
* [RDF] Handle undefined registers in RDF copy propagationKrzysztof Parzyszek2016-04-281-1/+6
| | | | | | | When updating the graph, make sure that new uses without reaching defs are handled correctly. llvm-svn: 267891
* [X86] Remove unused operand from a function and all its callers. NFCCraig Topper2016-04-285-10/+8
| | | | llvm-svn: 267854
* [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in ↵Craig Topper2016-04-2814-85/+10
| | | | | | TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853
* [AArch64] Expand CTTZ for all vector types.Craig Topper2016-04-281-0/+9
| | | | llvm-svn: 267837
* [SystemZ] Support Swift Calling ConventionBryan Chan2016-04-284-3/+31
| | | | | | | | | | | | | | | | Summary: Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see: RFC: Implementing the Swift calling convention in LLVM and Clang https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0 Reviewers: kbarton, manmanren, rjmccall, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19414 llvm-svn: 267823
* [X86] Enable the post-RA-scheduler for clang's default 32-bit cpu.Mitch Bodart2016-04-272-12/+36
| | | | | | | | | For compilations with no explicit cpu specified, this exhibits nice gains on Silvermont, with neutral performance on big cores. Differential Revision: http://reviews.llvm.org/D19138 llvm-svn: 267809
* [X86][FastISel] Make sure we use the right register class when we select stores.Quentin Colombet2016-04-271-1/+9
| | | | llvm-svn: 267806
* [Hexagon] Merging nops in to previous packet rather than always creating a ↵Colin LeMahieu2016-04-271-17/+69
| | | | | | new one. llvm-svn: 267798
* [X86] Fix the lowering of TLS calls.Quentin Colombet2016-04-272-6/+9
| | | | | | | | | | | The callseq_end node must be glued with the TLS calls, otherwise, the generic code will miss the uses of the returned value and will mark it dead. Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI, the pseudo uses the symbol address at this point not RDI and the lowering will do the right thing. llvm-svn: 267797
* AMDGPU: Account for globals in AMDGPUPromoteAlloca passMatt Arsenault2016-04-271-2/+4
| | | | | | Patch by Bas Nieuwenhuizen llvm-svn: 267791
* [ARM] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.Ahmed Bougacha2016-04-271-2/+2
| | | | | | | | | We run after PEI. Found via inspection; no obvious testcase. Follow-up to r266679. llvm-svn: 267781
* [AArch64] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.Ahmed Bougacha2016-04-271-2/+2
| | | | | | | | | We run after PEI. Found via inspection; no obvious testcase. Follow-up to r266339. llvm-svn: 267780
* [AArch64] Set correct successors in CMPXCHG pseudo expansion.Ahmed Bougacha2016-04-271-2/+4
| | | | | | | | | | | transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. Follow-up to r266339. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267779
* [ARM] Set correct successors in CMPXCHG pseudo expansion.Ahmed Bougacha2016-04-271-2/+4
| | | | | | | | | | | | | | transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. The testcase changes are caused by Thumb2SizeReduction, which was previously confused by the broken CFG. Follow-up to r266679. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267778
* [X86]: Quit promoting 16 bit loads to 32 bit.Kevin B. Smith2016-04-271-17/+0
| | | | | | Differential Revision: http://reviews.llvm.org/D19592 llvm-svn: 267773
* Add optimization bisect opt-in calls for PowerPC passesAndrew Kaylor2016-04-279-3/+28
| | | | | | Differential Revision: http://reviews.llvm.org/D19554 llvm-svn: 267769
* [NVPTX] Run NVVMReflect at the beginning of IR passes.Justin Lebar2016-04-272-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently the NVVMReflect pass is run at the beginning of our backend passes. But really, it should be run as early as possible, as it's simply resolving an "if" statement in code. So copy it into TargetMachine::addEarlyAsPossiblePasses. We still run it at the beginning of the backend passes, since it's needed for correctness when lowering to nvptx. (Specifically, NVVMReflect changes each call to the __nvvm_reflect function or llvm.nvvm.reflect intrinsic into an integer constant, based on the pass's configuration. Clearly we miss many optimization opportunities if we perform this transformation at the beginning of codegen.) Reviewers: rnk Subscribers: tra, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18616 llvm-svn: 267765
* Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for ↵Chad Rosier2016-04-276-39/+13
| | | | | | | | SMRD." This reverts commit r267733 due to a -Werror,-Wunused-function error. llvm-svn: 267752
* [DAGCombiner] Follow coding convention for function name (NFC)Gerolf Hoflehner2016-04-272-2/+2
| | | | llvm-svn: 267745
* [Mips] Add support for llvm.thread.pointer intrinsic.Marcin Koscielnicki2016-04-271-0/+4
| | | | | | | | This will be used to implement __builtin_thread_pointer in clang. Differential Revision: http://reviews.llvm.org/D19569 llvm-svn: 267743
* Silence a -Wdangling-elseReid Kleckner2016-04-271-1/+2
| | | | llvm-svn: 267737
* Add parentheses to silence buildbot warningMatthew Simpson2016-04-271-2/+2
| | | | llvm-svn: 267734
* [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.Artem Tamazov2016-04-276-13/+39
| | | | | | | | | | | Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733
* AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsicNicolai Haehnle2016-04-272-14/+78
| | | | | | | | | | | | | | | | | Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-272-0/+58
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware ↵Artem Tamazov2016-04-272-13/+42
| | | | | | | | | | | | registers. Possibility to specify code of hardware register kept. Disassemble to symbolic name, if name is known. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19335 llvm-svn: 267724
* Revert r267649, it caused PR27539.Nico Weber2016-04-271-136/+0
| | | | llvm-svn: 267723
* [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU ↵Zlatko Buljan2016-04-275-20/+70
| | | | | | | | instructions Differential Revision: http://reviews.llvm.org/D16676 llvm-svn: 267694
* [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, ↵Zlatko Buljan2016-04-273-12/+41
| | | | | | | | SRAV, SRL and SRLV instructions Differential Revision: http://reviews.llvm.org/D17989 llvm-svn: 267693
* [ppc64] fix bug in prologue that mfocrf's cr operand should be explict state ↵Chuang-Yu Cheng2016-04-271-4/+10
| | | | | | | | | | | | instead of implicit This fixes PR27414 Reviewers: kbarton mgrang tjablin http://reviews.llvm.org/D19255 llvm-svn: 267660
* [X86] Set AddPristinesAndCSRs to FixupBW LivePhysRegs. NFC.Ahmed Bougacha2016-04-271-1/+2
| | | | | | | | | | | | | We run after PEI, so we need to AddPristinesAndCSRs. In practice, that makes no difference here, because we only ask about liveness of super-registers of defined GR8/GR16 registers, so they can't be pristine. Still, it's the correct thing to do. Thanks to Quentin for noticing! Follow-up to r267495. llvm-svn: 267658
* [X86] Don't assume that MMX extractelts are from index 0.Ahmed Bougacha2016-04-271-1/+3
| | | | | | | It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652
* [X86] Re-enable MMX i32 extractelt combine.Ahmed Bougacha2016-04-271-10/+3
| | | | | | | | | This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651
* Detects the SAD pattern on X86 so that much better code will be emitted once ↵Cong Hou2016-04-271-0/+136
| | | | | | | | the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649
* Add optimization bisect opt-in calls for SystemZ passesAndrew Kaylor2016-04-263-0/+9
| | | | | | Differential Revision: http://reviews.llvm.org/D19562 llvm-svn: 267636
* Add optimization bisect opt-in calls for NVPTX passesAndrew Kaylor2016-04-265-1/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D19518 llvm-svn: 267635
* [X86] Make sure it is safe to clobber EFLAGS, if need be, when choosingQuentin Colombet2016-04-262-0/+16
| | | | | | | | | | | | | | | the prologue. Do not use basic blocks that have EFLAGS live-in as prologue if we need to realign the stack. Realigning the stack uses AND instruction and this clobbers EFLAGS. An other alternative would have been to save and restore EFLAGS around the stack realignment code, but this is likely inefficient. Fixes PR27531. llvm-svn: 267634
* [X86] Teach the expansion of copy instructions how to do proper liveness.Quentin Colombet2016-04-261-15/+22
| | | | | | | When the simple analysis provided by MachineBasicBlock::computeRegisterLiveness fails, fall back on the LivePhysReg utility. llvm-svn: 267623
* [NVPTX] Fix some usages of CodeGenOpt::None.Jingyue Wu2016-04-261-5/+9
| | | | | | | | | | NVPTXLowerKernelArgs is required for correctness, so it should not be guarded by CodeGenOpt::None. NVPTXPeephole is optimization only, so it should be skipped when CodeGenOpt::None. llvm-svn: 267619
* Optimization bisect support in X86-specific passesAndrew Kaylor2016-04-265-3/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D19439 llvm-svn: 267608
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.Ahmed Bougacha2016-04-2610-138/+116
| | | | | | Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606
* Add optimization bisect opt-in calls for Hexagon passesAndrew Kaylor2016-04-2618-1/+52
| | | | | | Differential Revision: http://reviews.llvm.org/D19509 llvm-svn: 267593
* Swift Calling Convention: use %RAX for sret.Manman Ren2016-04-263-1/+16
| | | | | | | We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579
OpenPOWER on IntegriCloud