path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode | Simon Pilgrim | 2017-03-31 | 14 | -3/+15
  Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements in known bits calculations in target nodes.
  Differential Revision: https://reviews.llvm.org/D31249
  llvm-svn: 299201
* Temporarily revert "[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64" as it's causing test failures, I've given Carrot a testcase offline. | Eric Christopher | 2017-03-31 | 3 | -37/+19
  This reverts commit r298955.
  llvm-svn: 299153
* [WebAssembly] Initial linking metadata support | Dan Gohman | 2017-03-30 | 6 | -13/+90
  Add support for the new relocations and linking metadata section described in https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md. In particular, this allows LLVM to indicate which variable is the stack pointer, so that it can be linked with other objects. This also adds support for emitting type relocations for call_indirect instructions. Right now, this is mainly tested by using wabt and hexdump to examine the output on selected testcases. We'll add more tests as the design stabilizes and more of the pieces are in place.
  llvm-svn: 299141
* AMDGPU: Rename isKernel | Matt Arsenault | 2017-03-30 | 3 | -6/+22
  What we really want to do is distinguish functions that may be called by other functions, and graphics shaders are not called kernels.
  llvm-svn: 299140
* AMDGPU: Add all atomicrmw fields to atomic.inc/dec | Matt Arsenault | 2017-03-30 | 1 | -2/+5
  Add scope, order, isVolatile
  llvm-svn: 299122
* [AVX-512] Fix bad comment from r299112. NFC | Craig Topper | 2017-03-30 | 1 | -1/+2
  llvm-svn: 299114
* [AVX-512] Fix another case where fastisel was generating a GR8 to VK1 copy. | Craig Topper | 2017-03-30 | 1 | -2/+12
  This time after calls returning i1. Fixes PR32472.
  llvm-svn: 299112
* [AMDGPU] Add GlobalOpt parameter to Always Inliner pass | Stanislav Mekhanoshin | 2017-03-30 | 3 | -7/+11
  If set to false, the pass does not remove global aliases. With this parameter set to false it should be safe to run the pass before linking.
  Differential Revision: https://reviews.llvm.org/D31489
  llvm-svn: 299108
* [AArch64ISelLowering] Remove `else` after `return` in LowerGlobalTLSAddress. | Davide Italiano | 2017-03-30 | 1 | -1/+1
  llvm-svn: 299103
* [AArch64] Simplify isSignExtended()/isZeroExtended(). NFCI. | Davide Italiano | 2017-03-30 | 1 | -10/+4
  llvm-svn: 299102
* Spelling mistakes in comments. NFCI. | Simon Pilgrim | 2017-03-30 | 4 | -5/+5
  Based on corrections mentioned in patch for clang for PR27635
  llvm-svn: 299072
* Spelling mistakes in comments. NFCI. | Simon Pilgrim | 2017-03-30 | 1 | -15/+15
  llvm-svn: 299069
* [X86IselLowering] Remove extraneous semicolon. NFCI. | Davide Italiano | 2017-03-29 | 1 | -1/+1
  Unbreaks the build with GCC -Werror.
  llvm-svn: 299030
* [X86] Tidied up comment - we don't custom lower add/sub i64 on i686 anymore. NFCI. | Simon Pilgrim | 2017-03-29 | 1 | -1/+2
  llvm-svn: 299004
* Spelling mistakes in comments. NFCI. | Simon Pilgrim | 2017-03-29 | 1 | -5/+5
  llvm-svn: 299000
* [X86][AVX2] Prevent unary interleaving patterns from calling lowerVectorShuffleAsSplitOrBlend (PR32453) | Simon Pilgrim | 2017-03-29 | 1 | -3/+4
  llvm-svn: 298993
* [AMDGPU] Tidy up computeKnownBitsForTargetNode/ComputeNumSignBitsForTargetNode arguments. NFCI. | Simon Pilgrim | 2017-03-29 | 1 | -13/+6
  Based on comment in D31249.
  llvm-svn: 298991
* [X86] Removed old comment. NFCI. | Simon Pilgrim | 2017-03-29 | 1 | -2/+1
  No longer makes sense as the previous opcode mnemonic it was referring to is long gone.
  llvm-svn: 298988
* Move the x86 cpu feature rtm from Haswell to Skylake matching clang commit r298956. | Eric Christopher | 2017-03-29 | 1 | -1/+1
  llvm-svn: 298986
* [AVX-512] Remove explicit KMOVWrk from isel patterns. COPY_TO_REGCLASS to GR32 is enough. | Craig Topper | 2017-03-29 | 1 | -8/+8
  llvm-svn: 298985
* [AVX-512] Remove explicit KMOVWrk/KMOVWKr instructions from patterns where we can just use COPY_TO_REGCLASS instead. | Craig Topper | 2017-03-29 | 1 | -16/+12
  This will result in a KMOVW or KMOVD being emitted during register allocation. And in at least some cases this might allow the register coalescer to remove the copy altogether.
  llvm-svn: 298984
* [AVX-512] Punt on fast-isel of truncates to i1 when AVX512 is enabled. | Craig Topper | 2017-03-28 | 1 | -1/+2
  We should be masking the value and emitting a register copy like we do in non-fast isel. Instead we were just updating the value map and emitting nothing. After r298928 we started seeing cases where we would create a copy from GR8 to GR32 because the source register in a VK1 to GR32 copy was replaced by the GR8 going into a truncate. This fixes PR32451.
  llvm-svn: 298957
* [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64 | Guozhi Wei | 2017-03-28 | 3 | -19/+37
  In PPCBoolRetToInt the bool value is changed to the i32 type. On ppc64 this may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixes PR32442.
  Differential Revision: https://reviews.llvm.org/D31407
  llvm-svn: 298955
* [AMDGPU] Boost unroll threshold for loops reading local memory | Stanislav Mekhanoshin | 2017-03-28 | 1 | -30/+72
  This is less important than increasing the threshold for private memory, but still brings performance improvements in a wide range of tests. Unrolling more for local memory serves three purposes: it allows combining ds operations if the offset becomes static, saves registers used for offsets in the case of static offsets, and allows better lds latency hiding.
  Differential Revision: https://reviews.llvm.org/D31412
  llvm-svn: 298948
* [AMDGPU] Fix recorded region boundaries in max-occupancy scheduler | Stanislav Mekhanoshin | 2017-03-28 | 2 | -17/+7
  It is incorrect to record region boundaries before scheduling, since they may change after scheduling. As a result, the second pass may see fewer instructions to schedule than it should.
  Differential Revision: https://reviews.llvm.org/D31434
  llvm-svn: 298945
* [X86][MMX] Match MMX fp_to_sint conversions from XMM registers | Simon Pilgrim | 2017-03-28 | 2 | -4/+25
  We currently perform the various fp_to_sint XMM conversions and then transfer to the MMX register (on 32-bit via the stack). This patch improves support for MOVDQ2Q XMM to MMX transfers and adds the XMM->MMX fp_to_sint direct conversion patterns. The SSE2 specifications are the same as for XMM->XMM and XMM->MMX rounding/exceptions/etc.
  Differential Revision: https://reviews.llvm.org/D30868
  llvm-svn: 298943
* [AMDGPU] Split -amdgpu-early-inline-all option | Stanislav Mekhanoshin | 2017-03-28 | 1 | -3/+13
  Previously it was covered by the internalization. It turns out we cannot run the internalizer in the FE because it breaks separate compilation tests. Thus the early inliner gets its own option.
  Differential Revision: https://reviews.llvm.org/D31429
  llvm-svn: 298935
* [x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality | Sanjay Patel | 2017-03-28 | 1 | -1/+5
  Follow-up to: https://reviews.llvm.org/rL298775
  llvm-svn: 298933
* Revert "Dont emit Mapping symbols for sections that contain only data." | Weiming Zhao | 2017-03-28 | 1 | -68/+14
  It breaks some lld tests.
  This reverts commit 3a50eea6d9732ab40e9a7aebe6be777b53a8b35c.
  llvm-svn: 298932
* [X86][AVX2] Add support for combining v16i16 shuffles to VPBLENDW | Simon Pilgrim | 2017-03-28 | 1 | -28/+47
  llvm-svn: 298929
* [AVX-512] Fix accidental uses of AH/BH/CH/DH after copies to/from mask registers | Craig Topper | 2017-03-28 | 3 | -53/+106
  We've had several bugs (PR32256, PR32241) recently that resulted from usages of AH/BH/CH/DH either before or after a copy to/from a mask register. This ultimately occurs because we create COPY_TO_REGCLASS with VK1 and GR8. Then in CopyToFromAsymmetricReg in X86InstrInfo we find a 32-bit super register for the GR8 to emit the KMOV with. But as these tests are demonstrating, it's possible for the GR8 register to be a high register and we end up doing an accidental extract or insert from bits 15:8.
  I think the best way forward is to stop making copies directly between mask registers and GR8/GR16. Instead I think we should restrict to only copies between mask registers and GR32/GR64 and use EXTRACT_SUBREG/INSERT_SUBREG to handle the conversion from GR32 to GR16/8 or vice versa.
  Unfortunately, this complicates fastisel a bit more now to create the subreg extracts where we used to create GR8 copies. We can probably make a helper function to bring down the repetition.
  This does result in KMOVD being used for copies when BWI is available because we don't know the original mask register size. This caused a lot of deltas on tests because we have to split the checks for KMOVD vs KMOVW based on BWI.
  Differential Revision: https://reviews.llvm.org/D30968
  llvm-svn: 298928
* [X86][SSE] Refactored shuffle BLEND combining to make future 16i16 support easier. NFCI. | Simon Pilgrim | 2017-03-28 | 1 | -34/+33
  Call the matchVectorShuffleAsBlend test as early as possible.
  llvm-svn: 298925
* Fix signed/unsigned comparison warning | Simon Pilgrim | 2017-03-28 | 1 | -2/+2
  llvm-svn: 298917
* [X86][SSE] Begin merging vector shuffle to BLEND for lowering and combining. | Simon Pilgrim | 2017-03-28 | 1 | -70/+82
  Split off matchVectorShuffleAsBlend from lowerVectorShuffleAsBlend for reuse in combining.
  llvm-svn: 298914
* Wdocumentation fix | Simon Pilgrim | 2017-03-28 | 1 | -1/+0
  llvm-svn: 298911
* [X86][SSE] Set second operand to undef instead of first operand in unary shuffle combines. | Simon Pilgrim | 2017-03-28 | 1 | -1/+2
  The copy isn't necessary after the matchVectorShuffleWithUNPCK refactor, and an undef value will make some future undef/zero handling easier.
  llvm-svn: 298910
* Strip trailing whitespace | Simon Pilgrim | 2017-03-28 | 1 | -1/+1
  llvm-svn: 298909
* [AArch64] [Assembler] option to disable negative immediate conversions | Sanne Wouda | 2017-03-28 | 4 | -10/+30
  Summary: Similar to the ARM target in https://reviews.llvm.org/rL298380, this patch adds identical infrastructure for disabling negative immediate conversions, and converts the existing aliases to the new infrastructure.
  Reviewers: rengolin, javed.absar, olista01, SjoerdMeijer, samparker
  Reviewed By: samparker
  Subscribers: samparker, aemerson, llvm-commits
  Differential Revision: https://reviews.llvm.org/D31243
  llvm-svn: 298908
* [GlobalISel][X86] support G_FRAME_INDEX instruction selection. | Igor Breger | 2017-03-28 | 2 | -22/+102
  Summary: G_LOAD/G_STORE, add alternative RegisterBank mapping. For G_LOAD, Fast and Greedy mode choose the same RegisterBank mapping (GprRegBank) for the G_LOAD + G_FADD, so they can't get rid of the cross register bank copy GprRegBank->VecRegBank.
  Reviewers: zvi, rovka, qcolombet, ab
  Reviewed By: zvi
  Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank
  Differential Revision: https://reviews.llvm.org/D30979
  llvm-svn: 298907
* [AMDGPU] Update SI scheduler colorHighLatenciesGroups | Valery Pykhtin | 2017-03-28 | 2 | -22/+100
  Depends on rL298896: MachineScheduler/ScheduleDAG: Add support for GetSubGraph
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30152
  llvm-svn: 298902
* Dont emit Mapping symbols for sections that contain only data. | Weiming Zhao | 2017-03-28 | 1 | -14/+68
  Summary: Don't emit mapping symbols for sections that contain only data.
  Patched by Shankar Easwaran <shankare@codeaurora.org>
  Reviewers: rengolin, peter.smith, weimingz, kparzysz, t.p.northover
  Reviewed By: t.p.northover
  Subscribers: t.p.northover, llvm-commits
  Differential Revision: https://reviews.llvm.org/D30724
  llvm-svn: 298901
* Remove an oddly unnecessary temporary. | Eric Christopher | 2017-03-27 | 1 | -2/+1
  llvm-svn: 298888
* Improve machine schedulers for in-order processors | Javed Absar | 2017-03-27 | 1 | -1/+4
  This patch enables schedulers to specify instructions that cannot be issued with any other instructions. It also fixes BeginGroup/EndGroup.
  Reviewed by: Andrew Trick
  Differential Revision: https://reviews.llvm.org/D30744
  llvm-svn: 298885
* [AMDGPU] SISched: Detect dependency types between blocks | Valery Pykhtin | 2017-03-27 | 2 | -26/+39
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30153
  llvm-svn: 298872
* [GlobalISel][AArch64] Extract a variable out of an NDEBUG block. NFC. | Ahmed Bougacha | 2017-03-27 | 1 | -2/+2
  r298863 used PtrReg, but that's never defined in release builds. Fix it.
  llvm-svn: 298869
* [GlobalISel][AArch64] Fold FI into LDR/STR ui addressing mode. | Ahmed Bougacha | 2017-03-27 | 1 | -0/+5
  A majority of loads and stores at O0 access an alloca. It's trivial to fold the G_FRAME_INDEX into the instruction; do it.
  llvm-svn: 298864
* [GlobalISel][AArch64] Fold G_GEP into LDR/STR ui addressing mode. | Ahmed Bougacha | 2017-03-27 | 1 | -1/+19
  We're not to the point of supporting the load/store patterns yet (because they extensively use PatFrags). But in the meantime, we can implement some of the simplest addressing modes.
  llvm-svn: 298863
* [GlobalISel][AArch64] Select store of zero to WZR/XZR. | Ahmed Bougacha | 2017-03-27 | 1 | -0/+11
  These occur very frequently, and are quite trivial to catch.
  llvm-svn: 298862
* [AMDGPU] SISched: Update colorEndsAccordingToDependencies | Valery Pykhtin | 2017-03-27 | 1 | -0/+14
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30150
  llvm-svn: 298861
* [AMDGPU] Fix SI scheduler LiveOut Refcount issue | Valery Pykhtin | 2017-03-27 | 2 | -0/+26
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30145
  llvm-svn: 298857