summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in ↵Michael Zuckerman2017-10-021-109/+169
| | | | | | | | | | | | | | | | | | | X86InterleavedAccess (VF64 stride 3-4) I continue to support different VF interleaved and in this pass for this patch, I added the vf64 stride3 support for both load and store. I also added support fot the stride4 store. Reviewers: 1. zvi 2. dorit 3. igorb 4. guyblank Differential Revision: https://reviews.llvm.org/D37687 Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b llvm-svn: 314651
* [X86] Fix copy pasto in X86FastISel::fastEmitInst_rrrr.Craig Topper2017-10-021-1/+1
| | | | | | The 4th operand was not being constrained and the third operand was being constrained twice. llvm-svn: 314648
* [X86] Use a bool flag instead of assigning an unsigned to two different ↵Craig Topper2017-10-021-9/+8
| | | | | | values that we only use in an equality comparison. llvm-svn: 314647
* [X86] Use _NOREX MOVZX instructions for some patterns even in 32-bit mode.Craig Topper2017-10-021-32/+6
| | | | | | This unifies the patterns between both modes. This should be effectively NFC since all the available registers in 32-bit mode statisfy this constraint. llvm-svn: 314643
* [Hexagon] Check vector elements for equivalence in the ↵Ron Lieberman2017-10-021-1/+16
| | | | | | | | | | | | | HexagonVectorLoopCarriedReuse pass If the two instructions being compared for equivalence have corresponding operands that are integer constants, then check their values to determine equivalence. Patch by Suyog Sarda! llvm-svn: 314642
* [Hexagon] Patch to Extract i1 element from vector of i1Ron Lieberman2017-10-021-1/+7
| | | | | | | This patch extracts 1 element from vector consisting of elements of size 1 bit at given index. llvm-svn: 314641
* [InstCombine] Use APInt for all the math in foldICmpDivConstantCraig Topper2017-10-011-95/+46
| | | | | | | | | | | | | | Summary: This currently uses ConstantExpr to do its math, but as noted in a TODO it can all be done directly on APInt. Reviewers: spatel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38440 llvm-svn: 314640
* [X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMemCraig Topper2017-10-018-34/+33
| | | | | | | | | | | | | | | | | | | Summary: Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable. For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem. I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995. Reviewers: aymanmus, RKSimon, zvi Reviewed By: aymanmus Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38120 llvm-svn: 314639
* [X86] Remove a couple unnecessary COPY_TO_REGCLASS from some output patterns ↵Craig Topper2017-10-011-9/+7
| | | | | | where the instruction already produces the correct register class. llvm-svn: 314638
* [X86][SSE] Add faux shuffle combining support for PACKUSSimon Pilgrim2017-10-011-4/+15
| | | | llvm-svn: 314631
* [X86][SSE] Improve shuffle combining of PACKSS instructions.Simon Pilgrim2017-10-011-6/+24
| | | | | | Support unary packing and fix the faux shuffle mask for vectors larger than 128 bits. llvm-svn: 314629
* [x86] formatting; NFCSanjay Patel2017-10-011-4/+2
| | | | llvm-svn: 314627
* Revert r314579: "Recommi r314561 after fixing over-debug assertion".Daniel Jasper2017-10-012-225/+0
| | | | | | | | | | | | And follow-up r314585. Leads to segfaults. I'll forward reproduction instructions to the patch author. Also, for a recommit, still add the original patch description. Otherwise, it becomes really tedious to find out what a patch actually does. The fact that it is a recommit with a fix is somewhat secondary. llvm-svn: 314622
* Separate the logic when handling indirect calls in SamplePGO ThinLTO compile ↵Dehao Chen2017-10-012-13/+28
| | | | | | | | | | | | | | | | phase and other phases. Summary: In SamplePGO ThinLTO compile phase, we will not invoke ICP as it may introduce confusion to the 2nd annotation. This patch extracted that logic and makes it clearer before profile annotation. In the mean time, we need to make function importing process both inlined callsites as well as not promoted indirect callsites. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D38094 llvm-svn: 314619
* Revert "Fix typo [NFC]"Xin Tong2017-10-011-6/+3
| | | | | | | | This reverts commit e60b5028619be1c81bd039d63a0627dac32d38f9. Incorrectly include changes that are not typo fix. llvm-svn: 314614
* Fix typo [NFC]Xin Tong2017-10-011-3/+6
| | | | llvm-svn: 314613
* NewGVN: Fix PR 34473, by not using ExactlyEqualsExpression for findingDaniel Berlin2017-09-301-6/+6
| | | | | | phi of ops users. llvm-svn: 314612
* NewGVN: Evaluate phi of ops expressions before creating phi nodeDaniel Berlin2017-09-301-48/+72
| | | | llvm-svn: 314611
* NewGVN: Allow dependent PHI of opsDaniel Berlin2017-09-301-57/+100
| | | | llvm-svn: 314610
* NewGVN: Make OpIsSafeForPhiOfOps non-recursiveDaniel Berlin2017-09-301-7/+38
| | | | llvm-svn: 314609
* Refactor the SamplePGO profile annotation logic to extract ↵Dehao Chen2017-09-301-58/+65
| | | | | | inlineCallInstruction. (NFC) llvm-svn: 314601
* [X86][SSE] Fold (VSRAI (VSHLI X, C1), C1) --> X iff NumSignBits(X) > C1Simon Pilgrim2017-09-301-0/+9
| | | | | | Remove sign extend in register style pattern if the sign is already extended enough llvm-svn: 314599
* [AVX-512] Add patterns to make fp compare instructions commutable during isel.Craig Topper2017-09-302-1/+90
| | | | llvm-svn: 314598
* Code refactoring for the interleaved code <NFC>Michael Zuckerman2017-09-301-28/+18
| | | | | Change-Id: I7831c9febad8e14278a5bc87584a0053dc837be1 llvm-svn: 314596
* Revert r314435: "[JumpThreading] Preserve DT and LVI across the pass"Daniel Jasper2017-09-303-253/+65
| | | | | | | Causes a segfault on a builtbot (and in our internal bootstrapping of Clang). See Eli's response on the commit thread. llvm-svn: 314589
* Fix buildbot failure -- tighten type check for matching phiXinliang David Li2017-09-301-1/+1
| | | | llvm-svn: 314585
* [X86] Support v64i8 mulhu/mulhsCraig Topper2017-09-301-1/+9
| | | | | | | | Implemented by splitting into two v32i8 mulhu/mulhs and concatenating the results. Differential Revision: https://reviews.llvm.org/D38307 llvm-svn: 314584
* Recommi r314561 after fixing over-debug assertionXinliang David Li2017-09-302-0/+225
| | | | llvm-svn: 314579
* [AMDGPU] Set fast-math flags on functions given the optionsStanislav Mekhanoshin2017-09-293-7/+36
| | | | | | | | | | | | | | | | We have a single library build without relaxation options. When inlined library functions remove fast math attributes from the functions they are integrated into. This patch sets relaxation attributes on the functions after linking provided corresponding relaxation options are given. Math instructions inside the inlined functions remain to have no fast flags, but inlining does not prevent fast math transformations of a surrounding caller code anymore. Differential Revision: https://reviews.llvm.org/D38325 llvm-svn: 314568
* CodeGen: Fix pointer info in expandUnalignedLoad/StoreYaxun Liu2017-09-291-12/+21
| | | | | | | | | | | | Currently expandUnalignedLoad/Store uses place holder pointer info for temporary memory operand in stack, which does not have correct address space. This causes unaligned private double16 load/store to be lowered to flat_load instead of buffer_load for amdgcn target. This fixes failures of OpenCL conformance test basic/vload_private/vstore_private on target amdgcn---amdgizcl. Differential Revision: https://reviews.llvm.org/D35361 llvm-svn: 314566
* Revert 314561 due to debug build assertion failureXinliang David Li2017-09-292-223/+0
| | | | llvm-svn: 314563
* Eliminate PHI (int typed) which has only one use by intptrXinliang David Li2017-09-292-0/+223
| | | | | | | | | | This patch will eliminate redundant intptr/ptrtoint that pessimizes analyses such as SCEV, AA and will make optimization passes such as auto-vectorization more powerful. Differential revision: http://reviews.llvm.org/D37832 llvm-svn: 314561
* Revert "Use the basic cost if a GEP is not used as addressing mode"Alex Shlyapnikov2017-09-293-7/+2
| | | | | | | | | | | | | | | | | | | | | | | This reverts commit r314517. This commit crashes sanitizer bots, for example: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167 Stack snippet: ... /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0 llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0 llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0 ... llvm-svn: 314560
* [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-09-2910-137/+204
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 314559
* [LV] Use correct insertion point when type shrinking reductionsMatthew Simpson2017-09-291-1/+2
| | | | | | | | | | | | | When type shrinking reductions, we should insert the truncations and extends at the end of the loop latch block. Previously, these instructions were inserted at the end of the loop header block. The difference is only a problem for loops with predicated instructions (e.g., conditional stores and instructions that may divide by zero). For these instructions, we create new basic blocks inside the vectorized loop, which cause the loop header and latch to no longer be the same block. This should fix PR34687. Reference: https://bugs.llvm.org/show_bug.cgi?id=34687 llvm-svn: 314542
* [WebAssembly] Allow each data segment to specify its own alignmentSam Clegg2017-09-293-23/+31
| | | | | | | | | Also, add a flags field as we will almost certainly be needing that soon too. Differential Revision: https://reviews.llvm.org/D38296 llvm-svn: 314534
* [SimplifyIndVar] Do not fail when we constant fold an IV user to ↵Hongbin Zheng2017-09-291-10/+17
| | | | | | | | | | ConstantPointerNull The type of a SCEVConstant may not match the corresponding LLVM Value. In this case, we skip the constant folding for now. TODO: Replace ConstantInt Zero by ConstantPointerNull llvm-svn: 314531
* [dwarfdump][NFC] Consistent printing of address rangesJonas Devlieghere2017-09-292-8/+8
| | | | | | | | | | | | | | This implement the insertion operator for DWARF address ranges so they are consistently printed as [LowPC, HighPC). While a dump method might have felt more consistent, it is used exclusively for printing error messages in the verifier and never used for actual dumping. Hence this approach is more intuitive and creates less clutter at the call sites. Differential revision: https://reviews.llvm.org/D38395 llvm-svn: 314523
* AMDGPU: VALU carry-in and v_cndmask condition cannot be EXECNicolai Haehnle2017-09-295-11/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | The hardware will only forward EXEC_LO; the high 32 bits will be zero. Additionally, inline constants do not work. At least, v_addc_u32_e64 v0, vcc, v0, v1, -1 which could conceivably be used to combine (v0 + v1 + 1) into a single instruction, acts as if all carry-in bits are zero. The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine s_mov_b64 s[0:1], exec v_cndmask_b32_e64 v0, v1, v2, s[0:1] into v_mov_b32 v0, v3 but it's not particularly high priority. Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.* llvm-svn: 314522
* Use the basic cost if a GEP is not used as addressing modeJun Bum Lim2017-09-293-2/+7
| | | | | | | | | | | | | | | | | | | Summary: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng Reviewed By: hfinkel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38085 llvm-svn: 314517
* [SystemZ] implement shouldCoalesce()Jonas Paulsson2017-09-297-5/+91
| | | | | | | | | | | | | | | | | | | Implement shouldCoalesce() to help regalloc avoid running out of GR128 registers. If a COPY involving a subreg of a GR128 is coalesced, the live range of the GR128 virtual register will be extended. If this happens where there are enough phys-reg clobbers present, regalloc will run out of registers (if there is not a single GR128 allocatable register available). This patch tries to allow coalescing only when it can prove that this will be safe by checking the (local) interval in question. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D37899 https://bugs.llvm.org/show_bug.cgi?id=34610 llvm-svn: 314516
* [X86] Improve codegen for inverted overflow checking intrinsics.Amara Emerson2017-09-291-0/+20
| | | | | | | | Adds a new combine for: xor(setcc cc, val), 1 --> setcc (invert(cc), val) Differential Revision: https://reviews.llvm.org/D38161 llvm-svn: 314514
* [ARM] v8.3-a complex number supportSam Parker2017-09-297-2/+298
| | | | | | | | | | | | | | | New instructions are added to AArch32 and AArch64 to aid floating-point multiplication and addition of complex numbers, where the complex numbers are packed in a vector register as a pair of elements. The Imaginary part of the number is placed in the more significant element, and the Real part of the number is placed in the less significant element. This patch adds assembler for the ARM target. Differential Revision: https://reviews.llvm.org/D36789 llvm-svn: 314511
* Small modification <NFC>Michael Zuckerman2017-09-291-1/+1
| | | | | Change-Id: I360abccee12cae29bd2ac4f8399c9ecc92eb7f13 llvm-svn: 314510
* [mips] Reordering callseq* nodes to be linearAleksandar Beserminji2017-09-292-26/+27
| | | | | | | | | | | | | Fix nested callseq* nodes by moving callseq_start after the arguments calculation to temporary registers, so that callseq* nodes in resulting DAG are linear. Recommitting r314497. This version does not contain test which fails when compiler is not build in debug mode. Differential Revision: https://reviews.llvm.org/D37328 llvm-svn: 314507
* Revert "[mips] Reordering callseq* nodes to be linear"Aleksandar Beserminji2017-09-292-27/+26
| | | | | | | | | Added test relies on the compiler being built in debug mode, which may not be the case. This reverts commit r314497. llvm-svn: 314506
* [mips] Add missing license info, formatting changes. NFCISimon Dardis2017-09-291-30/+47
| | | | | | | | Add missing license information to MicroMipsInstrFPU.td and fix most of the formatting errors present. Others will be addressed in a follow up commits. llvm-svn: 314505
* [AMDGPU] calling conventions for AMDPAL OS typeTim Renouf2017-09-2910-2/+31
| | | | | | | | | | | | | | | Summary: This commit adds comments on how the AMDPAL OS type overloads the existing AMDGPU_ calling conventions used by Mesa, and adds a couple of new ones. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37752 llvm-svn: 314502
* [AMDGPU] AMDPAL scratch buffer supportTim Renouf2017-09-295-12/+95
| | | | | | | | | | | | | | | | | | | | | | | Summary: Added support for scratch (including spilling) for OS type amdpal: generates code to set up the scratch descriptor if it is needed. With amdpal, the scratch resource descriptor is loaded from offset 0 of the global information table. The low 32 bits of the address of the global information table is passed in s0. Added amdgpu-git-ptr-high function attribute to hard-wire the high 32 bits of the address of the global information table. If the function attribute is not specified, or is 0xffffffff, then the backend generates code to use the high 32 bits of pc. The documentation for the AMDPAL ABI will be added in a later commit. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye Differential Revision: https://reviews.llvm.org/D37483 llvm-svn: 314501
* [Triple] Add AMDPAL operating system typeTim Renouf2017-09-292-0/+6
| | | | | | | | | | | | | | | | | | Summary: This operating system type represents the AMDGPU PAL runtime, and will be required by the AMDGPU backend in order to generate correct code for this runtime. Currently it generates the same code as not specifying an OS at all. That will change in future commits. Patch from Tim Corringham. Subscribers: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D37380 llvm-svn: 314500
OpenPOWER on IntegriCloud