path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [Support] Add a function to check if a file resides locally. | Zachary Turner | 2017-02-21 | 3 | -21/+153
  Differential Revision: https://reviews.llvm.org/D30010
  llvm-svn: 295768
* Make default value for disable-licm-promotion in licm explicit. | Xin Tong | 2017-02-21 | 1 | -1/+2
  llvm-svn: 295767
* Don't modify archive members unless really needed. | Rafael Espindola | 2017-02-21 | 1 | -18/+31
  For whatever reason ld64 requires that member headers (not the members themselves) should be aligned. The only way to do that is to edit the previous member so that it ends at an aligned boundary.
  Since modifying data put in an archive is an undesirable property, llvm-ar should only do it when it is absolutely necessary.
  llvm-svn: 295765
* Fix PR31896. | Evgeniy Stepanov | 2017-02-21 | 1 | -5/+8
  Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset).
  llvm-svn: 295762
* Try to fix line endings. | Zachary Turner | 2017-02-21 | 1 | -457/+457
  llvm-svn: 295759
* [InstCombine] canonicalize non-obvious forms of integer min/max | Sanjay Patel | 2017-02-21 | 1 | -17/+24
  This is part of trying to clean up our handling of min/max patterns in IR. By converting these to canonical form, we're more likely to recognize them because there are various places in InstCombine that don't use matchSelectPattern or m_SMax and friends.
  The backend fixups referenced in the now deleted TODO comment were added with:
  https://reviews.llvm.org/rL291392
  https://reviews.llvm.org/rL289738
  If there's any codegen fallout from this change, we should be able to address it in DAGCombiner or target-specific lowering.
  llvm-svn: 295758
* Remove svn:eol-style property from 2 files. | Zachary Turner | 2017-02-21 | 1 | -457/+457
  There are still over 3400 files remaining with this property set, but there are tens of thousands more with the property not set. Until we decide what to do on a global scale, this at least unblocks me temporarily.
  llvm-svn: 295756
* AMDGPU: Remove llvm.AMDGPU.flbit intrinsic | Matt Arsenault | 2017-02-21 | 2 | -4/+0
  llvm-svn: 295754
* AMDGPU: Don't use stack space for SGPR->VGPR spills | Matt Arsenault | 2017-02-21 | 8 | -90/+225
  Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after.
  I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later.
  The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted.
  llvm-svn: 295753
* [LoopSimplify] Simplify how we compute UniqueExit | Xin Tong | 2017-02-21 | 1 | -8/+1
  Summary: Simplify how we compute UniqueExit. Reuse ExitBlockSet.
  Reviewers: sanjoy, efriedma, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30182
  llvm-svn: 295751
* Teach the IR verifier to reject conflicting debug info for function arguments. | Adrian Prantl | 2017-02-21 | 1 | -0/+38
  Conflicting debug info for function arguments causes hard-to-debug assertions in the DWARF backend, so the Verifier should reject it. For performance reasons this only checks function arguments from non-inlined debug intrinsics for now.
  rdar://problem/30520286
  llvm-svn: 295749
* [CodeGenPrepare] Sink and duplicate more 'and' instructions. | Geoff Berry | 2017-02-21 | 6 | -81/+111
  Summary: Rework the code that was sinking/duplicating (icmp and, 0) sequences into blocks where they were being used by conditional branches to form more tbz instructions on AArch64. The new code is more general in that it just looks for 'and's that have all icmp 0's as users, with a target hook used to select which subset of 'and' instructions to consider. This change also enables 'and' sinking for X86, where it is more widely beneficial than on AArch64.
  The 'and' sinking/duplicating code is moved into the optimizeInst phase of CodeGenPrepare, where it can take advantage of the fact the OptimizeCmpExpression has already sunk/duplicated any icmps into the blocks where they are used.
  One minor complication from this change is that optimizeLoadExt needed to be updated to always mark 'and's it has determined should be in the same block as their feeding load in the InsertedInsts set to avoid an infinite loop of hoisting and sinking the same 'and'.
  This change fixes a regression on X86 in the tsan runtime caused by moving GVNHoist to a later place in the optimization pipeline (see PR31382).
  Reviewers: t.p.northover, qcolombet, MatzeB
  Subscribers: aemerson, mcrosier, sebpop, llvm-commits
  Differential Revision: https://reviews.llvm.org/D28813
  llvm-svn: 295746
* [X86] EltsFromConsecutiveLoads SDLoc argument should be const&. | Simon Pilgrim | 2017-02-21 | 1 | -1/+1
  There appears never to have been a time that the reference was updated.
  llvm-svn: 295739
* Do not leak OpenedHandles. | Vassil Vassilev | 2017-02-21 | 2 | -7/+4
  Reviewed by Vedant Kumar (D30178)
  llvm-svn: 295737
* [X86][AVX2] Fix VPBROADCASTQ folding on 32-bit targets. | Simon Pilgrim | 2017-02-21 | 2 | -0/+16
  As i64 isn't a value type on 32-bit targets, we need to fold the VZEXT_LOAD into VPBROADCASTQ.
  llvm-svn: 295733
* [ARM] Correct SP/PC handling in t2MOVr | John Brawn | 2017-02-21 | 2 | -4/+20
  PC isn't allowed in the source operand of t2MOVr, so change the register class to one without PC. SP handling is slightly trickier and changes depending on if we're in ARMv8, so do that in checkTargetMatchPredicate.
  Differential Revision: https://reviews.llvm.org/D30199
  llvm-svn: 295732
* [X86][SSE] Prefer to combine shuffles to VZEXT over VZEXT_MOVL. | Simon Pilgrim | 2017-02-21 | 1 | -9/+9
  This matches what is already done during shuffle lowering and helps prevent the need for a zero-vector in cases where shuffles match both patterns.
  llvm-svn: 295723
* [InstCombine] Do not exercise nested max/min pattern on abs | Anna Thomas | 2017-02-21 | 1 | -1/+3
  Summary: This is a fix for an assertion failure in `getInverseMinMaxSelectPattern` when ABS is passed in as a select pattern. We should not be invoking the simplification rule for ABS(MIN(~x,y)) or ABS(MAX(~x,y)) combinations.
  Added a test case which would cause an assertion failure without the patch.
  Reviewers: sanjoy, majnemer
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30051
  llvm-svn: 295719
* [AVX512] Fix EXTRACT_VECTOR_ELT for v2i1/v4i1/v32i1/v64i1 with variable index. | Igor Breger | 2017-02-21 | 1 | -3/+7
  Differential Revision: https://reviews.llvm.org/D30189
  llvm-svn: 295718
* [ARM] GlobalISel: Lower calls to void() functions | Diana Picus | 2017-02-21 | 2 | -0/+39
  For now, we hardcode a BLX instruction, and generate an ADJCALLSTACKDOWN/UP pair with amount 0.
  llvm-svn: 295716
* The patch introduces new way of narrowing complex (>UINT16 variants) solutions. | Evgeny Stupachenko | 2017-02-21 | 1 | -1/+159
  The new method is introduced under the "-lsr-exp-narrow" option (currently set to true).
  Summary: The method is based on registers number mathematical expectation and should be generally closer to optimal solution. Please see details in comments to "LSRInstance::NarrowSearchSpaceByDeletingCostlyFormulas()" function (in lib/Transforms/Scalar/LoopStrengthReduce.cpp).
  Reviewers: qcolombet
  Differential Revision: http://reviews.llvm.org/D29862
  From: Evgeny Stupachenko <evstupac@gmail.com>
  llvm-svn: 295704
* [X86] Use SHLD with both inputs from the same register to implement rotate on Sandy Bridge and later Intel CPUs | Craig Topper | 2017-02-21 | 5 | -1/+26
  Summary: Sandy Bridge and later CPUs have better throughput using a SHLD to implement rotate versus the normal rotate instructions. Additionally it saves one uop and avoids a partial flag update dependency.
  This patch implements this change on any Sandy Bridge or later processor without BMI2 instructions. With BMI2 we will use RORX as we currently do.
  Reviewers: zvi
  Reviewed By: zvi
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30181
  llvm-svn: 295697
* [X86] Fix formatting. NFC | Craig Topper | 2017-02-21 | 1 | -1/+1
  llvm-svn: 295695
* [AVX-512] Use sse_load_f32/f64 in place of scalar_to_vector and scalar load in some patterns. | Craig Topper | 2017-02-21 | 1 | -15/+18
  llvm-svn: 295693
* [AVX-512] Fix the ExeDomain for vcmpss/vcmpsd. | Craig Topper | 2017-02-21 | 1 | -0/+2
  llvm-svn: 295691
* [ValueTracking] clang-format a section I'm about to touch; NFC | Sanjoy Das | 2017-02-21 | 1 | -64/+64
  (Whitespace only change)
  llvm-svn: 295690
* ScheduleDAG: Cleanup; NFC | Matthias Braun | 2017-02-21 | 1 | -184/+133
  - Fix doxygen comments (do not repeat documented name, remove definition comment if there is already one at the declaration, add \p, ...)
  - Add some const modifiers
  - Use range based for
  llvm-svn: 295688
* SubtargetFeature: Cleanup; NFC | Matthias Braun | 2017-02-21 | 1 | -65/+31
  - Fix doxygen comments
  - Remove duplicated comments
  - Remove section comments (which became wrong over time)
  - Use more `const` and references but less `auto`
  llvm-svn: 295687
* Add a wrapper around copy_if in STLExtras; NFC | Sanjoy Das | 2017-02-21 | 4 | -42/+38
  I will add one more use for this in a later change.
  llvm-svn: 295685
* [BranchFolding] Update debug location along with the update of branch instruction. | Taewook Oh | 2017-02-21 | 1 | -3/+3
  Summary: Currently, BranchFolder drops DebugLoc for branch instructions in some places. For example, for the test code attached, the branch instruction of 'entry' block has a DILocation of
  ```
  !12 = !DILocation(line: 6, column: 3, scope: !11)
  ```
  but this information is gone when the block is lowered because BranchFolder misses it. This patch is a fix for this issue.
  Reviewers: qcolombet, aprantl, craig.topper, MatzeB
  Reviewed By: aprantl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29902
  llvm-svn: 295684
* [IndVars] Add an assert | Sanjoy Das | 2017-02-20 | 1 | -0/+3
  We've already checked that the loop is in simplify form before, but a little paranoia never hurt anyone.
  llvm-svn: 295680
* [IR/Verifier] List the CU we weren't able to find in `llvm.dbg.cu`. | Davide Italiano | 2017-02-20 | 1 | -4/+2
  llvm-svn: 295678
* MemorySSA: Add support for renaming uses in the updater. | Daniel Berlin | 2017-02-20 | 2 | -25/+68
  Summary: This lets one add aliasing stores to the updater. (I'm next going to move the creation/etc functions to the updater)
  Reviewers: george.burgess.iv
  Subscribers: llvm-commits, Prazek
  Differential Revision: https://reviews.llvm.org/D30154
  llvm-svn: 295677
* [AVX-512] Add a few more patterns for selecting masked vpternlog with broadcast loads where the passthru operand is not operand 0. | Craig Topper | 2017-02-20 | 1 | -0/+25
  llvm-svn: 295673
* [X86] Tidyup combineExtractVectorElt. NFCI. | Simon Pilgrim | 2017-02-20 | 1 | -8/+9
  Pull out repeated code for extraction index operand and source vector value type.
  Use isNullConstant helper to check for zero extraction index.
  llvm-svn: 295670
* [ARM] GlobalISel: Don't select atomic loads | Diana Picus | 2017-02-20 | 1 | -0/+6
  There used to be a check in the IRTranslator that prevented us from having to deal with atomic loads/stores. That check has been removed in r294993 and the AArch64 backend was updated accordingly. This commit does the same thing for the ARM backend.
  In general, in the ARM backend we introduce fences during the atomic expand pass, so we don't have to worry about atomics, *except* for the 32-bit ARMv8 target, which handles atomics more like AArch64. Since we don't want to worry about that yet, just bail out of instruction selection if we find any atomic loads.
  llvm-svn: 295662
* [X86] Fix EXTRACT_VECTOR_ELT with variable index from v32i16 and v64i8 vector. | Igor Breger | 2017-02-20 | 5 | -49/+30
  It's more profitable to go through memory (1 cycle throughput) than to use a VMOVD + VPERMV/PSHUFB sequence (2/3 cycles throughput) to implement EXTRACT_VECTOR_ELT with variable index.
  The IACA tool was used to get the performance estimate (https://software.intel.com/en-us/articles/intel-architecture-code-analyzer).
  For example, for the var_shuffle_v16i8_v16i8_xxxxxxxxxxxxxxxx_i8 test from vector-shuffle-variable-128.ll I get 26 cycles vs 79 cycles.
  Removing the VINSERT node, we don't need it any more.
  Differential Revision: https://reviews.llvm.org/D29690
  llvm-svn: 295660
* [X86][AVX512] Add support for ASHR v2i64/v4i64 without VLX | Simon Pilgrim | 2017-02-20 | 2 | -1/+28
  Use v8i64 ASHR instructions if we don't have VLX.
  Differential Revision: https://reviews.llvm.org/D28537
  llvm-svn: 295656
* Strip trailing whitespace. | Simon Pilgrim | 2017-02-20 | 1 | -1/+1
  llvm-svn: 295653
* [SelectionDAG] Add scalarization support for ISD::*_EXTEND_VECTOR_INREG opcodes. | Simon Pilgrim | 2017-02-20 | 2 | -0/+34
  Thanks to Mikael Holmén for the initial test case
  llvm-svn: 295652
* AArch64AsmParser: tablegen the isBranchTarget helper functions | Sjoerd Meijer | 2017-02-20 | 2 | -37/+18
  Use tablegen to autogenerate isBranchTarget helper functions. This is a cleanup that removes almost identical functions that differ only in a few constants.
  Differential Revision: https://reviews.llvm.org/D30160
  llvm-svn: 295649
* [X86][AVX] Extend hasVEX_WPrefix bit to accept WIG value (W Ignore) + update all AVX instructions with the new value. | Ayman Musa | 2017-02-20 | 2 | -304/+306
  Add WIG value to all of AVX instructions which ignore the W-bit in their encoding, instead of giving them the default value of 0.
  This patch is needed for a follow up work on EVEX2VEX pass (replacing EVEX encoded instructions with their corresponding VEX version when possible).
  Differential Revision: https://reviews.llvm.org/D29876
  llvm-svn: 295643
* [SLP] nullptr'ize initial value in `findBuildAggregate()`, NFC. | Alexey Bataev | 2017-02-20 | 1 | -1/+1
  Initial value of V is set to nullptr, as it is not used.
  llvm-svn: 295642
* [SLP] Rework `findBuildAggregate()` from recursive form to iterative, NFC. | Alexey Bataev | 2017-02-20 | 1 | -9/+12
  Reviewers: mkuper
  Subscribers: llvm-commits, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D30103
  llvm-svn: 295641
* [AVX-512] Add more patterns to fold masked VPTERNLOG with load when the passthru isn't operand 0. | Craig Topper | 2017-02-20 | 1 | -0/+50
  llvm-svn: 295640
* [AVX-512] Fix mistake in the immediate swizzle for some of the VPTERNLOG patterns. | Craig Topper | 2017-02-20 | 1 | -2/+2
  llvm-svn: 295638
* [Orc] Rename ObjectLinkingLayer -> RTDyldObjectLinkingLayer. | Lang Hames | 2017-02-20 | 2 | -6/+6
  The current ObjectLinkingLayer (now RTDyldObjectLinkingLayer) links objects in-process using MCJIT's RuntimeDyld class. In the near future I hope to add new object linking layers (e.g. a remote linking layer that links objects in the JIT target process, rather than the client), so I'm renaming this class to be more descriptive.
  llvm-svn: 295636
* [AVX-512] Add more VPTERNLOG patterns to enable folding of broadcast loads that aren't in operand 2. | Craig Topper | 2017-02-20 | 1 | -0/+39
  llvm-svn: 295634
* [X86] Use memory form of shift right by 1 when the rotl immediate is one less than the operation size. | Craig Topper | 2017-02-20 | 1 | -4/+4
  An earlier commit already did this for the register form.
  llvm-svn: 295626
* [AVX-512] Remove AddedComplexity from masked operations. The size of the patterns already increases their priority. | Craig Topper | 2017-02-19 | 1 | -32/+16
  llvm-svn: 295619