summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [ARM] FullFP16 LowerReturn FixSjoerd Meijer2018-02-011-2/+2
| | | | | | | | | | Commit r323512 introduced an optimisation in LowerReturn for half-precision return values. A missing check caused a crash when the return value is "undef" (i.e. a node that has no operands). Differential Revision: https://reviews.llvm.org/D42743 llvm-svn: 323968
* Revert commit rL323951David Green2018-02-011-6/+2
| | | | | | | Looks like it's causing timeouts out on at least ppc64le buildbots. llvm-svn: 323959
* [mips] Include EVA instructions in Std2MicroMips mapping tablesAleksandar Beserminji2018-02-014-23/+38
| | | | | | | | | This patch includes EVA instructions in the Std2MicroMips mapping tables, which is required for direct object emission. Differential Revision: https://reviews.llvm.org/D41771 llvm-svn: 323958
* [AArch64][NFC] Make all ProcResource definitions include their SchedModel.Clement Courbet2018-02-013-42/+35
| | | | | | | This makes targets ExynosM1,ExynosM3,ThunderX2T99 consistent with all other targets. llvm-svn: 323955
* [ARM] Add support for unpredictable MVN instructions.Yvan Roux2018-02-011-2/+8
| | | | | | | | | | | | | | | This fixes bugzilla 33011 https://bugs.llvm.org/show_bug.cgi?id=33011 Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in sections A8.8.116 and A8.8.117. It fixes also the usage of PC register as destination register for MVN register-shifted register version as specified in A8.8.117. Differential Revision: https://reviews.llvm.org/D41905 llvm-svn: 323954
* [InstCombine] Allow common type conversions to i8/i16/i32David Green2018-02-011-2/+6
| | | | | | | | | | | This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 323951
* Test commit: Fix a comment.Yvan Roux2018-02-011-1/+1
| | | | llvm-svn: 323947
* [LSR] Don't force bases of foldable formulae to the final type.Mikael Holmen2018-02-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before emitting code for scaled registers, we prevent SCEVExpander from hoisting any scaled addressing mode by emitting all the bases first. However, these bases are being forced to the final type, resulting in some odd code. For example, if the type of the base is an integer and the final type is a pointer, we will emit an inttoptr for the base, a ptrtoint for the scale, and then a 'reverse' GEP where the GEP pointer is actually the base integer and the index is the pointer. It's more intuitive to use the pointer as a pointer and the integer as index. Patch by: Bevin Hansson Reviewers: atrick, qcolombet, sanjoy Reviewed By: qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42103 llvm-svn: 323946
* [XRay][compiler-rt+llvm] Update XRay register stashing semanticsDean Michael Berris2018-02-013-2/+17
| | | | | | | | | | | | | | | | | | | | | | Summary: This change expands the amount of registers stashed by the entry and `__xray_CustomEvent` trampolines. We've found that since the `__xray_CustomEvent` trampoline calls can show up in situations where the scratch registers are being used, and since we don't typically want to affect the code-gen around the disabled `__xray_customevent(...)` intrinsic calls, that we need to save and restore the state of even the scratch registers in the handling of these custom events. Reviewers: pcc, pelikan, dblaikie, eizan, kpw, echristo, chandlerc Reviewed By: echristo Subscribers: chandlerc, echristo, hiraditya, davide, dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D40894 llvm-svn: 323940
* [MC] Fix assembler infinite loop on EH table using LEB padding.Rafael Espindola2018-02-011-2/+6
| | | | | | | | | | | | | Fix the infinite loop reported in PR35809. It can occur with GCC-style EH table assembly, where the compiler relies on the assembler to calculate the offsets in the EH table. Also see https://sourceware.org/bugzilla/show_bug.cgi?id=4029 for the equivalent issue in the GNU assembler. Patch by Ryan Prichard! llvm-svn: 323934
* [GlobalOpt] Improve common case efficiency of static global initializer ↵Amara Emerson2018-01-311-2/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | evaluation For very, very large global initializers which can be statically evaluated, the code would create vectors of temporary Constants, modifying them in place, before committing the resulting Constant aggregate to the global's initializer value. This had effectively O(n^2) complexity in the size of the global initializer and would cause memory and non-termination issues compiling some workloads. This change performs the static initializer evaluation and creation in batches, once for each global in the evaluated IR memory. The existing code is maintained as a last resort when the initializers are more complex than simple values in a large aggregate. This should theoretically by NFC, no test as the example case is massive. The existing test cases pass with this, as well as the llvm test suite. To give an example, consider the following C++ code adapted from the clang regression tests: struct S { int n = 10; int m = 2 * n; S(int a) : n(a) {} }; template<typename T> struct U { T *r = &q; T q = 42; U *p = this; }; U<S> e; The global static constructor for 'e' will need to initialize 'r' and 'p' of the outer struct, while also initializing the inner 'q' structs 'n' and 'm' members. This batch algorithm will simply use general CommitValueTo() method to handle the complex nested S struct initialization of 'q', before processing the outermost members in a single batch. Using CommitValueTo() to handle member in the outer struct is inefficient when the struct/array is very large as we end up creating and destroy constant arrays for each initialization. For the above case, we expect the following IR to be generated: %struct.U = type { %struct.S*, %struct.S, %struct.U* } %struct.S = type { i32, i32 } @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e, i64 0, i32 1), %struct.S { i32 42, i32 84 }, %struct.U* @e } The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex constant expression, while the other two elements of @e are "simple". Differential Revision: https://reviews.llvm.org/D42612 llvm-svn: 323933
* DAG: Fix not truncating when promoting bswap/bitreverseMatt Arsenault2018-01-311-1/+2
| | | | | | | These need to convert back to the original type, like any other promotion. llvm-svn: 323932
* Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using ↵Evgeniy Stepanov2018-01-311-20/+0
| | | | | | | | | | bit-operations" Miscompiles code. Testcase pending. This reverts commit r323869. llvm-svn: 323929
* Utils: Fix DomTree update for entry blockMatt Arsenault2018-01-311-5/+14
| | | | | | | If SplitBlockPredecessors was used on a function entry block, it wouldn't update the dominator tree. llvm-svn: 323928
* AMDGPU: Fix missing SCC def from s_xor_b64_termMatt Arsenault2018-01-311-0/+1
| | | | llvm-svn: 323927
* [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf ↵Amjad Aboud2018-01-311-4/+12
| | | | | | | | | | | node correctly. This covers the case where TruncInst leaf node is a constant expression. See PR36121 for more details. Differential Revision: https://reviews.llvm.org/D42622 llvm-svn: 323926
* [X86] Make the type checks in detectAVX512USatPattern more robustCraig Topper2018-01-311-7/+9
| | | | | | | | | | This code currently uses isSimple and getSizeInBits in an attempt to prune types. But isSimple will return true for any type that any target supports natively. I don't think that's a good way to prune types. I also don't think the dest element type checks are very robust since we didn't do an isSimple check on the dest type. This patch adds a check for the input type being legal to the one caller that didn't already check that. Then we explicitly check the element types for the destination are i8, i16, or i32 Differential Revision: https://reviews.llvm.org/D42706 llvm-svn: 323924
* Followup on Proposal to move MIR physical register namespace to '$' sigil.Puyan Lotfi2018-01-312-10/+19
| | | | | | | | | | | | Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922
* [Hexagon] Rename HexagonISelLowering::getNode to getInstr, NFCKrzysztof Parzyszek2018-01-313-56/+56
| | | | llvm-svn: 323916
* [x86] Make the retpoline thunk insertion a machine function pass.Chandler Carruth2018-01-312-51/+86
| | | | | | | | | | | | | | | | | | | | Summary: This removes the need for a machine module pass using some deeply questionable hacks. This should address PR36123 which is a case where in full LTO the memory usage of a machine module pass actually ended up being significant. We should revert this on trunk as soon as we understand and fix the memory usage issue, but we should include this in any backports of retpolines themselves. Reviewers: echristo, MatzeB Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42726 llvm-svn: 323915
* [Hexagon] Implement HVX codegen for vector shiftsKrzysztof Parzyszek2018-01-314-67/+54
| | | | llvm-svn: 323914
* [Hexagon] Handle ANY_EXTEND_VECTOR_INREG in loweringKrzysztof Parzyszek2018-01-311-0/+4
| | | | llvm-svn: 323912
* [Hexagon] Handle SETCC on vector pairs in loweringKrzysztof Parzyszek2018-01-312-1/+16
| | | | llvm-svn: 323911
* [GlobalOpt] Fix exponential compile-time with selects.Eli Friedman2018-01-311-17/+16
| | | | | | | | | | | | | | | If you have a long chain of select instructions created from something like `int* p = &g; if (foo()) p += 4; if (foo2()) p += 4;` etc., a naive recursive visitor will recursively visit each select twice, which is O(2^N) in the number of select instructions. Use the visited set to cut off recursion in this case. (No testcase because this doesn't actually change the behavior, just the time.) Differential Revision: https://reviews.llvm.org/D42451 llvm-svn: 323910
* AMDGPU: Fold inline offset for loads properly in moveToVALU on GFX9Marek Olsak2018-01-311-22/+31
| | | | | | | | | | | | | | | | Summary: This enables load merging into x2, x4, which is driven by inline offsets. 6500 shaders are affected: Code Size in affected shaders: -15.14 % Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D42078 llvm-svn: 323909
* AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}Marek Olsak2018-01-316-8/+78
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908
* [SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPsMarek Olsak2018-01-311-0/+2
| | | | | | | | | | | | | | Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 llvm-svn: 323907
* [MachineOutliner] Freeze registers in new functionsGeoff Berry2018-01-311-0/+2
| | | | | | | | | | | | | | | Summary: Call MRI.freezeReservedRegs() on functions created during outlining so that calls to isReserved() by the verifier called after this pass won't assert. Reviewers: MatzeB, qcolombet, paquette Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42749 llvm-svn: 323905
* [WebAssembly] MC: Remove unused code for handling of wasm globalsSam Clegg2018-01-314-139/+13
| | | | | | | | | | | | | | | For now, we are not using wasm globals, except for modeling of the stack points. Alos, factor out common struct WasmGlobalType, which matches the name for that tuple in the Wasm spec and rename methods to "isBindingGlobal", "isTypeGlobal" to avoid ambiguity. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42750 llvm-svn: 323901
* [WebAssembly] MC: Resolve aliases when creating provisional table entriesSam Clegg2018-01-311-31/+25
| | | | | | | | | | | | | | | | This change is useful for the upcoming addition of the symbol table (D41954) since in that world aliases for given function all share the same function index. This change does not effect lld because it essentially ignores the wasm "table". The table exists only to the wasm objects will validate and disassembly meaningfully. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42095 llvm-svn: 323900
* [X86] Generate testl instruction through truncates.Amaury Sechet2018-01-311-59/+44
| | | | | | | | | | | | | | | Summary: This was introduced in D42646 but ended up being reverted because the original implementation was buggy. Depends on D42646 Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42741 llvm-svn: 323899
* [Analysis] Disable calls to *_finite and other glibc-only functions on Android.Chih-Hung Hsieh2018-01-311-12/+6
| | | | | | | | | | Since r322087, glibc's finite lib calls are generated when possible. However, they are not supported on Android. This change also disables other functions not available on Android. Differential Revision: http://reviews.llvm.org/D42668 llvm-svn: 323898
* [CodeGenPrepare] Improve source and dest alignments of memory intrinsics ↵Daniel Neilson2018-01-311-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | independently Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the CodeGenPrepare pass to be more aggressive in improving the source and destination alignments of memcpy/memmove/memset by exploiting our new ability to record independent alignments for each argument. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323891
* [Hexagon] Handle BUILD_VECTOR from undef values in buildHvxVectorRegKrzysztof Parzyszek2018-01-311-1/+4
| | | | llvm-svn: 323889
* [X86] Avoid using high register trick for test instructionAmaury Sechet2018-01-314-37/+0
| | | | | | | | | | | | | Summary: It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger. Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323888
* [Hexagon] Only process bitcasts of vsplats when selecting const vectorsKrzysztof Parzyszek2018-01-311-1/+6
| | | | | | | | Selecting of constant HVX vectors involves some "manual processing", which mishandled an unrelated BITCAST operation causing a selection error. llvm-svn: 323887
* [Lint] Upgrade uses of MemoryIntrinic::getAlignment() to new API. (NFCI)Daniel Neilson2018-01-311-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the Lint analysis to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323886
* [DWARF] Allow duplication of tails with CFI instructionsPetar Jovanovic2018-01-311-2/+16
| | | | | | | | | | | | | | | | | | This commit came as a result for revert of patch r317579 (originally committed as r317100). The patch made CFI instructions duplicable, because their existence in the epilogue block was affecting the Tail duplication pass. However, duplicating blocks with CFI instructions was an issue for compact unwind info on Darwin, which is why the patch was reverted. This patch allows duplicating tails with CFI instructions, though they are not duplicable, by copying them 'manually'. Patch by Djordje Kovacevic. Differential Revision: https://reviews.llvm.org/D40979 llvm-svn: 323883
* [DAG] Prevent NodeId pruning of TokenFactors in Instruction Selection.Nirav Dave2018-01-311-1/+3
| | | | | | | | | | | | | | | | | | Summary: Instruction Selection preserves relative orders of all nodes save TokenFactors which we treat specially. As a result Node Ids for TokenFactors may violate the topological ordering and should not be considered as valid pruning candidates in predecessor search. Fixes PR35316. Reviewers: RKSimon, hfinkel Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42701 llvm-svn: 323880
* Fix formatting for r323876. NFCDiana Picus2018-01-311-5/+5
| | | | llvm-svn: 323878
* [InstCombine] reduce code duplication for canEvaluate* functions; NFCISanjay Patel2018-01-311-47/+43
| | | | | | We'd have to make the change suggested in D42536 3x otherwise. llvm-svn: 323877
* [ARM GlobalISel] Modernize LegalizerInfo. NFCIDiana Picus2018-01-311-126/+68
| | | | | | | | | Start using the new LegalizerInfo API introduced in r323681. Keep the old API for opcodes that need Lowering in some circumstances (G_FNEG and G_UREM/G_SREM). llvm-svn: 323876
* [MachineCombiner] Add check for optimal pattern order.Florian Hahn2018-01-311-16/+82
| | | | | | | | | | | | | | | | | | | | | | In D41587, @mssimpso discovered that the order of some patterns for AArch64 was sub-optimal. I thought a bit about how we could avoid that case in the future. I do not think there is a need for evaluating all patterns for now. But this patch adds an extra (expensive) check, that evaluates the latencies of all patterns, and ensures that the latency saved decreases for subsequent patterns. This catches the sub-optimal order fixed in D41587, but I am not entirely happy with the check, as it only applies to sub-optimal patterns seen while building with EXPENSIVE_CHECKS on. It did not discover any other sub-optimal pattern ordering. Reviewers: Gerolf, spatel, mssimpso Reviewed By: Gerolf, mssimpso Differential Revision: https://reviews.llvm.org/D41766 llvm-svn: 323873
* Take into account the cost of local intervals when selecting split candidate.Marina Yatsina2018-01-312-12/+86
| | | | | | | | | | | | | When selecting a split candidate for region splitting, the register allocator tries to predict which candidate will have the cheapest spill cost. Global splitting may cause the creation of local intervals, and they might spill. This patch makes RA take into account the spill cost of local split intervals in use blocks (we already take into account the spill cost in through blocks). A flag ("-condsider-local-interval-cost") controls weather we do this advanced cost calculation (it's on by default for X86 target, off for the rest). Differential Revision: https://reviews.llvm.org/D41585 Change-Id: Icccb8ad2dbf13124f5d97a18c67d95aa6be0d14d llvm-svn: 323870
* [ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operationsPablo Barrio2018-01-311-0/+20
| | | | | | | | | | | | | | | | | | | Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before. Patch by Marten Svanfeldt. Reviewers: fhahn, pbarrio Reviewed By: pbarrio Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 323869
* [SystemZ] Check the bitwidth before calling isInt/isUInt.Jonas Paulsson2018-01-311-1/+2
| | | | | | | | | Since these methods will assert if the integer does not fit into 64 bits, it is necessary to do this check before calling them in supportedAddressingMode(). Review: Ulrich Weigand. llvm-svn: 323866
* [AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks.Amjad Aboud2018-01-313-5/+19
| | | | | | | | | Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis. See PR36134 for more details. Differential Revision: https://reviews.llvm.org/D42683 llvm-svn: 323862
* [ARM] Armv8.2-A FP16 code generation (part 2/3)Sjoerd Meijer2018-01-313-35/+86
| | | | | | | | | | | | Half-precision arguments and return values are passed as if it were an int or float for ARM. This results in truncates and bitcasts to/from i16 and f16 values, which are legalized very early to stack stores/loads. When FullFP16 is enabled, we want to avoid codegen for these bitcasts as it is unnecessary and inefficient. Differential Revision: https://reviews.llvm.org/D42580 llvm-svn: 323861
* [PowerPC] Return true in enableMultipleCopyHints().Jonas Paulsson2018-01-311-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Nemanja Ivanovic llvm-svn: 323858
* [ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ ↵Roger Ferrer Ibanez2018-01-313-4/+30
| | | | | | | | | | | | | | | | | | | | | | GPR. In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do copies CPSR ↔ GPR but not all Thumb1 targets implement them. The schedule can attempt, before attempting a copy, to clone the instructions but it does not currently do that for nodes with input glue. In this patch we introduce a target-hook to let the hook decide if a glued machinenode is still eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS . As a follow-up of this change we should actually implement the copies for the Thumb1 targets that do implement them and restrict the hook to the targets that can't really do such copy as these clones are not ideal. This change fixes PR35836. Differential Revision: https://reviews.llvm.org/D42051 llvm-svn: 323857
OpenPOWER on IntegriCloud