summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on ↵Craig Topper2017-06-211-0/+46
| | | | | | | | | | | | | | | | | | | | | known bits Summary: I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know. This patch adds range metadata to these intrinsics based on the known bits. Currently the patch punts if we already have range metadata present. Reviewers: spatel, RKSimon, davide, majnemer Reviewed By: RKSimon Subscribers: sanjoy, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D32582 llvm-svn: 305927
* [InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, ↵Craig Topper2017-06-211-4/+16
| | | | | | | | | | | | | | | | | | | C2)) create more instructions than it removes Summary: Previously this folding had no checks to see if it was going to result in less instructions. This was pointed out during the review of D34184 This patch adds code to count how many instructions its going to create vs how many its going to remove so we can make a proper decision. Reviewers: spatel, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34437 llvm-svn: 305926
* [Reassociate] Support xor reassociating for splat vectorsCraig Topper2017-06-211-24/+22
| | | | | | | | | | | | | | Summary: This patch adds support for xors of splat vectors. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34354 llvm-svn: 305925
* [AMDGPU][MC][GFX9] Corrected VOP3P relevant code to fix disassembler failuresDmitry Preobrazhensky2017-06-214-11/+6
| | | | | | | | | | See Bug 33509: https://bugs.llvm.org//show_bug.cgi?id=33509 Reviewers: Sam Kolton, Artem Tamazov, Valery Pykhtin Differential Revision: https://reviews.llvm.org/D34360 llvm-svn: 305923
* [DAG] Move BaseIndexOffset into separate Libarary. NFC.Nirav Dave2017-06-213-114/+97
| | | | | | | Move BaseIndexOffset analysis out of DAGCombiner for use in other files. llvm-svn: 305921
* [DAG] Remove Node csonstruction from BaseIndexOffset match. NFCI.Nirav Dave2017-06-211-52/+69
| | | | | | | | Move GlobalAddress Offset decomposition from initial match into comparision check and removing the possibility of constructing a new offseted global address when examining addresses. llvm-svn: 305917
* [AMDGPU][MC] Corrected V_*QSAD* instructions to check that dest register is ↵Dmitry Preobrazhensky2017-06-213-5/+84
| | | | | | | | | | | | different than any of the src See Bug 33279: https://bugs.llvm.org//show_bug.cgi?id=33279 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D34003 llvm-svn: 305915
* [x86] fix formatting; NFCSanjay Patel2017-06-211-15/+13
| | | | llvm-svn: 305914
* [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics.Christof Douma2017-06-214-5/+114
| | | | | | | | | | | | | | | | | | | | | | Implemented support to AArch64 codegen for ARMv8.1 Large System Extensions atomic instructions. Where supported, these instructions can provide atomic operations with higher performance. Currently supported operations include: fetch_add, fetch_or, fetch_xor, fetch_smin, fetch_min/max (signed and unsigned), swap, and compare_exchange. This implementation implies sequential-consistency ordering, more relaxed ordering is under development. Subtarget->hasLSE is currently supported for Cavium ThunderX2T99. Patch by Ananth Jasty. Differential Revision: https://reviews.llvm.org/D33586 Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c llvm-svn: 305893
* [Support] Add RetryAfterSignal helper functionPavel Labath2017-06-213-25/+12
| | | | | | | | | | | | | | | | | | | | | Summary: This function retries an operation if it was interrupted by a signal (failed with EINTR). It's inspired by the TEMP_FAILURE_RETRY macro in glibc, but I've turned that into a template function. I've also added a fail-value argument, to enable the function to be used with e.g. fopen(3), which is documented to fail for any reason that open(2) can fail (which includes EINTR). The main user of this function will be lldb, but there were also a couple of uses within llvm that I could simplify using this function. Reviewers: zturner, silvas, joerg Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33895 llvm-svn: 305892
* [AArch64] Add early exit to promoteLoadFromStore.Florian Hahn2017-06-211-1/+4
| | | | | | | | There should be at most a single kill flag for the promoted operand between the store/load pair. Discussed in https://reviews.llvm.org/D34402. llvm-svn: 305889
* [MIPS] Fix for selecting of DINS/INS instructionStrahinja Petrovic2017-06-211-0/+5
| | | | | | | | | | This patch adds one more condition in selection DINS/INS instruction, which fixes MultiSource/Applications/JM/ldecod/ for mips32r2 (and mips64r2 n32 abi). Differential Revision: https://reviews.llvm.org/D33725 llvm-svn: 305888
* Use range-loop in machine-scheduler. NFCI.Javed Absar2017-06-211-94/+72
| | | | | | | | | | | | Converts to range-loop usage in machine scheduler. This makes the code neater and easier to read, and also keeps pace of the machine scheduler implementation with C++11 features. Reviewed by: Matthias Braun Differential Revision: https://reviews.llvm.org/D34320 llvm-svn: 305887
* [AMDGPU] SDWA: merge VI and GFX9 pseudo instructionsSam Kolton2017-06-2115-281/+323
| | | | | | | | | | | | Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9. Reviewers: dp, arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov Differential Revision: https://reviews.llvm.org/D34026 llvm-svn: 305886
* [AArch64] Preserve register flags when promoting a load from store.Florian Hahn2017-06-211-3/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This patch updates promoteLoadFromStore to use the store MachineOperand as the source operand of the of the new instruction instead of creating a new register MachineOperand. This way, the existing register flags are preserved. This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). Reviewers: MatzeB, t.p.northover, junbuml Reviewed By: MatzeB Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34402 llvm-svn: 305885
* [DAGCombiner] Add another combine from build vector to shuffleGuy Blank2017-06-211-0/+5
| | | | | | | Add support for combining a build vector to a shuffle. When the build vector is of extracted elements from 2 vectors (vec1, vec2) where vec2 is 2 times smaller than vec1. llvm-svn: 305883
* [SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation timeMax Kazantsev2017-06-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is inadequately high. When constructing SCEVs of expressions like: x1 = a * a x2 = x1 * x1 x3 = x2 * x2 ... We actually have huge SCEVs with max allowed amount of operands inlined. Such expressions are easy to get from unrolling of loops looking like x = a for (i = 0; i < n; i++) x = x * x Or more tricky cases where big powers are involved. If some non-linear analysis tries to work with a SCEV that has 1000 operands, it may lead to excessively long compilation. The attached test does not pass within 1 minute with default threshold. This patch decreases its default value to 32, which looks much more reasonable if we use analyzes with complexity O(N^2) or O(N^3) working with SCEV. Differential Revision: https://reviews.llvm.org/D34397 llvm-svn: 305882
* [XRay] Reduce synthetic references emitted by XRayDean Michael Berris2017-06-211-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When we're building with XRay instrumentation, we use a trick that preserves references from the function to a function sled index. This index table lives in a separate section, and without this trick the linker is free to garbage-collect this section and all the segments it refers to. Until we're able to tell the linkers to preserve these sections, we use this reference trick to keep around both the index and the entries in the instrumentation map. Before this change we emitted both a synthetic reference to the label in the instrumentation map, and to the entry in the function map index. This change removes the first synthetic reference and only emits one synthetic reference to the index -- the index entry has the references to the labels in the instrumentation map, so the linker will still preserve those if the function itself is preserved. This reduces the amount of synthetic references we emit from 16 bytes to just 8 bytes in x86_64, and similarly to other platforms. Reviewers: dblaikie Subscribers: javed.absar, kpw, pelikan, llvm-commits Differential Revision: https://reviews.llvm.org/D34340 llvm-svn: 305880
* [ImplicitNullChecks] Uphold an invariant in areMemoryOpsAliasedSerguei Katkov2017-06-211-24/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now areMemoryOpsAliased has an assertion justified as: MMO1 should have a value due it comes from operation we'd like to use as implicit null check. assert(MMO1->getValue() && "MMO1 should have a Value!"); However, it is possible for that invariant to not be upheld in the following situation (conceptually): Null check %RAX NotNullSucc: %RAX = LEA %RSP, 16 // I0 %RDX = MOV64rm %RAX // I1 With the current code, we will have an early exit from ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable. However, I1 will look plausible (since it loads from %RAX) and will go ahead and call areMemoryOpsAliased(I1, I0). This will cause us to fail the assert mentioned above since I1 does not load from an IR level value and thus is allowed to have a non-Value base address. The fix is to bail out earlier whenever we see an unsuitable instruction overwrite PointerReg. This would guarantee that when we call areMemoryOpsAliased, we're guaranteed to be looking at an instruction that loads from or stores to an IR level value. Original Patch Author: sanjoy Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34385 llvm-svn: 305879
* [NewGVN] Fix a bug that made the store verifier less effective.Davide Italiano2017-06-201-6/+4
| | | | | | | | | We weren't actually checking for duplicated stores, as the condition was always actually false. This was found by Coverity, and I have no clue how to trigger this in real-world code (although I tried for a bit). llvm-svn: 305867
* clang-format a region.Rafael Espindola2017-06-201-20/+19
| | | | | | It will make a followup patch easier to read. llvm-svn: 305865
* [codeview] YAMLize all section offsets and indices in symbol recordsReid Kleckner2017-06-201-24/+22
| | | | | | | | | | | | We forgot to serialize these because llvm-readobj didn't dump them. They are typically all zeros in an object file. The linker fills them in with relocations before adding them to the PDB. Now we can properly round trip these symbols through pdb2yaml -> yaml2pdb. I made these fields optional with a zero default so that we can elide them from our test cases. llvm-svn: 305857
* Fix a crash in DwarfDebug::validThroughout.Adrian Prantl2017-06-201-3/+5
| | | | | | | | | | | The instruction it falls over on is an IMPLICT_DEF that also happens to be the only instruction in its lexical scope. That LexicalScope has never been created because its range is empty. This patch skips over all meta-instructions instead of just DBG_VALUEs. Thanks to David Blaikie for providing a testcase! llvm-svn: 305853
* [Statepoint] Add helper functions for GCRelocate and GCResultAnna Thomas2017-06-201-0/+12
| | | | | | | These functions isGCRelocate and isGCResult are similar to isStatepoint(const Value*). llvm-svn: 305847
* Support: chunk writing on LinuxSaleem Abdulrasool2017-06-202-1/+16
| | | | | | | | This is a workaround for large file writes. It has been witnessed that write(2) failing with EINVAL (22) due to a large value (>2G). Thanks to James Knight for the help with coming up with a sane test case. llvm-svn: 305846
* AMDGPU: Allow vectorization of packed typesMatt Arsenault2017-06-202-8/+20
| | | | llvm-svn: 305844
* [codeview] Fully initialize DataSym when mapping from YAMLReid Kleckner2017-06-201-0/+2
| | | | | | | | | | In the object file, the section index and relative offset are typically zero, so make these YAML fields optional with a default. It looks like there may be more partially initialized symbol records, but this should fix the msan bot. llvm-svn: 305842
* [AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32Stanislav Mekhanoshin2017-06-201-0/+2
| | | | | | | | | If there is an immediate operand we shall not shrink V_SUBB_U32 and V_ADDC_U32, it does not fit e32 encoding. Differential Revison: https://reviews.llvm.org/D34291 llvm-svn: 305840
* AMDGPU: Start adding global_* instructionsMatt Arsenault2017-06-206-6/+106
| | | | llvm-svn: 305838
* [GISel]: Add G_FMA opcode for fused multiply addsAditya Nandakumar2017-06-201-0/+7
| | | | | | | | https://reviews.llvm.org/D34372 Reviewed by dsanders llvm-svn: 305824
* AMDGPU: Do operand folding in program orderMatt Arsenault2017-06-201-5/+3
| | | | | | | | | Before it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so on the second attempt it will fail expecting to see a register to fold. llvm-svn: 305821
* [PDB] Don't write uninitialized bytes to a PDB file.Zachary Turner2017-06-202-0/+3
| | | | | | | | | | There were certain fields that we didn't know how to write, as well as various padding bytes that we would ignore. This leads to garbage data in the PDB. While not strictly necessary, we should initialize these bytes to something meaningful, as it makes for easier binary comparison between PDBs. llvm-svn: 305819
* RegisterScavenging: Followup to r305625Matthias Braun2017-06-201-41/+38
| | | | | | | | | | | | | This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitset and just need 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. llvm-svn: 305817
* AMDGPU: Preserve undef when folding register operandsMatt Arsenault2017-06-201-0/+2
| | | | | | | | If the source was a copy of an undef register, this would produce a read of an undefined register which is a verifier error. llvm-svn: 305816
* [AMDGPU] Eliminate SGPR to VGPR copy when possibleStanislav Mekhanoshin2017-06-201-0/+30
| | | | | | | | SGPRs are generally cheaper, so try to use them over VGPRs. Differential Revision: https://reviews.llvm.org/D34130 llvm-svn: 305815
* AMDGPU: Fix crash with undef vreg input operandMatt Arsenault2017-06-201-1/+1
| | | | llvm-svn: 305814
* [PowerPC] fix trivial typos in comment, NFCHiroshi Inoue2017-06-201-1/+1
| | | | llvm-svn: 305813
* [GSoC] Flag value completion for clangYuka Takahashi2017-06-203-6/+39
| | | | | | | | | | | | This is patch for GSoC project, bash-completion for clang. To use this on bash, please run `source clang/utils/bash-autocomplete.sh`. bash-autocomplete.sh is code for bash-completion. In this patch, Options.td was mainly changed in order to add value class in Options.inc. llvm-svn: 305805
* [x86] enable CGP memcmp() expansion for 2/4/8 byte sizesSanjay Patel2017-06-203-1/+13
| | | | | | | | | There are a couple of potential improvements as seen in the IR and asm: 1. We're unnecessarily extending to a larger type to compare values. 2. The codegen for (select cond, 1, -1) could avoid a cmov. (or we could change the order of the compares, so we have a select with 0 operand) llvm-svn: 305802
* [X86][SSE] Relax 0/-1 vector element insertion to work for any vector with ↵Simon Pilgrim2017-06-201-1/+2
| | | | | | | | >=16bit elements Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this llvm-svn: 305801
* DAG: correctly legalize UMULO.Tim Northover2017-06-201-11/+18
| | | | | | | | | We were incorrectly sign extending into the high word (as you would for SMULO) when legalizing UMULO in terms of a wider full multiplication. Patch by James Duley. llvm-svn: 305800
* [InstCombine] fix code/test comments for r305792; NFCSanjay Patel2017-06-201-1/+1
| | | | | | | These diffs were in the last version of the patch in D33342, but I accidentally committed the previous rev. llvm-svn: 305793
* [InstCombine] try to canonicalize xor-of-icmps to and-of-icmpsSanjay Patel2017-06-201-0/+24
| | | | | | | | | | | | | | | | We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine, but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds, we can use the truth table definition of xor: X ^ Y --> (X | Y) & !(X & Y) ...to see if we can convert the xor to and/or and then use the existing folds. http://rise4fun.com/Alive/J9v Differential Revision: https://reviews.llvm.org/D33342 llvm-svn: 305792
* [globalisel][tablegen] Add support for COPY_TO_REGCLASS.Daniel Sanders2017-06-202-10/+30
| | | | | | | | | | | | | | | | | | | | | | Summary: As part of this * Emitted instructions now have named MachineInstr variables associated with them. This isn't particularly important yet but it's a small step towards multiple-insn emission. * constrainSelectedInstRegOperands() is no longer hardcoded. It's now added as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses an alternate constraint mechanism ConstrainOperandToRegClassAction() which supports arbitrary constraints such as that defined by COPY_TO_REGCLASS. Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: ab Subscribers: javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33590 llvm-svn: 305791
* [X86][SSE] Dropped old INSERT_VECTOR_ELT lowering TODOSimon Pilgrim2017-06-201-2/+0
| | | | | | Target shuffle combining now supports the matching of INSERT_VECTOR_ELT/PINSRW/PINSRB for merging multiple insertions into shuffles/bitmasks. llvm-svn: 305788
* [GlobalISel][X86] fix compilation error ( -Werror=unused-function )Igor Breger2017-06-201-2/+2
| | | | llvm-svn: 305786
* [SelectionDAG] Fix an use-after-free issue introduced in r305775.Haojian Wu2017-06-201-2/+2
| | | | | | vector.back() will be invalidated when memory reallocation happens. llvm-svn: 305785
* [GlobalISel][X86] Get correct RegClass for given RegBank.Igor Breger2017-06-201-17/+26
| | | | | | | | | | | | | | | | | | | Summary: In some cases RegClass depends on target feature. Hight (16-31) vector registers exist only if AVX512f available. Split from https://reviews.llvm.org/D33665 Reviewers: qcolombet, t.p.northover, zvi, guyblank Reviewed By: t.p.northover, guyblank Subscribers: guyblank, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33952 Conflicts: test/CodeGen/X86/GlobalISel/select-memop-scalar.mir llvm-svn: 305784
* [GlobalISel] combine not symmetric merge/unmerge nodes.Igor Breger2017-06-201-13/+58
| | | | | | | | | | | | | | | | Summary: In some cases legalization ends up with not symmetric merge/unmerge nodes. Transform it to merge/unmerge nodes. Reviewers: t.p.northover, qcolombet, zvi Reviewed By: t.p.northover Subscribers: rovka, kristof.beyls, guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D33626 llvm-svn: 305783
* [SCEV][NFC] Fix a misleading description of AddOpsInlineThresholdMax Kazantsev2017-06-201-1/+1
| | | | | | | | | The description of this option was copy-pasted from another one and does not correspond to reality. Differential Revision: https://reviews.llvm.org/D34390 llvm-svn: 305782
OpenPOWER on IntegriCloud