path: root/llvm
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] SDWA: merge VI and GFX9 pseudo instructionsSam Kolton2017-06-2115-281/+323
| | | | | | | | | | | | Summary: Previously there were two separate pseudo instructions for SDWA on VI and on GFX9. Created one pseudo instruction that is the union of both of them. Added a verifier to check that operands conform to either VI or GFX9. Reviewers: dp, arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov Differential Revision: https://reviews.llvm.org/D34026 llvm-svn: 305886
* [AArch64] Preserve register flags when promoting a load from store.Florian Hahn2017-06-212-4/+23
| | | | | | | | | | | | | | | | | | | | | Summary: This patch updates promoteLoadFromStore to use the store MachineOperand as the source operand of the new instruction instead of creating a new register MachineOperand. This way, the existing register flags are preserved. This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). Reviewers: MatzeB, t.p.northover, junbuml Reviewed By: MatzeB Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34402 llvm-svn: 305885
* [DAGCombiner] Add another combine from build vector to shuffleGuy Blank2017-06-213-37/+18
| | | | | | | Add support for combining a build vector into a shuffle when the build vector consists of elements extracted from 2 vectors (vec1, vec2), where vec2 is half the size of vec1. llvm-svn: 305883
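The following is a minimal sketch of the idea behind this combine, not the DAGCombiner code. It assumes that the smaller vector is first widened with undef lanes so both shuffle inputs have the same length; the helper names `shuffle_vector` and `build_vector_as_shuffle` and the `UNDEF` marker are hypothetical.

```python
UNDEF = None  # stand-in for an undefined vector lane (illustrative only)


def shuffle_vector(v1, v2, mask):
    """Model of a two-input shuffle: index i selects from v1 + v2."""
    assert len(v1) == len(v2), "shuffle inputs must have the same length"
    both = v1 + v2
    return [both[i] for i in mask]


def build_vector_as_shuffle(vec1, vec2, picks):
    """Express a build vector of extracted elements as one shuffle.

    picks is a list of (source, index) pairs, where source 0 means vec1
    and source 1 means vec2. vec2 must be half the size of vec1.
    """
    assert len(vec1) == 2 * len(vec2)
    # Assumption: widen vec2 with undef lanes to match vec1's length.
    widened = vec2 + [UNDEF] * len(vec2)
    mask = [idx if src == 0 else len(vec1) + idx for src, idx in picks]
    return shuffle_vector(vec1, widened, mask)


result = build_vector_as_shuffle(
    [10, 11, 12, 13], [20, 21], [(0, 0), (1, 1), (0, 3), (1, 0)]
)
```

The point of the rewrite is that one shuffle mask replaces a sequence of per-element extracts and inserts.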
* [SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation timeMax Kazantsev2017-06-212-1/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The MulOpsInlineThreshold option of SCEV defaults to 1000, which is too high. When constructing SCEVs of expressions like: x1 = a * a; x2 = x1 * x1; x3 = x2 * x2; ... we end up with huge SCEVs that have the maximum allowed number of operands inlined. Such expressions are easy to get from unrolling loops that look like: x = a; for (i = 0; i < n; i++) x = x * x; or from trickier cases where big powers are involved. If some non-linear analysis tries to work with a SCEV that has 1000 operands, it may lead to excessively long compilation. The attached test does not pass within 1 minute with the default threshold. This patch decreases the default value to 32, which looks much more reasonable if we use analyses with complexity O(N^2) or O(N^3) working with SCEV. Differential Revision: https://reviews.llvm.org/D34397 llvm-svn: 305882
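To make the growth concrete, here is a toy model of operand inlining under repeated squaring. It is only a sketch: the function `inlined_mul_operands` and its stopping rule are assumptions made for illustration and do not reproduce SCEV's actual data structures or inlining logic.

```python
def inlined_mul_operands(num_squarings, threshold):
    """Count operands of a flattened product after repeated squaring.

    Toy model: x0 = a has 1 operand, and x_{k+1} = x_k * x_k has twice as
    many once both factors are inlined. Inlining stops as soon as the
    result would exceed `threshold` operands.
    """
    operands = 1
    for _ in range(num_squarings):
        if operands * 2 > threshold:
            break  # keep the sub-product opaque instead of inlining it
        operands *= 2
    return operands


old_default = inlined_mul_operands(20, 1000)  # threshold before the patch
new_default = inlined_mul_operands(20, 32)    # threshold after the patch

# Rough work estimate for an O(N^2) analysis over the operand list.
old_quadratic_work = old_default ** 2
new_quadratic_work = new_default ** 2
```

In this model the operand count doubles with every squaring, so a handful of unrolled iterations is enough to hit either threshold; the lower limit simply caps how much work a quadratic or cubic analysis can be handed.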
* Simplify test.Rafael Espindola2017-06-211-40/+8
| | | | llvm-svn: 305881
* [XRay] Reduce synthetic references emitted by XRayDean Michael Berris2017-06-219-55/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When we're building with XRay instrumentation, we use a trick that preserves references from the function to a function sled index. This index table lives in a separate section, and without this trick the linker is free to garbage-collect this section and all the segments it refers to. Until we're able to tell the linkers to preserve these sections, we use this reference trick to keep around both the index and the entries in the instrumentation map. Before this change we emitted both a synthetic reference to the label in the instrumentation map, and to the entry in the function map index. This change removes the first synthetic reference and only emits one synthetic reference to the index -- the index entry has the references to the labels in the instrumentation map, so the linker will still preserve those if the function itself is preserved. This reduces the size of the synthetic references we emit from 16 bytes to just 8 bytes on x86_64, and similarly on other platforms. Reviewers: dblaikie Subscribers: javed.absar, kpw, pelikan, llvm-commits Differential Revision: https://reviews.llvm.org/D34340 llvm-svn: 305880
* [ImplicitNullChecks] Uphold an invariant in areMemoryOpsAliasedSerguei Katkov2017-06-212-24/+308
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now areMemoryOpsAliased has an assertion justified as: MMO1 should have a value due it comes from operation we'd like to use as implicit null check. assert(MMO1->getValue() && "MMO1 should have a Value!"); However, it is possible for that invariant to not be upheld in the following situation (conceptually): Null check %RAX NotNullSucc: %RAX = LEA %RSP, 16 // I0 %RDX = MOV64rm %RAX // I1 With the current code, we will have an early exit from ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable. However, I1 will look plausible (since it loads from %RAX) and will go ahead and call areMemoryOpsAliased(I1, I0). This will cause us to fail the assert mentioned above since I1 does not load from an IR level value and thus is allowed to have a non-Value base address. The fix is to bail out earlier whenever we see an unsuitable instruction overwrite PointerReg. This would guarantee that when we call areMemoryOpsAliased, we're guaranteed to be looking at an instruction that loads from or stores to an IR level value. Original Patch Author: sanjoy Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34385 llvm-svn: 305879
* [NewGVN] Fix a bug that made the store verifier less effective.Davide Italiano2017-06-201-6/+4
| | | | | | | | | We weren't actually checking for duplicated stores, as the condition was always actually false. This was found by Coverity, and I have no clue how to trigger this in real-world code (although I tried for a bit). llvm-svn: 305867
* Updated llvm-objdump with Mach-O files and the -objc-meta-data option soKevin Enderby2017-06-202-2/+8
| | | | | | | | that it symbolically prints the superclass when it has dyld bind info for it. rdar://7638823 llvm-svn: 305866
* clang-format a region.Rafael Espindola2017-06-201-20/+19
| | | | | | It will make a followup patch easier to read. llvm-svn: 305865
* Add a cantFail overload for Expected-reference (Expected<T&>) types.Lang Hames2017-06-202-0/+25
| | | | llvm-svn: 305863
* [codeview] YAMLize all section offsets and indices in symbol recordsReid Kleckner2017-06-203-25/+62
| | | | | | | | | | | | We forgot to serialize these because llvm-readobj didn't dump them. They are typically all zeros in an object file. The linker fills them in with relocations before adding them to the PDB. Now we can properly round trip these symbols through pdb2yaml -> yaml2pdb. I made these fields optional with a zero default so that we can elide them from our test cases. llvm-svn: 305857
* Revert "Add previously accidentally uncommitted testcase for r305599."Adrian Prantl2017-06-201-82/+0
| | | | | | | This reverts commit r305852. The testcase already exists but I moved it to the X86 directory while using a different machine and got confused... llvm-svn: 305856
* Make this test a bit more strict. NFC.Rafael Espindola2017-06-201-4/+3
| | | | llvm-svn: 305855
* Fix a crash in DwarfDebug::validThroughout.Adrian Prantl2017-06-202-3/+254
| | | | | | | | | The instruction it falls over on is an IMPLICIT_DEF that also happens to be the only instruction in its lexical scope. That LexicalScope has never been created because its range is empty. This patch skips over all meta-instructions instead of just DBG_VALUEs. Thanks to David Blaikie for providing a testcase! llvm-svn: 305853
* Add previously accidentally uncommitted testcase for r305599.Adrian Prantl2017-06-201-0/+82
| | | | llvm-svn: 305852
* Change llvm-objdump with Mach-O files and the -info-plist option with theKevin Enderby2017-06-202-1/+6
| | | | | | | | -no-leading-headers option so that it does not print the leading header. rdar://27378808 llvm-svn: 305849
* [Statepoint] Add helper functions for GCRelocate and GCResultAnna Thomas2017-06-202-0/+15
| | | | | | | These functions isGCRelocate and isGCResult are similar to isStatepoint(const Value*). llvm-svn: 305847
* Support: chunk writing on LinuxSaleem Abdulrasool2017-06-203-1/+22
| | | | | | This is a workaround for large file writes. write(2) has been observed to fail with EINVAL (22) when given a large size (>2G). Thanks to James Knight for the help with coming up with a sane test case. llvm-svn: 305846
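The workaround amounts to splitting one large write into several bounded ones. The sketch below shows that pattern in Python rather than in LLVM's Support library; the function name `write_in_chunks` and the 1 GiB default chunk size are assumptions chosen for illustration, not values taken from the patch.

```python
import os


def write_in_chunks(fd, data, max_chunk=1 << 30):
    """Write all of `data` to `fd`, never passing more than `max_chunk`
    bytes to a single os.write() call. Returns the number of bytes written.
    """
    view = memoryview(data)  # slicing a memoryview avoids copying the data
    written = 0
    while written < len(view):
        # os.write may write fewer bytes than requested, so advance by the
        # count it actually reports.
        written += os.write(fd, view[written:written + max_chunk])
    return written


# Small demonstration through a pipe, with a tiny chunk size so that the
# loop runs several times.
read_end, write_end = os.pipe()
total = write_in_chunks(write_end, b"hello world", max_chunk=4)
os.close(write_end)
received = os.read(read_end, 100)
os.close(read_end)
```

Keeping each request well below the 2G mark avoids the failing call while leaving the caller's view (write everything or report an error) unchanged.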
* AMDGPU: Allow vectorization of packed typesMatt Arsenault2017-06-205-78/+249
| | | | llvm-svn: 305844
* [codeview] Fully initialize DataSym when mapping from YAMLReid Kleckner2017-06-201-0/+2
| | | | | | | | | | In the object file, the section index and relative offset are typically zero, so make these YAML fields optional with a default. It looks like there may be more partially initialized symbol records, but this should fix the msan bot. llvm-svn: 305842
* [AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32Stanislav Mekhanoshin2017-06-202-0/+103
| | | | | | | If there is an immediate operand, we must not shrink V_SUBB_U32 and V_ADDC_U32, since it does not fit the e32 encoding. Differential Revision: https://reviews.llvm.org/D34291 llvm-svn: 305840
* [cmake] Add support for using the standalone leaks sanitizer with LLVM.Michael Gottesman2017-06-201-0/+3
| | | | | | | | This commit causes LLVM_USE_SANITIZER to now accept the "Leaks" option. This will cause cmake to pass in -fsanitize=leak in all of the appropriate places. I am making this change so that I can set up a Linux bot that only detects leaks. llvm-svn: 305839
* AMDGPU: Start adding global_* instructionsMatt Arsenault2017-06-207-6/+193
| | | | llvm-svn: 305838
* [GISel]: NFC. Add comment to G_FMA opcode as requested in rL305824Aditya Nandakumar2017-06-201-0/+1
| | | | llvm-svn: 305837
* [GISel]: Add G_FMA opcode for fused multiply addsAditya Nandakumar2017-06-204-0/+30
| | | | | | | | https://reviews.llvm.org/D34372 Reviewed by dsanders llvm-svn: 305824
* AMDGPU: Do operand folding in program orderMatt Arsenault2017-06-202-5/+50
| | | | | | | | | Before it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so on the second attempt it will fail expecting to see a register to fold. llvm-svn: 305821
* [PDB] Don't write uninitialized bytes to a PDB file.Zachary Turner2017-06-202-0/+3
| | | | | | | | | | There were certain fields that we didn't know how to write, as well as various padding bytes that we would ignore. This leads to garbage data in the PDB. While not strictly necessary, we should initialize these bytes to something meaningful, as it makes for easier binary comparison between PDBs. llvm-svn: 305819
* Remove diff pedantic mode.Zachary Turner2017-06-203-191/+91
| | | | llvm-svn: 305818
* RegisterScavenging: Followup to r305625Matthias Braun2017-06-207-86/+83
| | | | | | | | | | | This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitsets and needs just 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. llvm-svn: 305817
* AMDGPU: Preserve undef when folding register operandsMatt Arsenault2017-06-202-0/+8
| | | | | | | | If the source was a copy of an undef register, this would produce a read of an undefined register which is a verifier error. llvm-svn: 305816
* [AMDGPU] Eliminate SGPR to VGPR copy when possibleStanislav Mekhanoshin2017-06-207-8/+379
| | | | | | | | SGPRs are generally cheaper, so try to use them over VGPRs. Differential Revision: https://reviews.llvm.org/D34130 llvm-svn: 305815
* AMDGPU: Fix crash with undef vreg input operandMatt Arsenault2017-06-202-1/+22
| | | | llvm-svn: 305814
* [PowerPC] fix trivial typos in comment, NFCHiroshi Inoue2017-06-201-1/+1
| | | | llvm-svn: 305813
* [CostModel][X86] Add scalar arithmetic cost testsSimon Pilgrim2017-06-201-7/+55
| | | | llvm-svn: 305810
* [CostModel][X86] Declare costs variables based on typeSimon Pilgrim2017-06-201-470/+470
| | | | | | The alphabetical progression isn't that useful llvm-svn: 305808
* [TableGen] Take a parameter by reference instead of pointer so we don't have ↵Craig Topper2017-06-201-4/+4
| | | | | | to add & on both callers. NFC llvm-svn: 305807
* [TableGen] Use range based for loop. NFCCraig Topper2017-06-201-3/+1
| | | | llvm-svn: 305806
* [GSoC] Flag value completion for clangYuka Takahashi2017-06-209-18/+80
| | | | | | | | | | This is a patch for the GSoC project, bash-completion for clang. To use this on bash, please run `source clang/utils/bash-autocomplete.sh`. bash-autocomplete.sh is the code for bash-completion. In this patch, Options.td was mainly changed in order to add a value class in Options.inc. llvm-svn: 305805
* [x86] enable CGP memcmp() expansion for 2/4/8 byte sizesSanjay Patel2017-06-205-42/+230
| | | | | | | | | There are a couple of potential improvements as seen in the IR and asm: 1. We're unnecessarily extending to a larger type to compare values. 2. The codegen for (select cond, 1, -1) could avoid a cmov. (or we could change the order of the compares, so we have a select with 0 operand) llvm-svn: 305802
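As a rough illustration of what expanding a small fixed-size memcmp() means, the sketch below compares two buffers with a single integer comparison instead of a byte loop. It is a conceptual model only: the function name `memcmp_expanded` is hypothetical, and the generated IR and x86 code differ in detail.

```python
def memcmp_expanded(a, b, size):
    """Compare the first `size` bytes of `a` and `b` as one integer each.

    Reading the bytes big-endian makes unsigned integer order match the
    lexicographic byte order that memcmp() defines.
    """
    if size not in (2, 4, 8):
        raise ValueError("only 2, 4 and 8 byte sizes are modelled here")
    lhs = int.from_bytes(a[:size], "big")
    rhs = int.from_bytes(b[:size], "big")
    if lhs == rhs:
        return 0
    # Corresponds to the (select cond, 1, -1) pattern mentioned above.
    return -1 if lhs < rhs else 1


less = memcmp_expanded(b"ab", b"ac", 2)
greater = memcmp_expanded(b"\x01\x00\x00\x00", b"\x00\xff\xff\xff", 4)
equal = memcmp_expanded(b"12345678", b"12345678", 8)
```

On a little-endian target the big-endian read corresponds to a load followed by a byte swap, which is why the sizes that map onto a single register load are the natural ones to expand.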
* [X86][SSE] Relax 0/-1 vector element insertion to work for any vector with ↵Simon Pilgrim2017-06-203-32/+9
| | | | | | | | >=16bit elements Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this llvm-svn: 305801
* DAG: correctly legalize UMULO.Tim Northover2017-06-202-11/+34
| | | | | | | | | We were incorrectly sign extending into the high word (as you would for SMULO) when legalizing UMULO in terms of a wider full multiplication. Patch by James Duley. llvm-svn: 305800
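A small model shows why the kind of extension matters. This is a sketch with 8-bit operands widened to 16 bits; the function `umulo_via_wide_mul` and its `sign_extend` switch are hypothetical and exist only to contrast the two behaviours, not to mirror the legalizer's code.

```python
def umulo_via_wide_mul(a, b, bits=8, sign_extend=False):
    """Unsigned multiply-with-overflow built from a double-width multiply.

    Returns (low_result, overflowed). With sign_extend=True the operands
    are widened the way SMULO would need, which is wrong for UMULO.
    """
    mask = (1 << bits) - 1
    wide_mask = (1 << (2 * bits)) - 1

    def widen(x):
        x &= mask
        if sign_extend and (x >> (bits - 1)):
            x |= mask << bits  # copy the sign bit into the high word
        return x

    product = (widen(a) * widen(b)) & wide_mask
    low = product & mask
    overflowed = (product >> bits) != 0  # any set bit in the high word
    return low, overflowed


correct = umulo_via_wide_mul(200, 1)                  # zero extension
wrong = umulo_via_wide_mul(200, 1, sign_extend=True)  # sign extension
real_overflow = umulo_via_wide_mul(16, 16)
```

With zero extension, 200 * 1 fits in 8 bits and no overflow is reported. With sign extension, the high word of the widened operand is all ones, so the same multiplication is flagged as overflowing.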
* D33466: Make file non-executable.Vassil Vassilev2017-06-201-0/+0
| | | | llvm-svn: 305795
* [InstCombine] fix code/test comments for r305792; NFCSanjay Patel2017-06-202-3/+3
| | | | | | | These diffs were in the last version of the patch in D33342, but I accidentally committed the previous rev. llvm-svn: 305793
* [InstCombine] try to canonicalize xor-of-icmps to and-of-icmpsSanjay Patel2017-06-202-11/+31
| | | | | | | | | | | | | | | | We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine, but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds, we can use the truth table definition of xor: X ^ Y --> (X | Y) & !(X & Y) ...to see if we can convert the xor to and/or and then use the existing folds. http://rise4fun.com/Alive/J9v Differential Revision: https://reviews.llvm.org/D33342 llvm-svn: 305792
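The identity used here can be checked exhaustively for boolean inputs. The snippet below is just such a check, with the hypothetical helper name `xor_via_and_or`; it says nothing about how InstCombine implements the fold.

```python
from itertools import product


def xor_via_and_or(x, y):
    """X ^ Y rewritten as (X | Y) & !(X & Y)."""
    return (x or y) and not (x and y)


# Each row holds: x, y, direct xor, rewritten form.
truth_table = [
    (x, y, x ^ y, xor_via_and_or(x, y))
    for x, y in product([False, True], repeat=2)
]
identity_holds = all(direct == rewritten for _, _, direct, rewritten in truth_table)
```

Because the rewritten form contains only and/or/not, the existing and-of-icmps and or-of-icmps folds get a chance to apply to what started as an xor.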
* [globalisel][tablegen] Add support for COPY_TO_REGCLASS.Daniel Sanders2017-06-207-37/+179
| | | | | | | | | | | | | | | | | | | | | | Summary: As part of this * Emitted instructions now have named MachineInstr variables associated with them. This isn't particularly important yet but it's a small step towards multiple-insn emission. * constrainSelectedInstRegOperands() is no longer hardcoded. It's now added as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses an alternate constraint mechanism ConstrainOperandToRegClassAction() which supports arbitrary constraints such as that defined by COPY_TO_REGCLASS. Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: ab Subscribers: javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33590 llvm-svn: 305791
* Fix Wdocumentation warningSimon Pilgrim2017-06-201-2/+2
| | | | llvm-svn: 305790
* [X86][SSE] Dropped old INSERT_VECTOR_ELT lowering TODOSimon Pilgrim2017-06-201-2/+0
| | | | | | Target shuffle combining now supports the matching of INSERT_VECTOR_ELT/PINSRW/PINSRB for merging multiple insertions into shuffles/bitmasks. llvm-svn: 305788
* Fixed test name. NFCI.Simon Pilgrim2017-06-201-7/+7
| | | | llvm-svn: 305787
* [GlobalISel][X86] fix compilation error ( -Werror=unused-function )Igor Breger2017-06-201-2/+2
| | | | llvm-svn: 305786