summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* [GlobalIsel][X86] support G_TRUNC selection.Igor Breger2017-04-194-0/+299
| | | | | | | | | | | | | | | | Summary: [GlobalIsel][X86] support G_TRUNC selection. Add regbank-select and legalizer tests. Currently legalization of trunc i64 on 32bit platform not supported. Reviewers: ab, zvi, rovka Reviewed By: zvi Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D32115 llvm-svn: 300678
* [X86] Add D32039/PR31357 tests to show current BSWAP codegenSimon Pilgrim2017-04-192-0/+255
| | | | llvm-svn: 300672
* [X86][SSE] Add scheduling latency/throughput tests for (most) SSE2 instructionsSimon Pilgrim2017-04-191-0/+6039
| | | | llvm-svn: 300671
* Revert "ARMFrameLowering: Reserve emergency spill slot for large arguments"Renato Golin2017-04-191-94/+0
| | | | | | This reverts commit r300639, as it broke self-hosting on ARM. PR32709. llvm-svn: 300668
* [GlobalISel][X86] Split select tests. NFC.Igor Breger2017-04-197-444/+455
| | | | llvm-svn: 300666
* [ARM] GlobalISel: Add support for G_MULDiana Picus2017-04-194-1/+326
| | | | | | | | Support G_MUL, very similar to G_ADD and G_SUB. The only difference is in the instruction selector, where we have to select either MUL or MULv5 depending on the target. llvm-svn: 300665
* [GlobalISel] Support vector-of-pointers in LLTKristof Beyls2017-04-191-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300664
* Revert r300657 due to crashes in stage2 of bootstraps:Chandler Carruth2017-04-191-89/+0
| | | | | | | | | | | | http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/2476/steps/build-stage2-LLVMgold.so/logs/stdio http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/15036/steps/build_llvmclang/logs/stdio I've updated the commit thread, reverting to get the bots back to green. Original commit summary: [JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. llvm-svn: 300662
* [JumpThread] We want to fold (not thread) when all predecessor go to single ↵Xin Tong2017-04-191-0/+89
| | | | | | | | | | | | | | | | BB's successor. . Summary: In case all predecessor go to a single successor of current BB. We want to fold (not thread). Reviewers: efriedma, sanjoy Reviewed By: sanjoy Subscribers: dberlin, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D30869 llvm-svn: 300657
* ARMFrameLowering: Reserve emergency spill slot for large argumentsMatthias Braun2017-04-191-0/+94
| | | | | | | | | | | | We need to reserve an emergency spill slot in cases with large argument types that could overflow immediate offsets for FP relative address calculations. rdar://31317893 Differential Revision: https://reviews.llvm.org/D31643 llvm-svn: 300639
* [XRay][tools] Fix yaml matching to be more permissiveDean Michael Berris2017-04-191-4/+4
| | | | | | | | Account for a potentially empty function name. Follow-up to D32153. llvm-svn: 300631
* [XRay][tools] Add option to llvm-xray extract to symbolize functionsDean Michael Berris2017-04-181-0/+10
| | | | | | | | | | | | | | Summary: This allows us to, if the symbol names are available in the binary, be able to provide the function name in the YAML output. Reviewers: dblaikie, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32153 llvm-svn: 300624
* [x86] add tests for potential andn optimization; NFCSanjay Patel2017-04-181-2/+40
| | | | llvm-svn: 300617
* [X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64.Chih-Hung Hsieh2017-04-182-0/+59
| | | | | | | | | | Android x86_64 target uses f128 type and stores f128 values in %xmm* registers. SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value from f128 to i128. Differential Revision: http://reviews.llvm.org/D32102 llvm-svn: 300583
* [X86][SSE] Add scheduling latency/throughput tests for (most) SSE1 instructionsSimon Pilgrim2017-04-181-0/+2415
| | | | llvm-svn: 300576
* [SLP vectorizer] Allow phi node reordering in tryToVectorizeList.Easwaran Raman2017-04-181-0/+54
| | | | | | | | | | | | | | | | | In tryToVectorizeList, under a very limited circumstance (when entered from tryToVectorizePair), the values may be reordered (swapped) and the SLP tree is built with the new order. This extends that to the case when starting from phis in vectorizeChainsInBlock when there are exactly two phis. The textual order of phi nodes shouldn't really matter. Without this change, the loop body in the accompnaying test case is fully vectorized when we swap the orde of the phis but not with this order. While this doesn't solve the phi-ordering problem in a general way (for more than 2 phis), this is simple fix that piggybacks on an existing mechanism and is useful in cases like multiplying two complex numbers. Differential revision: https://reviews.llvm.org/D32065 llvm-svn: 300574
* [DAG] Improve store merge candidate pruning.Nirav Dave2017-04-182-12/+4
| | | | | | | | | | | | | | Remove non-consecutive stores from store merge candidate search as they cannot be merged and will prevent us from finding subsequent mergeable store cases. Reviewers: jyknight, bogner, javed.absar, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32086 llvm-svn: 300561
* Add base-index-based store merge testNirav Dave2017-04-181-0/+31
| | | | llvm-svn: 300559
* Make globalaa-retained.ll test catching more cases.Nikolai Bozhenov2017-04-181-3/+43
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: * Add checks for store. That is needed because GlobalsAA is called twice in the current pipeline with different sets of Function passes following it. However, the loads are eliminated using instcombine which happens everywhere. On the other hand, DeadStoreElimination is performed only once so by checking for store we'll be able to catch more cases when GlobalsAA is invalidated unintentionally. * Add empty function above/below the test so that we don't depend on the relative order of instcombine/dead-store-elimination and the pass that invalidates the analysis (inside the same FunctionPassManager). Reviewers: kristof.beyls Reviewed By: kristof.beyls Subscribers: llvm-commits, n.bozhenov Differential Revision: https://reviews.llvm.org/D32015 Patch by Andrei Elovikov <andrei.elovikov@intel.com> llvm-svn: 300553
* Add store Merge test.Nirav Dave2017-04-181-0/+25
| | | | llvm-svn: 300551
* [ARM] Add hardware build attributes in assemblerOliver Stannard2017-04-182-234/+270
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the assembler, we should emit build attributes based on the target selected with command-line options. This matches the GNU assembler's behaviour. We only do this for build attributes which describe the hardware that is expected to be available, not the ones that describe ABI compatibility. This is done by moving some of the attribute emission code to ARMTargetStreamer, so that it can be shared between the assembly and code-generation code paths. Since the assembler only creates a MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to check raw features, and not use the convenience functions in ARMSubtarget. If different attributes are later specified using the .eabi_attribute directive, then they will take precedence, as happens when the same .eabi_attribute is specified twice. This must be enabled by an option, because we don't want to do this when parsing inline assembly. The attributes would match the ones emitted at the start of the file, so wouldn't actually change the emitted object file, but the extra directives would be added to every inline assembly block when emitting assembly, which we'd like to avoid. The majority of the changes in the build-attributes.ll test are just re-ordering the directives, because the hardware attributes are now emitted before the ABI ones. However, I did fix one bug which I spotted: Tag_CPU_arch_profile was not being emitted for v6M. Differential revision: https://reviews.llvm.org/D31812 llvm-svn: 300547
* [ARM] GlobalISel: Add support for G_SUBDiana Picus2017-04-185-0/+329
| | | | | | | Support G_SUB throughout the GlobalISel pipeline. It is exactly the same as G_ADD, nothing fancy. llvm-svn: 300546
* Revert "[GlobalISel] Support vector-of-pointers in LLT"Kristof Beyls2017-04-181-16/+0
| | | | | | | | | | | | | | | | This reverts r300535 and r300537. The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll produces slightly different code between LLVM versions being built with different compilers. E.g., dependent on the compiler LLVM is built with, either one of the following can be produced: remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement) remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement) Non-determinism like this is clearly a bad thing, so reverting this until I can find and fix the root cause of the non-determinism. llvm-svn: 300538
* [ARM] Check for correct HW div when lowering divmodDiana Picus2017-04-181-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For subtargets that use the custom lowering for divmod, e.g. gnueabi, we used to check if the subtarget has hardware divide and then lower to a div-mul-sub sequence if true, or to a libcall if false. However, judging by the usage of hasDivide vs hasDivideInARMMode, it seems that hasDivide only refers to Thumb. For instance, in the ARMTargetLowering constructor, the code that specifies whether to use libcalls for (S|U)DIV looks like this: bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide() : Subtarget->hasDivideInARMMode(); In the case of divmod for arm-gnueabi, using only hasDivide() to determine what to do means that instead of lowering to __aeabi_idivmod to get the remainder, we lower to div-mul-sub and then further lower the div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but not in Thumb, we generate a libcall instead of using it (this is not an issue in practice since AFAICT none of the cores that we support have hardware divide in ARM but not Thumb). This patch fixes the code dealing with custom lowering to take into account the mode (Thumb or ARM) when deciding whether or not hardware division is available. Differential Revision: https://reviews.llvm.org/D32005 llvm-svn: 300536
* [GlobalISel] Support vector-of-pointers in LLTKristof Beyls2017-04-181-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300535
* PR32382: Fix emitting complex DWARF expressions.Adrian Prantl2017-04-1814-29/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The DWARF specification knows 3 kinds of non-empty simple location descriptions: 1. Register location descriptions - describe a variable in a register - consist of only a DW_OP_reg 2. Memory location descriptions - describe the address of a variable 3. Implicit location descriptions - describe the value of a variable - end with DW_OP_stack_value & friends The existing DwarfExpression code is pretty much ignorant of these restrictions. This used to not matter because we only emitted very short expressions that we happened to get right by accident. This patch makes DwarfExpression aware of the rules defined by the DWARF standard and now chooses the right kind of location description for each expression being emitted. This would have been an NFC commit (for the existing testsuite) if not for the way that clang describes captured block variables. Based on how the previous code in LLVM emitted locations, DW_OP_deref operations that should have come at the end of the expression are put at its beginning. Fixing this means changing the semantics of DIExpression, so this patch bumps the version number of DIExpression and implements a bitcode upgrade. There are two major changes in this patch: I had to fix the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the *address* of a variable as the first argument, even if the argument is not an alloca. When lowering a DBG_VALUE, the decision of whether to emit a register location description or a memory location description depends on the MachineLocation — register machine locations may get promoted to memory locations based on their DIExpression. (Future) optimization passes that want to salvage implicit debug location for variables may do so by appending a DW_OP_stack_value. For example: DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8 DBG_VALUE, RAX --> DW_OP_reg0 +0 DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0 All testcases that were modified were regenerated from clang. I also added source-based testcases for each of these to the debuginfo-tests repository over the last week to make sure that no synchronized bugs slip in. The debuginfo-tests compile from source and run the debugger. https://bugs.llvm.org/show_bug.cgi?id=32382 <rdar://problem/31205000> Differential Revision: https://reviews.llvm.org/D31439 llvm-svn: 300522
* Build SymbolMap in SampleProfileLoader to help matchin function names with ↵Dehao Chen2017-04-172-0/+50
| | | | | | | | | | | | | | | | suffix. Summary: If there is suffix added in the function name (e.g. module hash added by thinLTO), we will not be able to find a match in profile as the suffix does not exist in profile. This patch build a map from function name to Function *. The map includes the entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline. Reviewers: davidxl, dnovillo, tejohnson Reviewed By: davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31952 llvm-svn: 300507
* Change the testcase tail-merge-after-mbp.ll to tail-merge-after-mbp.mirHaicheng Wu2017-04-172-94/+105
| | | | | | Differential Revision: https://reviews.llvm.org/D32037 llvm-svn: 300506
* [WebAssembly] Fix WebAssemblyOptimizeReturned after r300367Jacob Gravelle2017-04-171-0/+31
| | | | | | | | | | | | | | | | Summary: Refactoring changed paramHasAttr(1 + i) to paramHasAttr(0), fix that to paramHasAttr(i). Add more tests to WebAssemblyOptimizeReturned that catch that regression. Reviewers: dschuff Subscribers: jfb, sbc100, llvm-commits Differential Revision: https://reviews.llvm.org/D32136 llvm-svn: 300502
* [InstCombine] Matchers work with both ConstExpr and Instructions.Davide Italiano2017-04-171-0/+22
| | | | | | | | | | So, `cast<Instruction>` is not guaranteed to succeed. Change the code so that we create a new constant and use it in the newly created instruction, as it's done in other places in InstCombine. OK'ed by Sanjay/Craig. Fixes PR32686. llvm-svn: 300495
* [InstSimplify] add/move tests for (icmp X, C1 & icmp X, C2); NFCSanjay Patel2017-04-172-20/+2912
| | | | | | We simplify based on range intersection, but we're missing folds. llvm-svn: 300493
* Update the test to fix the buildbot failure introduced by r300486 (NFC)Dehao Chen2017-04-171-12/+12
| | | | llvm-svn: 300492
* Add GNU_discriminator support for inline callsites in llvm-symbolizer.Dehao Chen2017-04-174-28/+92
| | | | | | | | | | | | | | Summary: LLVM symbolize cannot recognize GNU_discriminator for inline callsites. This patch adds support for it. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32134 llvm-svn: 300486
* AMDGPU: Use MachineRegisterInfo to find max used registerMatt Arsenault2017-04-173-16/+51
| | | | | | | | | | Avoid looping through program to determine register counts. This avoids needing to look at regmask operands. Also fixes some counting errors with flat_scr when there are no stack objects. llvm-svn: 300482
* [CodeGenPrepare] Fix crash due to an invalid CFGBrendon Cahoon2017-04-171-0/+37
| | | | | | | | | | | | | | | The splitIndirectCriticalEdges function generates and invalid CFG when the 'Target' basic block is a loop to itself. When this occurs, the code that updates the predecessor terminator needs to update the terminator in the split basic block. This occurs when there is an edge from block D back to D. Since D is split in to D0 and D1, the code needs to update the terminator in D1. But D1 is not in the OtherPreds vector, so it was not getting updated. Differential Revision: https://reviews.llvm.org/D32126 llvm-svn: 300480
* AArch64: put nonlazybind special handling behind a flag for now.Tim Northover2017-04-171-1/+9
| | | | | | | | It's basically a terrible idea anyway but objc_msgSend gets emitted like that. We can decide on a better way to deal with it in the unlikely event that anyone actually uses it. llvm-svn: 300474
* AMDGPU: Test handling of R_AMDGPU_ABS64 in RelocVisitorKonstantin Zhuravlyov2017-04-171-0/+72
| | | | llvm-svn: 300472
* AMDGPU: Set CodePointerSize to 8 for amdgcnKonstantin Zhuravlyov2017-04-172-2/+75
| | | | llvm-svn: 300470
* Bitcode: Add a string table to the bitcode format.Peter Collingbourne2017-04-1711-139/+146
| | | | | | | | | | | | | | | | | | | | | | | Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table. This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module. On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%. As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html Differential Revision: https://reviews.llvm.org/D31838 llvm-svn: 300464
* AArch64: support nonlazybindTim Northover2017-04-171-0/+32
| | | | | | | | It's almost certainly not a good idea to actually use it in most cases (there's a pretty large code size overhead on AArch64), but we can't do those experiments until it's supported. llvm-svn: 300462
* AMDGPU: SimplifyDemandedElts for image intrinsicsMatt Arsenault2017-04-171-6/+1190
| | | | | | | | Causes some VGPR usage improvements in shaderdb, but introduces some SGPR spilling regressions due to random scheduling changes later. llvm-svn: 300453
* [LoopPeeling] Get rid of Phis that become invariant after N stepsMax Kazantsev2017-04-171-3/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a generalization of the improvement introduced in rL296898. Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes an invariant on the 2nd iteration. In more general case, if a Phi becomes invariant after N iterations, we can peel N times and turn it into invariant. In order to do this, we for every Phi in loop's header we define the Invariant Depth value which is calculated as follows: Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge]. If %y is a loop invariant, then Depth(%x) = 1. If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1. Otherwise, Depth(%x) is infinite. Notice that if we peel a loop, all Phis with Depth = 1 become invariants, and all other Phis with finite depth decrease the depth by 1. Thus, peeling N first iterations allows us to turn all Phis with Depth <= N into invariants. Reviewers: reames, apilipenko, mkuper, skatkov, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31613 llvm-svn: 300446
* [LoopPeeling] Fix condition for phi-eliminating peelingMax Kazantsev2017-04-172-1/+29
| | | | | | | | | | | | | | | | | When peeling loops basing on phis becoming invariants, we make a wrong loop size check. UP.Threshold should be compared against the total numbers of instructions after the transformation, which is equal to 2 * LoopSize in case of peeling one iteration. We should also check that the maximum allowed number of peeled iterations is not zero. Reviewers: sanjoy, anna, reames, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31753 llvm-svn: 300441
* [BPI] Use metadata info before any other heuristicsSerguei Katkov2017-04-171-0/+225
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Metadata potentially is more precise than any heuristics we use, so it makes sense to use first metadata info if it is available. However it makes sense to examine it against other strong heuristics like unreachable one. If edge coming to unreachable block has higher probability then it is expected by unreachable heuristic then we use heuristic and remaining probability is distributed among other reachable blocks equally. An example where metadata might be more strong then unreachable heuristic is as follows: it is possible that there are two branches and for the branch A metadata says that its probability is (0, 2^25). For the branch B the probability is (1, 2^25). So the expectation is that first edge of B is hotter than first edge of A because first edge of A did not executed at least once. If first edge of A points to the unreachable block then using the unreachable heuristics we'll set the probability for A to (1, 2^20) and now edge of A becomes hotter than edge of B. This is unexpected behavior. This fixed the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214 Reviewers: sanjoy, junbuml, vsk, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits, reames, davidxl Differential Revision: https://reviews.llvm.org/D30631 llvm-svn: 300440
* [InstCombine] Simplify 1/X for vectors.Craig Topper2017-04-171-2/+5
| | | | llvm-svn: 300439
* [InstCombine] Add test cases for missing support for simplifying 1/X for ↵Craig Topper2017-04-171-0/+18
| | | | | | vectors. NFC llvm-svn: 300438
* [InstCombine] Add support for vector srem->urem.Craig Topper2017-04-171-1/+1
| | | | llvm-svn: 300437
* [InstCombine] Add missing testcases for srem->urem conversion. The vector ↵Craig Topper2017-04-171-0/+22
| | | | | | version isn't currently supported. NFC llvm-svn: 300436
* [InstCombine] Add support for turning vector sdiv into udiv.Craig Topper2017-04-172-8/+5
| | | | llvm-svn: 300435
* [InstCombine] Add test cases for missing support for turning vector sdiv ↵Craig Topper2017-04-172-0/+26
| | | | | | into udiv. NFC llvm-svn: 300434
OpenPOWER on IntegriCloud