path: root/llvm
Commit message | Author | Age | Files | Lines
* [ARM] Split 128-bit vectors in BUILD_VECTOR lowering | Eli Friedman | 2016-12-14 | 5 files | -26/+84
| Given that INSERT_VECTOR_ELT operates on D registers anyway, combining 64-bit vectors into a 128-bit vector is basically free. Therefore, try to split BUILD_VECTOR nodes before giving up and lowering them to a series of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically better lowerings; see testcases for examples.
| Inspired by similar code in the x86 backend for AVX.
| Differential Revision: https://reviews.llvm.org/D27624
| llvm-svn: 289706
* fix gcc warning about a superfluous ; | Nico Weber | 2016-12-14 | 1 file | -1/+1
| llvm-svn: 289705
* [InstCombine] Folding of a compare with RHS const should merge debug locations | Robert Lougher | 2016-12-14 | 2 files | -1/+52
| If all the operands to a phi node are compares that have a RHS constant, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new op should be the merged debug locations of the phi node arguments.
| Patch 8 of 8 for D26256. Folding of a compare that has a RHS constant.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289704
* [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently. | Eli Friedman | 2016-12-14 | 5 files | -25/+255
| Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up.
| Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).
| Differential Revision: https://reviews.llvm.org/D27694
| llvm-svn: 289703
* [DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of FileName and Directory. | Amjad Aboud | 2016-12-14 | 5 files | -29/+33
| This way it will be easier to expand DIFile (e.g., to contain checksum) without the need to modify the createCompileUnit() API.
| Reviewers: llvm-commits, rnk
| Differential Revision: https://reviews.llvm.org/D27762
| llvm-svn: 289702
* Fix build failure due to r289674 on certain systems | Yaxun Liu | 2016-12-14 | 1 file | -1/+0
| Removed a useless include which caused conflict.
| llvm-svn: 289700
* [InstCombine] Folding of a binop with RHS const should merge the debug locations | Robert Lougher | 2016-12-14 | 2 files | -1/+50
| If all the operands to a phi node are a binop with a RHS constant, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new op should be the merged debug locations of the phi node arguments.
| Patch 7 of 8 for D26256. Folding of a binop with RHS constant.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289699
* DebugInfo: Improve type safety and simplify some subprogram finalization code | David Blaikie | 2016-12-14 | 2 files | -11/+9
| This probably ended up this way after the subprogram<>function link inversion and debug info metadata schema changes.
| llvm-svn: 289697
* [GVNHoist] Move GVNHoist to function simplification part of pipeline. | Geoff Berry | 2016-12-14 | 2 files | -2/+40
| Summary: Move GVNHoist to later in the optimization pipeline, specifically, to the function simplification part of the pipeline. The new pipeline location allows GVNHoist to run on a function after its callees have been inlined but before the function has been considered for inlining into its callers, exposing more opportunities for hoisting.
| Performance results on AArch64 kryo:
| Improvements:
|   Benchmarks/CoyoteBench/fftbench   -24.952%
|   spec2006/bzip2                     -4.071%
|   internal bmark                     -3.177%
|   Benchmarks/PAQ8p/paq8p             -1.754%
|   spec2000/perlbmk                   -1.328%
|   spec2006/h264ref                   -1.140%
| Regressions:
|   internal bmark                     +1.818%
|   Benchmarks/mafft/pairlocalalign    +1.084%
| Reviewers: sebpop, dberlin, hiraditya
| Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits
| Differential Revision: https://reviews.llvm.org/D27722
| llvm-svn: 289696
* [WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements | Andrew Kaylor | 2016-12-14 | 1 file | -1/+5
| Differential Revision: https://reviews.llvm.org/D27693
| llvm-svn: 289694
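The hazard named in this subject line is general to containers whose storage can move on insertion: a reference taken before the insert may dangle afterwards. The following is only a sketch of the safe pattern, using std::vector as a stand-in for DenseMap; the function name `growThenRead` and the `Colors` alias are hypothetical and are not taken from the WinEH code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a container whose storage may move when it grows.
// (std::vector is used here for illustration; llvm::DenseMap can also
// invalidate references to its entries when an insertion makes it grow.)
using Colors = std::vector<int>;

// Hypothetical helper: read an existing entry, insert a new one, then use
// the value that was read. Copying the value first means no reference into
// `colors` is live while push_back may reallocate the storage.
inline int growThenRead(Colors &colors, std::size_t idx, int extra) {
  assert(idx < colors.size() && "index out of range");
  const int saved = colors[idx]; // copy, not `const int &`
  colors.push_back(extra);       // may reallocate; `saved` is unaffected
  return saved;
}
```

Had `saved` been declared as a reference, returning it after the `push_back` could read freed memory whenever the vector reallocated.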
* [InstCombine] When folding casts through a phi node merge the debug locations | Robert Lougher | 2016-12-14 | 2 files | -1/+48
| If all the operands to a phi node are a cast, instcombine will try to pull them through the phi node, combining them into a single cast. When it does this, the debug location of the new cast should be the merged debug locations of the phi node arguments.
| Patch 6 of 8 for D26256. Folding of a cast operation.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289693
* Include <cstdarg> in PrettyStackTrace.cpp, fixing the bots. | Sean Callanan | 2016-12-14 | 1 file | -0/+1
| llvm-svn: 289691
* Prepare PrettyStackTrace for LLDB adoption | Sean Callanan | 2016-12-14 | 2 files | -5/+32
| This patch fixes the linkage for __crashtracer_info__, making it have the proper mangling (extern "C") and linkage (private extern). It also adds a new PrettyStackTrace type, allowing LLDB to adopt this instead of Host::SetCrashDescriptionWithFormat().
| Without this patch, CrashTracer on macOS won't pick up pretty stack traces from any LLVM client. An LLDB commit adopting this API will follow shortly.
| Differential Revision: https://reviews.llvm.org/D27683
| llvm-svn: 289689
* [InstCombine] Folding loads through a phi node should merge the debug locations | Robert Lougher | 2016-12-14 | 2 files | -1/+52
| If all the operands to a phi node are a load, instcombine will try to pull them through the phi node, combining them into a single load. When it does this, the debug location of the new load should be the merged debug locations of the phi node arguments.
| Patch 5 of 8 for D26256. Folding of a load operation.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289688
* [InstCombine] When folding GEP through a phi node merge the debug locations | Robert Lougher | 2016-12-14 | 2 files | -1/+52
| If all the operands to a phi node are getelementptr, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new getelementptr should be the merged debug locations of the phi node arguments.
| Patch 4 of 8 for D26256. Folding of a getelementptr operation.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289684
* This change does two things: | Eric Christopher | 2016-12-14 | 2 files | -1/+6
| 1. Adds a "Discriminator" field to struct DILineInfo, which defaults to 0.
| 2. Fills out the "Discriminator" field in DILineInfo in DWARFDebugLine::LineTable::getFileLineInfoForAddress().
| This is done in order to have a slightly nicer interface in getFileLineInfoForAddress.
| Patch by Simon Que!
| Differential Revision: https://reviews.llvm.org/D27649
| llvm-svn: 289683
* [InstCombine] Merge debug locations when folding through a phi node | Robert Lougher | 2016-12-14 | 2 files | -1/+52
| If all the operands to a phi node are of the same operation, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the operation should be the merged debug locations of the phi node arguments.
| Patch 3 of 8 for D26256. Folding of a compare operation.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289681
* [libFuzzer] disable msan for one more hook that reads target's data that might be uninitialized | Kostya Serebryany | 2016-12-14 | 1 file | -0/+3
| llvm-svn: 289680
* [InstCombine] Merge debug locations when folding through a phi node | Robert Lougher | 2016-12-14 | 3 files | -1/+91
| If all the operands to a phi node are of the same operation, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the operation should be the merged debug locations of the phi node arguments.
| Patch 2 of 8 for D26256. Folding of a binary operation.
| Differential Revision: https://reviews.llvm.org/D26256
| llvm-svn: 289679
* revert r289669 which breaks bots | Dehao Chen | 2016-12-14 | 2 files | -7/+0
| llvm-svn: 289676
* AMDGPU: Emit runtime metadata version 2 as YAML | Yaxun Liu | 2016-12-14 | 14 files | -2706/+806
| Differential Revision: https://reviews.llvm.org/D25046
| llvm-svn: 289674
* lit.cfg: Check value of build config rather than converting to boolean | Derek Schuff | 2016-12-14 | 1 file | -1/+1
| This is a CMake var which never evaluates to false.
| llvm-svn: 289673
* AMDGPU: Make AllocationPriority of SGPRs higher than VGPRs | Matt Arsenault | 2016-12-14 | 1 file | -11/+13
| Since SGPRs should spill to VGPRs, they should be allocated first.
| I don't think this is sufficient for SGPRs to always spill to VGPRs though.
| llvm-svn: 289671
* Create SampleProfileLoader pass in llvm instead of clang | Dehao Chen | 2016-12-14 | 2 files | -0/+7
| Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.
| Reviewers: tejohnson, davidxl, dnovillo
| Subscribers: llvm-commits, mehdi_amini
| Differential Revision: https://reviews.llvm.org/D27743
| llvm-svn: 289669
* Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." | Nirav Dave | 2016-12-14 | 67 files | -1781/+2004
| Reverting due to ARM MCJIT and MIPS LLD error.
| This reverts commit r289659.
| llvm-svn: 289667
* AMDGPU: Change vintrp printing | Matt Arsenault | 2016-12-14 | 6 files | -64/+95
| llvm-svn: 289664
* Revert gold part of change, just liblto | Derek Schuff | 2016-12-14 | 2 files | -2/+1
| llvm-svn: 289663
* Disable libLTO tests when libLTO is not built | Derek Schuff | 2016-12-14 | 2 files | -2/+4
| Summary: The current test only checks whether ld64 is available, causing tests to fail when ld64 is available but libLTO is not built.
| Reviewers: beanz, mehdi_amini
| Subscribers: mehdi_amini, llvm-commits
| Differential Revision: https://reviews.llvm.org/D27739
| llvm-svn: 289662
* New API for merging debug locations. NFC. | Robert Lougher | 2016-12-14 | 1 file | -0/+18
| Given two debug locations the function getMergedLocation combines the locations into a single location (which may be an empty location). Please see https://reviews.llvm.org/D26256 for the discussion leading up to this API.
| Note the function is currently a stub. This allows optimisations to use the API although no location will actually be used.
| This is patch 1 out of 8 for D26256. As suggested by David Blaikie, each change in D26256 has been broken out into a separate patch.
| llvm-svn: 289661
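To make the idea of merging two locations into one that "may be an empty location" concrete, here is a minimal self-contained sketch. It does not reproduce LLVM's getMergedLocation (which the message describes as a stub at this revision); the `Loc` type, `isEmpty`, `mergeLocations`, and the "keep only if identical" policy are all assumptions chosen for illustration.

```cpp
#include <string>

// Hypothetical, simplified source location (not LLVM's DILocation).
// An empty file name together with line 0 stands for "no location".
struct Loc {
  std::string file;
  unsigned line;
};

inline bool isEmpty(const Loc &l) { return l.file.empty() && l.line == 0; }

// Illustrative merge policy (an assumption, not LLVM's actual logic):
// keep the location only when both inputs agree; otherwise return the
// empty location so the combined instruction claims no misleading line.
inline Loc mergeLocations(const Loc &a, const Loc &b) {
  if (!isEmpty(a) && a.file == b.file && a.line == b.line)
    return a;
  return Loc{"", 0};
}
```

Under this policy, an instruction produced by folding two phi operands from different source lines would carry no line at all rather than arbitrarily inheriting one of them.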
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. | Nirav Dave | 2016-12-14 | 67 files | -2004/+1781
| Retrying after removing load-store factoring through token factors in favor of improved token factor operand pruning.
| Simplify Consecutive Merge Store Candidate Search
| Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic.
| When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions).
| Additional Minor Changes:
|   1. Finishes removing unused AliasLoad code
|   2. Unifies the chain aggregation in the merged stores across code paths
|   3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
|   4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests.
| This finishes the change Matt Arsenault started in r246307 and jyknight's original patch.
| Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations.
| Noteworthy tests:
|   CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right.
|   CodeGen/AArch64/arm64-memset-inline.ll -
|   CodeGen/AArch64/arm64-stur.ll -
|   CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently.
|   CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself?
|   CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding.
|   CodeGen/X86/dag-merge-fast-accesses.ll -
|   CodeGen/X86/MergeConsecutiveStores.ll -
|   CodeGen/X86/stores-merging.ll -
|   CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores
|   CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls
|   CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores
|   CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are.
|   CodeGen/X86/vector-idiv.ll -
|   CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing.
|   CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged.
| Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
| Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel
| Differential Revision: https://reviews.llvm.org/D14834
| llvm-svn: 289659
* Wdocumentation fix | Simon Pilgrim | 2016-12-14 | 1 file | -2/+1
| llvm-svn: 289655
* [DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2 | Simon Pilgrim | 2016-12-14 | 7 files | -268/+163
| Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases.
| Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value.
| Differential Revision: https://reviews.llvm.org/D27714
| llvm-svn: 289654
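The arithmetic these combines rely on can be shown on plain integers: once a divisor is known to be a power of two, an unsigned divide becomes a right shift by log2 of the divisor and an unsigned remainder becomes a mask. The sketch below is standalone C++ with hypothetical helper names; it does not use or model the SelectionDAG API, and it covers only the unsigned cases.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit is set.
inline bool isPowerOfTwo(std::uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

// log2 of a value that is known to be a power of two.
inline unsigned logBase2(std::uint32_t v) {
  assert(isPowerOfTwo(v) && "caller must prove the value is a power of two");
  unsigned shift = 0;
  while ((v >>= 1) != 0)
    ++shift;
  return shift;
}

// x / d, rewritten as a shift; valid only when d is a power of two.
inline std::uint32_t udivPow2(std::uint32_t x, std::uint32_t d) {
  return x >> logBase2(d);
}

// x % d, rewritten as a mask; valid only when d is a power of two.
inline std::uint32_t uremPow2(std::uint32_t x, std::uint32_t d) {
  assert(isPowerOfTwo(d) && "caller must prove the value is a power of two");
  return x & (d - 1);
}
```

The benefit described in the commit is that the proof of "power of two" no longer requires a constant: a divisor computed as `1u << y` qualifies even though its value is unknown at compile time.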
* Fix bug 30945 - [AVX512] Failure to flip vector comparison to remove not mask instruction | Michael Zuckerman | 2016-12-14 | 2 files | -11/+20
| Adds a new optimization opportunity by adding a new X86ISelLowering pattern. The test case was shown in https://llvm.org/bugs/show_bug.cgi?id=30945.
| Test explanation: Select takes three arguments: mask, op and op2. In this case, the mask is the result of an ICMP. That ICMP compares (for equality) the zero initializer vector with the result of the first ICMP. In general, the result of "cmp eq, op1, zero initializers" is "not(op1)", where op1 is a mask. By swapping the two value arguments of the Select instruction, we get the same result without needing the intermediate step ("cmp eq, op1, zero initializers").
| Missed optimization opportunity:
|   vpcmpled %zmm0, %zmm1, %k0
|   knotw %k0, %k1
| can be combined into:
|   vpcmpgtd %zmm0, %zmm2, %k1
| Reviewers: 1. delena 2. igorb
| Committed after check all
| Differential Revision: https://reviews.llvm.org/D27160
| llvm-svn: 289653
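The two identities this message leans on can be checked with a small scalar model: inverting a "less than or equal" mask gives the "greater than" mask, and selecting with an inverted mask is the same as selecting with the original mask and swapped operands. This is a sketch only; `Vec`, `Mask`, and the helper functions are hypothetical stand-ins for a 16-lane i32 vector and a 16-bit mask register, not LLVM or intrinsic APIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec = std::array<std::int32_t, 16>; // models 16 lanes of i32
using Mask = std::uint16_t;               // models a 16-bit mask register

// Bit i is set when a[i] <= b[i] (signed).
inline Mask cmpLE(const Vec &a, const Vec &b) {
  Mask m = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] <= b[i])
      m = static_cast<Mask>(m | (1u << i));
  return m;
}

// Bit i is set when a[i] > b[i] (signed).
inline Mask cmpGT(const Vec &a, const Vec &b) {
  Mask m = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] > b[i])
      m = static_cast<Mask>(m | (1u << i));
  return m;
}

// Bitwise inversion of the whole mask.
inline Mask maskNot(Mask m) { return static_cast<Mask>(~m); }

// Lane-wise select: take onTrue[i] where bit i of m is set, else onFalse[i].
inline Vec select(Mask m, const Vec &onTrue, const Vec &onFalse) {
  Vec r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = ((m >> i) & 1u) ? onTrue[i] : onFalse[i];
  return r;
}
```

With these definitions, `maskNot(cmpLE(a, b))` equals `cmpGT(a, b)` for any inputs, and `select(maskNot(m), x, y)` equals `select(m, y, x)`, which is why the separate mask inversion can be dropped.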
* [X86][SSE] Add AVX1 tests to sdiv/udiv srem/urem combine tests | Simon Pilgrim | 2016-12-14 | 4 files | -80/+180
| As requested on D27714
| llvm-svn: 289652
* Revert "[AVR] Add the very first on-target test" | Renato Golin | 2016-12-14 | 4 files | -22/+1
| This reverts commit r289648, as it's an execution test and relies on the emulator/dispatcher being available on all builders.
| llvm-svn: 289651
* Adapt to recent APFloat change | Stephan Bergmann | 2016-12-14 | 1 file | -6/+6
| llvm-svn: 289649
* [AVR] Add the very first on-target test | Dylan McKay | 2016-12-14 | 4 files | -1/+22
| This test runs on actual AVR hardware.
| llvm-svn: 289648
* Replace APFloatBase static fltSemantics data members with getter functions | Stephan Bergmann | 2016-12-14 | 36 files | -1076/+1098
| At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic.
| Differential Revision: https://reviews.llvm.org/D26671
| llvm-svn: 289647
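The general shape of such a refactoring can be sketched in a few lines: a public static data member is replaced by a static accessor function, so clients call a function instead of referencing a data symbol in another binary. The names below (`Semantics`, `FloatBase`, `kIEEEsingle`) are hypothetical simplifications and do not reproduce the APFloatBase declarations.

```cpp
// Hypothetical, simplified semantics record.
struct Semantics {
  unsigned precision; // number of significand bits, including the implicit bit
};

class FloatBase {
public:
  // Before (sketch): `static const Semantics IEEEsingle;` was a data member
  // that every client binary had to import as a data symbol.
  // After: clients call a function, and the object stays private to the
  // translation unit that defines it.
  static const Semantics &IEEEsingle();
};

namespace {
// In a real library this definition would live in the library's own .cpp file.
const Semantics kIEEEsingle = {24};
} // namespace

const Semantics &FloatBase::IEEEsingle() { return kIEEEsingle; }
```

Call sites change from `FloatBase::IEEEsingle` to `FloatBase::IEEEsingle()`, which would explain why a change like this touches many files even though behaviour stays the same.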
* Add a couple of assertions to the load combine code introduced by r289538 | Artur Pilipenko | 2016-12-14 | 1 file | -1/+5
| llvm-svn: 289646
* [AVR] Add the integrated testing tool to the .gitignore | Dylan McKay | 2016-12-14 | 1 file | -0/+2
| We build it as an LLVM tool.
| llvm-svn: 289645
* [Assembler] Better error messages for .org directive | Oliver Stannard | 2016-12-14 | 14 files | -42/+87
| Currently, the error messages we emit for the .org directive when the expression is not absolute or is out of range do not include the line number of the directive, so it can be hard to track down the problem if a file contains many .org directives.
| This patch stores the source location in the MCOrgFragment, so that it can be used for diagnostics emitted during layout.
| Since layout is an iterative process, and the errors are detected during each iteration, it would have been possible for errors to be reported multiple times. To prevent this, I've made the assembler bail out after each iteration if any errors have been reported. This will still allow multiple unrelated errors to be reported in the common case where they are all detected in the first round of layout.
| Differential Revision: https://reviews.llvm.org/D27411
| llvm-svn: 289643
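The two ideas in this message, carrying a source line with each .org fragment and stopping after the first layout iteration that reports errors, can be modelled in a toy form. This sketch is not the MC layer: `OrgFragment`, `layoutOnce`, `layout`, the error wording, and the "target behind current offset" check are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical fragment: the offset requested by a .org directive plus the
// source line it came from, so diagnostics can point at the directive.
struct OrgFragment {
  long target;
  unsigned line;
};

// One layout pass over the fragments. In this toy model a .org target that
// lies behind the current offset is an error; it is reported with its line.
inline void layoutOnce(const std::vector<OrgFragment> &frags,
                       std::vector<std::string> &errors) {
  long pos = 0;
  for (const OrgFragment &f : frags) {
    if (f.target < pos)
      errors.push_back("line " + std::to_string(f.line) +
                       ": .org target is behind the current offset");
    else
      pos = f.target;
  }
}

// Iterative layout that bails out after any iteration that reported errors,
// so the same error is never emitted once per iteration.
inline std::vector<std::string> layout(const std::vector<OrgFragment> &frags,
                                       int maxIterations) {
  std::vector<std::string> errors;
  for (int i = 0; i < maxIterations; ++i) {
    layoutOnce(frags, errors);
    if (!errors.empty())
      break; // all errors from this round are kept, later rounds are skipped
  }
  return errors;
}
```

Note that every error found in the first failing round is still reported, which matches the message's point that unrelated errors detected in the same round are not lost.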
* [AVR] Add a function instrumentation pass | Dylan McKay | 2016-12-14 | 5 files | -0/+269
| This will be used for an on-chip test suite.
| llvm-svn: 289641
* [X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics. | Craig Topper | 2016-12-14 | 2 files | -1/+177
| llvm-svn: 289639
* [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) | Hal Finkel | 2016-12-14 | 3 files | -44/+158
| This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts:
|   1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration.
|   2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction).
| First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places.
| Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections).
| There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return.
| Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. This will yield subtle bugs if interposition is attempted. As a result, regardless of whether we're in PIC mode, we don't assume that we need to add the nop to support the possibility of ELF interposition. However, the necessary check is in place (i.e. calling GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for which interposition is allowed at the IR level, we'll add the nop as necessary. In the meantime, we'll generate more tail calls and fewer nops when compiling position-independent code.
| Differential Revision: https://reviews.llvm.org/D27231
| llvm-svn: 289638
* [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better. | Craig Topper | 2016-12-14 | 3 files | -20/+194
| Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking.
| llvm-svn: 289636
* [X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts. | Craig Topper | 2016-12-14 | 2 files | -15/+22
| Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly.
| llvm-svn: 289632
* [ThinLTO] Add an API to trigger file-based API for returning objects to the linker | Mehdi Amini | 2016-12-14 | 10 files | -24/+239
| Summary: The motivation is to better support the -object_path_lto option on Darwin. The linker needs to write the generated object files to disk for later use by lldb or dsymutil (debug info is not present in the final binary). We're moving this into libLTO so that we can be smarter when a cache is enabled and hard-link when possible instead of duplicating the files.
| Reviewers: tejohnson, deadalnix, pcc
| Subscribers: dexonsmith, llvm-commits
| Differential Revision: https://reviews.llvm.org/D27507
| llvm-svn: 289631
* [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly. | Craig Topper | 2016-12-14 | 2 files | -38/+21
| Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0.
| Also calculate UndefElts correctly.
| Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support.
| llvm-svn: 289629
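The mask manipulation described here is small enough to write out. In this sketch the demanded elements are a plain bitmask with one bit per vector lane; `OperandDemand` and `splitDemandForScalarRound` are hypothetical names, and the code only models the bit arithmetic the message describes, not InstCombine itself.

```cpp
#include <cstdint>

// Demanded-element masks for the two vector operands of a scalar round
// style operation: bit i set means lane i of that operand is needed.
struct OperandDemand {
  std::uint32_t op0; // operand supplying the upper lanes of the result
  std::uint32_t op1; // operand supplying lane 0 of the result
};

// Split the demand on the result between the operands, as described in the
// commit message: operand 1 is only ever asked for lane 0, and lane 0 is
// never demanded from operand 0.
inline OperandDemand splitDemandForScalarRound(std::uint32_t demanded) {
  OperandDemand d;
  d.op0 = demanded & ~1u; // clear bit 0
  d.op1 = demanded & 1u;  // keep only bit 0
  return d;
}
```

If the caller never uses lane 0, `op1` comes out as zero, meaning nothing at all is demanded from operand 1 in this model.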
* [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly. | Craig Topper | 2016-12-14 | 2 files | -20/+34
| Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused.
| Also calculate UndefElts correctly.
| Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support.
| llvm-svn: 289628
* Don't double-initialize cl::opt for iterating in reverse order to uncover non-determinism in codegen by default | Mehdi Amini | 2016-12-14 | 1 file | -1/+1
| Bots are broken and need to be fixed before having this on by default. The feature was committed in r289619. I tried to disable it in r289624 and failed because it was initialized in two places.
| llvm-svn: 289626