summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Fix a -Wsign-compareReid Kleckner2018-08-061-1/+1
| | | | llvm-svn: 339059
* [X86] Fix assertion in subreg extractionReid Kleckner2018-08-061-1/+1
| | | | | | | | | | | This assert fires when attempting to extract a subregister from the global PIC base register. This virtual register SD node is not in the VRBaseMap, so we shouldn't call getVR to look it up there. If this is a RegisterSDNode, we should be able to use the virtual register directly. Fixes PR38385 llvm-svn: 339056
* [SLC] Fix shrinking of pow()Evandro Menezes2018-08-061-13/+17
| | | | | | | | | Properly shrink `pow()` to `powf()` as a binary function and, when no other simplification applies, do not discard it. Differential revision: https://reviews.llvm.org/D50113 llvm-svn: 339046
* [llvm-pdbutil] Support PDBs without a DBI streamAlexandre Ganea2018-08-061-1/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D50258 llvm-svn: 339045
* [X86] Recognize a splat of negate in isFNEGEaswaran Raman2018-08-061-18/+77
| | | | | | | | | | | | | | | | | | | Summary: Expand isFNEG so that we generate the appropriate F(N)M(ADD|SUB) instructions in more cases. For example, the following sequence a = _mm256_broadcast_ss(f) d = _mm256_fnmadd_ps(a, b, c) generates an fsub and fma without this patch and an fnma with this change. Reviewers: craig.topper Subscribers: llvm-commits, davidxl, wmi Differential Revision: https://reviews.llvm.org/D48467 llvm-svn: 339043
* [X86] When using "and $0" and "orl $-1" to store 0 and -1 for minsize, make ↵Craig Topper2018-08-061-6/+12
| | | | | | | | | | sure the store isn't volatile If the store is volatile this might be a memory mapped IO access. In that case we shouldn't generate a load that didn't exist in the source Differential Revision: https://reviews.llvm.org/D50270 llvm-svn: 339041
* [RegisterCoalescer] Delay live interval update work until the rematerializationWei Mi2018-08-061-6/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for all the uses from the same def is done. We run into a compile time problem with flex generated code combined with `-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants outside of a big loop, and drastically increases the compile time in global register splitting and copy coalescing. https://reviews.llvm.org/D49353 relieves the problem in global splitting. This patch is to handle the problem in copy coalescing. About the situation where the problem in copy coalescing happens. After machineLICM, we have several defs outside of a big loop with hundreds or thousands of uses inside the loop. Rematerialization in copy coalescing happens for each use and everytime rematerialization is done, shrinkToUses will be called to update the huge live interval. Because we have 'n' uses for a def, and each live interval update will have at least 'n' complexity, the total update work is n^2. To fix the problem, we try to do the live interval update work in a collective way. If a def has many copylike uses larger than a threshold, each time rematerialization is done for one of those uses, we won't do the live interval update in time but delay that work until rematerialization for all those uses are completed, so we only have to do the live interval update work once. Delaying the live interval update could potentially change the copy coalescing result, so we hope to limit that change to those defs with many (like above a hundred) copylike uses, and the cutoff can be adjusted by the option -mllvm -late-remat-update-threshold=xxx. Differential Revision: https://reviews.llvm.org/D49519 llvm-svn: 339035
* Fix raw_fd_ostream::write_impl hang due to an infinite loop with large outputOwen Reynolds2018-08-061-4/+4
| | | | | | | | | | On windows when raw_fd_ostream::write_impl calls write, a 32 bit input is required for character count. As a variable with size_t is used for this argument, on x64 integral demotion occurs. In the case of large files an infinite loop follows. See: https://bugs.llvm.org/show_bug.cgi?id=37926 This fix allows the output of files larger than the previous int32 limit. Differential Revision: https://reviews.llvm.org/D48948 llvm-svn: 339027
* AMDGPU: Fold v_lshl_or_b32 with 0 src0Matt Arsenault2018-08-061-0/+13
| | | | | | Appears from expansion of some packed cases. llvm-svn: 339025
* ValueTracking: Handle canonicalize in CannotBeNegativeZeroMatt Arsenault2018-08-061-0/+1
| | | | | | | Also fix apparently missing test coverage for any of the handling here. llvm-svn: 339023
* [NFC] Fixed unused function warningsDavid Bolvansky2018-08-061-0/+2
| | | | llvm-svn: 339021
* Revert unused function fixDavid Bolvansky2018-08-061-1/+1
| | | | llvm-svn: 339020
* [NFC] Fixed unused function warningDavid Bolvansky2018-08-061-1/+1
| | | | llvm-svn: 339019
* [AArch64] Fix assertion failure on widened f16 BUILD_VECTORBryan Chan2018-08-061-0/+9
| | | | | | | | | | | | | | | | Summary: Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the same type. This fixes an assertion failure in VerifySDNode. Reviewers: SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50202 llvm-svn: 339013
* Fix modules build with different technique to suppress Knuth debuggingTim Northover2018-08-061-37/+33
| | | | | | | | | | Currently we use #pragma push_macro(LLVM_DEBUG) to fiddle with the LLVM_DEBUG macro so that we can silence debugging the Knuth division algorithm unless it's actually desired. Unfortunately this is incompatible with enabling modules while building LLVM (via LLVM_ENABLE_MODULES=ON), probably due to a bug being fixed by D33004. llvm-svn: 339009
* ARM-MachO: don't add Thumb bit for addend to non-external relocation.Tim Northover2018-08-061-0/+1
| | | | | | | | | ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes out that part of any addend in an object file. But it only does that for symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting that extra bit in other cases. llvm-svn: 339007
* Re-enable "[ValueTracking] Teach isKnownNonNullFromDominatingCondition about ↵Max Kazantsev2018-08-061-10/+33
| | | | | | | | | | | AND" The patch was reverted because of bug detected by sanitizer. The bug is fixed, respective tests added. Differential Revision: https://reviews.llvm.org/D50172 llvm-svn: 339005
* Revert rL338990 to see if it causes sanitizer failuresMax Kazantsev2018-08-061-28/+10
| | | | | | | | | Multiple failues reported by sanitizer-x86_64-linux, seem to be caused by this patch. Reverting to see if they sustain without it. Differential Revision: https://reviews.llvm.org/D50172 llvm-svn: 338994
* Try to fix buildbotMax Kazantsev2018-08-061-1/+1
| | | | llvm-svn: 338991
* [ValueTracking] Teach isKnownNonNullFromDominatingCondition about ANDMax Kazantsev2018-08-061-10/+28
| | | | | | | | | | | `isKnownNonNullFromDominatingCondition` is able to prove non-null basing on `br` or `guard` by `%p != null` condition, but is unable to do so basing on `(%p != null) && %other_cond`. This patch allows it to do so. Differential Revision: https://reviews.llvm.org/D50172 Reviewed By: reames llvm-svn: 338990
* [GuardWidening] Widen guards with conditions of frequently taken dominated ↵Max Kazantsev2018-08-061-34/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | branches If there is a frequently taken branch dominated by a guard, and its condition is available at the point of the guard, we can widen guard with condition of this branch and convert the branch into unconditional: guard(cond1) if (cond2) { // taken in 99.9% cases // do something } else { // do something else } Converts to guard(cond1 && cond2) // do something Differential Revision: https://reviews.llvm.org/D49974 Reviewed By: reames llvm-svn: 338988
* [NFC] Fix typoXin Tong2018-08-061-1/+1
| | | | llvm-svn: 338987
* [NFC] Fixed unused function warningDavid Bolvansky2018-08-061-0/+2
| | | | llvm-svn: 338986
* [DebugInfo] Refactor DbgInfoIntrinsic class hierarchy.Hsiangkai Wang2018-08-0610-61/+56
| | | | | | | | | | | | | | | | In the past, DbgInfoIntrinsic has a strong assumption that these intrinsics all have variables and expressions attached to them. However, it is too strong to derive the class for other debug entities. Now, it has problems for debug labels. In order to make DbgInfoIntrinsic as a base class for 'debug info', I create a class for 'variable debug info', DbgVariableIntrinsic. DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it. Differential Revision: https://reviews.llvm.org/D50220 llvm-svn: 338984
* [ORC] Remove an incorrect use of 'cantFail'.Lang Hames2018-08-051-2/+4
| | | | | | | | This code was moved out from BasicObjectLayerMaterializationUnit, which required the supplied object to be well formed. The getObjectSymbolFlags function does not require a well-formed object, so we have to propagate the error here. llvm-svn: 338975
* [ORC] Change JITSymbolFlags debug output, add a function for getting a symbolLang Hames2018-08-052-30/+39
| | | | | | flags map from a buffer representing an object file. llvm-svn: 338974
* Enrich inline messagesDavid Bolvansky2018-08-055-111/+152
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions. 1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message. 2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision. 3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost. 4. Adjusted tests for changed printing. Patch by: yrouban (Yevgeny Rouban) Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00 Reviewed By: tejohnson, xbolva00 Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D49412 llvm-svn: 338969
* Revert "Add a warning if someone attempts to add extra section flags to ↵Eric Christopher2018-08-051-36/+16
| | | | | | | | | | | sections" There are a bunch of edge cases and inconsistencies in how we're emitting sections cause this warning to fire and it needs more work. This reverts commit r335558. llvm-svn: 338968
* [TailCallElim] Preserve DT and PDTChijun Sima2018-08-042-27/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously, in the NewPM pipeline, TailCallElim recalculates the DomTree when it modifies any instruction in the Function. For example, ``` CallInst *CI = dyn_cast<CallInst>(&I); ... CI->setTailCall(); Modified = true; ... if (!Modified || ...) return PreservedAnalyses::all(); ``` After applying this patch, the DomTree only recalculates if needed (plus an extra insertEdge() + an extra deleteEdge() call). When optimizing SQLite with `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculation decreases by 6.2%, the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree will decrease approximately 1%~2.5% after applying the patch. Statistics: ``` Before the patch: 23010 dom-tree-stats - Number of DomTree recalculations 489264 dom-tree-stats - Number of nodes visited by DFS -- DomTree After the patch: 21581 dom-tree-stats - Number of DomTree recalculations 475088 dom-tree-stats - Number of nodes visited by DFS -- DomTree ``` Reviewers: kuhar, dmgreen, brzycki, grosser, davide Reviewed By: kuhar, brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49982 llvm-svn: 338954
* [ADCE] Remove the need of DomTreeChijun Sima2018-08-041-8/+10
| | | | | | | | | | | | | | Summary: ADCE doesn't need to query domtree. Reviewers: kuhar, brzycki, dmgreen, davide, grosser Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49988 llvm-svn: 338950
* Reverted r338825 and all the following tries to fix issues introduced by ↵Galina Kistanova2018-08-042-380/+0
| | | | | | | | that commit (r338826, r338827, r338829, r338880). This commit has broken build bots and has been left unattended for too long. llvm-svn: 338948
* [GISel]: Add Opcodes for CTLZ/CTTZ/CTPOPAditya Nandakumar2018-08-041-0/+20
| | | | | | | | https://reviews.llvm.org/D48600 Added IRTranslator support to translate these known intrinsics into GISel opcodes. llvm-svn: 338944
* Fix buildbot breakage.Rui Ueyama2018-08-041-2/+1
| | | | llvm-svn: 338940
* Use the same constants as zlib to represent compression level.Rui Ueyama2018-08-041-17/+4
| | | | | | | | | | This change allows users pass compression level that was not listed in the enum. Also, I think using different values than zlib's compression levels was just confusing. Differential Revision: https://reviews.llvm.org/D50196 llvm-svn: 338939
* [X86] Add isel patterns for atomic_load+sub+atomic_sub.Craig Topper2018-08-031-2/+1
| | | | | | Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register. llvm-svn: 338930
* [X86] Remove RELEASE_ and ACQUIRE_ pseudo instructions. Use isel patterns ↵Craig Topper2018-08-032-170/+73
| | | | | | | | | | | | and the normal instructions instead At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering. So from what I can tell we shouldn't need additional pseudos since they aren't carry any flags that are different from the normal instructions. The only thing I can think of is that we may consider them for load folding candidates in the peephole pass now where we didn't before. If that's important hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions. Differential Revision: https://reviews.llvm.org/D50212 llvm-svn: 338925
* [TRE][DebugInfo] Preserve Debug Location in new branch instructionAnastasis Grammenos2018-08-031-1/+2
| | | | | | | | | There are two branch instructions created so the new test covers them both. Differential Revision: https://reviews.llvm.org/D50263 llvm-svn: 338917
* [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked ↵Craig Topper2018-08-031-11/+28
| | | | | | | | | | store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. llvm-svn: 338915
* [Support] Don't initialize compressed buffer allocated by zlib::compressFangrui Song2018-08-031-2/+2
| | | | | | | | | | | resize() (zeroing) makes every allocated page resident. The actual size of the compressed buffer is usually much smaller. Making every page resident is wasteful. When linking a test binary with ~1.9GiB uncompressed debug info with LLD, this optimization decreases max RSS by ~1.5GiB. Differential Revision: https://reviews.llvm.org/50223 llvm-svn: 338913
* DAG: Enhance isKnownNeverNaNMatt Arsenault2018-08-035-17/+192
| | | | | | | | | | | | Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. llvm-svn: 338910
* [NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").Artem Belevich2018-08-033-5/+12
| | | | | | | | | | | | | | Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement. Reviewers: jlebar Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D50207 llvm-svn: 338908
* [X86] Add a DAG combine for the __builtin_parity idiom used by clang to ↵Craig Topper2018-08-031-0/+71
| | | | | | | | | | | | | | enable better codegen Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag. Even when popcnt is supported, its still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and. I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful. Differential Revision: https://reviews.llvm.org/D50165 llvm-svn: 338907
* [SLC] Refactor shrinking of functions (NFC)Evandro Menezes2018-08-031-72/+55
| | | | | | | Merge the helper functions for shrinking unary and binary functions into a single one, while keeping all their functionality. Otherwise, NFC. llvm-svn: 338905
* Fix crash in bounds checking.Joel Galenson2018-08-031-39/+43
| | | | | | | | In r337830 I added SCEV checks to enable us to insert fewer bounds checks. Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues. This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions. Differential Revision: https://reviews.llvm.org/D49946 llvm-svn: 338902
* [Partial Inlining] Fix small bug in detecting if we did somethingGraham Yiu2018-08-031-3/+1
| | | | | | | - It's possible for 'Changed' to return as false even if we did partial inline something. Fixed to accumulate return values llvm-svn: 338896
* [WebAssembly] Cleanup of the way globals and global flags are handledNicholas Wilson2018-08-0310-17/+36
| | | | | | Differential Revision: https://reviews.llvm.org/D44030 llvm-svn: 338894
* [Dominators] Make RemoveUnreachableBlocks return false if the BasicBlock is ↵Chijun Sima2018-08-031-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | already awaiting deletion Summary: Previously, `removeUnreachableBlocks` still returns true (which indicates the CFG is changed) even when all the unreachable blocks found is awaiting deletion in the DDT class. This makes code pattern like ``` // Code modified from lib/Transforms/Scalar/SimplifyCFGPass.cpp bool EverChanged = removeUnreachableBlocks(F, nullptr, DDT); ... do { EverChanged = someMightHappenModifications(); EverChanged |= removeUnreachableBlocks(F, nullptr, DDT); } while (EverChanged); ``` become a dead loop. Fix this by detecting whether a BasicBlock is already awaiting deletion. Reviewers: kuhar, brzycki, dmgreen, grosser, davide Reviewed By: kuhar, brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49738 llvm-svn: 338882
* convert some tabs to spacesNico Weber2018-08-031-1/+1
| | | | llvm-svn: 338880
* [DebugInfo/Verifier] Don't emit error for missing module in indexJonas Devlieghere2018-08-031-1/+2
| | | | | | | | | We don't expect module names to be present in the index. This patch adds DW_TAG_module to the blacklist. Differential revision: https://reviews.llvm.org/D50237 llvm-svn: 338878
* [SystemZ] Improve handling of instructions which expand to several groupsJonas Paulsson2018-08-036-129/+170
| | | | | | | | | | | Some instructions expand to more than one decoder group. This has been hitherto ignored, but is handled with this patch. Review: Ulrich Weigand https://reviews.llvm.org/D50187 llvm-svn: 338849
OpenPOWER on IntegriCloud