path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
...
* [WebAssembly] Replace SIMD expression types with V128 (Derek Schuff, 2018-08-06, 3 files, -23/+13)
  Summary: The spec only defines a SIMD expression type of V128 and leaves interpretation of different vector types to the instructions.
  Differential Revision: https://reviews.llvm.org/D50367
  Patch by Thomas Lively
  llvm-svn: 339082
* AMDGPU: cvt_pk_rtz_f16 canonicalizes (Matt Arsenault, 2018-08-06, 2 files, -1/+15)
  llvm-svn: 339078
* AMDGPU: Handle some vector operations in isCanonicalized (Matt Arsenault, 2018-08-06, 1 file, -0/+20)
  llvm-svn: 339077
* AMDGPU: Push fcanonicalize through partially constant build_vector (Matt Arsenault, 2018-08-06, 1 file, -1/+37)
  This usually avoids some re-packing code, and may help find canonical sources.
  llvm-svn: 339072
* AMDGPU: Refactor fcanonicalize combine (Matt Arsenault, 2018-08-06, 2 files, -36/+32)
  This will make more complex combines easier.
  llvm-svn: 339070
* [LICM] Extract a helper function for readability [NFC] (Philip Reames, 2018-08-06, 1 file, -8/+12)
  llvm-svn: 339069
* MC: Redirect .addrsig directives referring to private (.L) symbols to the section symbol. (Peter Collingbourne, 2018-08-06, 1 file, -0/+2)
  This matches our behaviour for regular (i.e. relocated) references to private symbols and therefore avoids needing to unnecessarily write address-significant .L symbols to the object file's symbol table, which can interfere with stack traces.
  Fixes check-cfi after r339050.
  llvm-svn: 339066
* AMDGPU: Treat more custom operations as canonicalizing (Matt Arsenault, 2018-08-06, 2 files, -2/+21)
  Everything should quiet, and I think everything should flush. I assume the min3/med3/max3 follow the same rules as regular min/max for flushing, which should at least be conservatively correct.
  There are still more operations that need to be handled.
  llvm-svn: 339065
* AMDGPU: Conversions always produce canonical results (Matt Arsenault, 2018-08-06, 1 file, -7/+2)
  Not sure why this was checking for denormals for f16. My interpretation of the IEEE standard is conversions should produce a canonical result, and the ISA manual says denormals are created when appropriate.
  llvm-svn: 339064
* AMDGPU: Fix implementation of isCanonicalized (Matt Arsenault, 2018-08-06, 2 files, -46/+77)
  If denormals are enabled, denormals are canonical. Also fix a few other issues. minnum/maxnum are supposed to canonicalize. Temporarily improve workaround for the instruction behavior change in gfx9. Handle selects and fcopysign.
  The tests were also largely broken, since they were checking for a flush used on some targets after the store of the result.
  llvm-svn: 339061
* Fix a -Wsign-compare (Reid Kleckner, 2018-08-06, 1 file, -1/+1)
  llvm-svn: 339059
* [X86] Fix assertion in subreg extraction (Reid Kleckner, 2018-08-06, 1 file, -1/+1)
  This assert fires when attempting to extract a subregister from the global PIC base register. This virtual register SD node is not in the VRBaseMap, so we shouldn't call getVR to look it up there. If this is a RegisterSDNode, we should be able to use the virtual register directly.
  Fixes PR38385
  llvm-svn: 339056
* [SLC] Fix shrinking of pow() (Evandro Menezes, 2018-08-06, 1 file, -13/+17)
  Properly shrink `pow()` to `powf()` as a binary function and, when no other simplification applies, do not discard it.
  Differential revision: https://reviews.llvm.org/D50113
  llvm-svn: 339046
* [llvm-pdbutil] Support PDBs without a DBI stream (Alexandre Ganea, 2018-08-06, 1 file, -1/+3)
  Differential Revision: https://reviews.llvm.org/D50258
  llvm-svn: 339045
* [X86] Recognize a splat of negate in isFNEG (Easwaran Raman, 2018-08-06, 1 file, -18/+77)
  Summary: Expand isFNEG so that we generate the appropriate F(N)M(ADD|SUB) instructions in more cases. For example, the following sequence
      a = _mm256_broadcast_ss(f)
      d = _mm256_fnmadd_ps(a, b, c)
  generates an fsub and fma without this patch and an fnma with this change.
  Reviewers: craig.topper
  Subscribers: llvm-commits, davidxl, wmi
  Differential Revision: https://reviews.llvm.org/D48467
  llvm-svn: 339043
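The identity behind this change can be sketched in plain C++. This is only an illustration under stated assumptions, not the X86 backend code: `Vec4`, `splat`, `fmadd` and `fnmadd` are hypothetical helpers standing in for the vector type and instructions. It shows that a fused multiply-add whose first operand is a broadcast of a negated scalar computes the same lanes as a fused negated multiply-add on the broadcast of the original scalar.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec4 = std::array<float, 4>; // hypothetical 4-lane vector type

// Broadcast one scalar to every lane.
Vec4 splat(float f) { return {f, f, f, f}; }

// Lane-wise fused multiply-add: a * b + c with a single rounding.
Vec4 fmadd(const Vec4 &a, const Vec4 &b, const Vec4 &c) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = std::fma(a[i], b[i], c[i]);
  return r;
}

// Lane-wise fused negated multiply-add: -(a * b) + c.
Vec4 fnmadd(const Vec4 &a, const Vec4 &b, const Vec4 &c) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = std::fma(-a[i], b[i], c[i]);
  return r;
}
```

With these helpers, `fmadd(splat(-f), b, c)` and `fnmadd(splat(f), b, c)` agree lane by lane, which is why recognizing the negate through the splat lets a single instruction be selected.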
* [X86] When using "and $0" and "orl $-1" to store 0 and -1 for minsize, make sure the store isn't volatile (Craig Topper, 2018-08-06, 1 file, -6/+12)
  If the store is volatile this might be a memory mapped IO access. In that case we shouldn't generate a load that didn't exist in the source.
  Differential Revision: https://reviews.llvm.org/D50270
  llvm-svn: 339041
* [RegisterCoalescer] Delay live interval update work until the rematerialization for all the uses from the same def is done. (Wei Mi, 2018-08-06, 1 file, -6/+57)
  We run into a compile time problem with flex generated code combined with `-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants outside of a big loop, and drastically increases the compile time in global register splitting and copy coalescing. https://reviews.llvm.org/D49353 relieves the problem in global splitting. This patch is to handle the problem in copy coalescing.
  The problem in copy coalescing arises as follows. After machineLICM, we have several defs outside of a big loop with hundreds or thousands of uses inside the loop. Rematerialization in copy coalescing happens for each use, and every time rematerialization is done, shrinkToUses will be called to update the huge live interval. Because we have 'n' uses for a def, and each live interval update will have at least 'n' complexity, the total update work is n^2.
  To fix the problem, we try to do the live interval update work in a collective way. If a def has more copylike uses than a threshold, each time rematerialization is done for one of those uses, we won't do the live interval update immediately but delay that work until rematerialization for all those uses is completed, so we only have to do the live interval update work once.
  Delaying the live interval update could potentially change the copy coalescing result, so we hope to limit that change to those defs with many (like above a hundred) copylike uses, and the cutoff can be adjusted by the option -mllvm -late-remat-update-threshold=xxx.
  Differential Revision: https://reviews.llvm.org/D49519
  llvm-svn: 339035
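The n^2 argument in this message can be made concrete with a toy cost model. This is a sketch, not RegisterCoalescer code: the function names are hypothetical, and the model simply assumes that one live-interval update for a def with `uses` uses costs `uses` units of work.

```cpp
#include <cassert>
#include <cstdint>

// Eager strategy: one full interval update after every rematerialized use.
std::uint64_t eagerUpdateWork(std::uint64_t uses) {
  assert(uses <= 0xFFFFFFFFull && "keeps uses * uses within 64 bits");
  std::uint64_t work = 0;
  for (std::uint64_t i = 0; i < uses; ++i)
    work += uses; // each update walks the whole interval
  return work;    // uses * uses in total
}

// Delayed strategy: a def with more than `threshold` copy-like uses gets a
// single update once all of its uses have been rematerialized.
std::uint64_t delayedUpdateWork(std::uint64_t uses, std::uint64_t threshold) {
  if (uses <= threshold)
    return eagerUpdateWork(uses); // small defs keep the eager behaviour
  return uses;                    // one update at the end
}
```

Under this model a def with 1000 uses costs 1,000,000 units eagerly but 1000 units when delayed with a threshold of 100, while a def with 50 uses is unaffected.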
* Fix raw_fd_ostream::write_impl hang due to an infinite loop with large output (Owen Reynolds, 2018-08-06, 1 file, -4/+4)
  On Windows, when raw_fd_ostream::write_impl calls write, a 32-bit input is required for the character count. As a variable of type size_t is used for this argument, integral demotion occurs on x64. In the case of large files an infinite loop follows.
  See: https://bugs.llvm.org/show_bug.cgi?id=37926
  This fix allows the output of files larger than the previous int32 limit.
  Differential Revision: https://reviews.llvm.org/D48948
  llvm-svn: 339027
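A small sketch of the narrowing problem and of one way around it, assuming a write call that takes a 32-bit count. The helper names are hypothetical and this is not the code from the patch: `truncatedCount` shows what a 32-bit parameter receives from a 64-bit size, and `chunkCounts` splits a large size into per-call counts that always fit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// What a 32-bit count parameter receives when handed a 64-bit size.
std::uint32_t truncatedCount(std::uint64_t size) {
  return static_cast<std::uint32_t>(size);
}

// Split `size` bytes into per-call counts that never exceed `maxChunk`,
// so no count has to be narrowed.
std::vector<std::uint32_t> chunkCounts(std::uint64_t size,
                                       std::uint32_t maxChunk) {
  assert(maxChunk > 0 && "a zero chunk would never make progress");
  std::vector<std::uint32_t> counts;
  while (size > 0) {
    std::uint32_t n =
        size < maxChunk ? static_cast<std::uint32_t>(size) : maxChunk;
    counts.push_back(n);
    size -= n;
  }
  return counts;
}
```

A size of exactly 2^32 narrows to 0, so a loop that keeps passing the remaining size to such a call may never advance; capping each request avoids that.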
* AMDGPU: Fold v_lshl_or_b32 with 0 src0 (Matt Arsenault, 2018-08-06, 1 file, -0/+13)
  Appears from expansion of some packed cases.
  llvm-svn: 339025
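Why a zero src0 makes the instruction foldable can be sketched as follows. The semantics used here, (src0 << (src1 & 31)) | src2, are an assumption about v_lshl_or_b32 rather than something stated in the commit, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed semantics of v_lshl_or_b32: shift src0 left, then OR in src2.
std::uint32_t lshlOr(std::uint32_t src0, std::uint32_t src1,
                     std::uint32_t src2) {
  return (src0 << (src1 & 31u)) | src2;
}

// The fold: a zero src0 stays zero after any shift, leaving only src2.
std::uint32_t lshlOrFolded(std::uint32_t src0, std::uint32_t src1,
                           std::uint32_t src2) {
  assert(src0 == 0 && "the fold only applies to a zero src0");
  (void)src1; // the shift amount no longer matters
  return src2;
}
```

For any shift amount, `lshlOr(0, s, x)` equals `x`, so the whole operation can be replaced by its third operand.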
* ValueTracking: Handle canonicalize in CannotBeNegativeZero (Matt Arsenault, 2018-08-06, 1 file, -0/+1)
  Also fix apparently missing test coverage for any of the handling here.
  llvm-svn: 339023
* [NFC] Fixed unused function warnings (David Bolvansky, 2018-08-06, 1 file, -0/+2)
  llvm-svn: 339021
* Revert unused function fix (David Bolvansky, 2018-08-06, 1 file, -1/+1)
  llvm-svn: 339020
* [NFC] Fixed unused function warning (David Bolvansky, 2018-08-06, 1 file, -1/+1)
  llvm-svn: 339019
* [AArch64] Fix assertion failure on widened f16 BUILD_VECTOR (Bryan Chan, 2018-08-06, 1 file, -0/+9)
  Summary: Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the same type. This fixes an assertion failure in VerifySDNode.
  Reviewers: SjoerdMeijer, t.p.northover, javed.absar
  Reviewed By: SjoerdMeijer
  Subscribers: kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D50202
  llvm-svn: 339013
* Fix modules build with different technique to suppress Knuth debugging (Tim Northover, 2018-08-06, 1 file, -37/+33)
  Currently we use #pragma push_macro(LLVM_DEBUG) to fiddle with the LLVM_DEBUG macro so that we can silence debugging the Knuth division algorithm unless it's actually desired. Unfortunately this is incompatible with enabling modules while building LLVM (via LLVM_ENABLE_MODULES=ON), probably due to a bug being fixed by D33004.
  llvm-svn: 339009
* ARM-MachO: don't add Thumb bit for addend to non-external relocation. (Tim Northover, 2018-08-06, 1 file, -0/+1)
  ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes out that part of any addend in an object file. But it only does that for symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting that extra bit in other cases.
  llvm-svn: 339007
* Re-enable "[ValueTracking] Teach isKnownNonNullFromDominatingCondition about AND" (Max Kazantsev, 2018-08-06, 1 file, -10/+33)
  The patch was reverted because of a bug detected by the sanitizer. The bug is fixed and the respective tests are added.
  Differential Revision: https://reviews.llvm.org/D50172
  llvm-svn: 339005
* Revert rL338990 to see if it causes sanitizer failures (Max Kazantsev, 2018-08-06, 1 file, -28/+10)
  Multiple failures reported by sanitizer-x86_64-linux seem to be caused by this patch. Reverting to see if they persist without it.
  Differential Revision: https://reviews.llvm.org/D50172
  llvm-svn: 338994
* Try to fix buildbot (Max Kazantsev, 2018-08-06, 1 file, -1/+1)
  llvm-svn: 338991
* [ValueTracking] Teach isKnownNonNullFromDominatingCondition about AND (Max Kazantsev, 2018-08-06, 1 file, -10/+28)
  `isKnownNonNullFromDominatingCondition` is able to prove non-null based on a `br` or `guard` with a `%p != null` condition, but is unable to do so based on `(%p != null) && %other_cond`. This patch allows it to do so.
  Differential Revision: https://reviews.llvm.org/D50172
  Reviewed By: reames
  llvm-svn: 338990
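The reasoning the analysis performs has a direct source-level analogue, sketched here in C++ with a hypothetical function rather than LLVM IR. Inside the branch taken when `(p != nullptr) && otherCond` holds, both operands of the AND are true, so `p` is known non-null there.

```cpp
#include <cassert>

// Returns *p when both conditions hold, otherwise `fallback`.
int loadIfBoth(const int *p, bool otherCond, int fallback) {
  if (p != nullptr && otherCond) {
    // The AND was true, so each operand was true: p cannot be null here.
    assert(p != nullptr);
    return *p;
  }
  return fallback;
}
```

The same fact does not hold on the false branch, where either operand may have failed.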
* [GuardWidening] Widen guards with conditions of frequently taken dominated branches (Max Kazantsev, 2018-08-06, 1 file, -34/+98)
  If there is a frequently taken branch dominated by a guard, and its condition is available at the point of the guard, we can widen the guard with the condition of this branch and convert the branch into an unconditional one:
      guard(cond1)
      if (cond2) {
        // taken in 99.9% cases
        // do something
      } else {
        // do something else
      }
  Converts to
      guard(cond1 && cond2)
      // do something
  Differential Revision: https://reviews.llvm.org/D49974
  Reviewed By: reames
  llvm-svn: 338988
* [NFC] Fix typo (Xin Tong, 2018-08-06, 1 file, -1/+1)
  llvm-svn: 338987
* [NFC] Fixed unused function warning (David Bolvansky, 2018-08-06, 1 file, -0/+2)
  llvm-svn: 338986
* [DebugInfo] Refactor DbgInfoIntrinsic class hierarchy. (Hsiangkai Wang, 2018-08-06, 10 files, -61/+56)
  DbgInfoIntrinsic used to assume that all of these intrinsics have variables and expressions attached to them. That assumption is too strong for the class to serve as a base for other debug entities, and it now causes problems for debug labels. To make DbgInfoIntrinsic a base class for 'debug info', I create a class for 'variable debug info', DbgVariableIntrinsic. DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it.
  Differential Revision: https://reviews.llvm.org/D50220
  llvm-svn: 338984
* [ORC] Remove an incorrect use of 'cantFail'. (Lang Hames, 2018-08-05, 1 file, -2/+4)
  This code was moved out from BasicObjectLayerMaterializationUnit, which required the supplied object to be well formed. The getObjectSymbolFlags function does not require a well-formed object, so we have to propagate the error here.
  llvm-svn: 338975
* [ORC] Change JITSymbolFlags debug output, add a function for getting a symbol flags map from a buffer representing an object file. (Lang Hames, 2018-08-05, 2 files, -30/+39)
  llvm-svn: 338974
* Enrich inline messages (David Bolvansky, 2018-08-05, 5 files, -111/+152)
  Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions.
  1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
  2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
  3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
  4. Adjusted tests for changed printing.
  Patch by: yrouban (Yevgeny Rouban)
  Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
  Reviewed By: tejohnson, xbolva00
  Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
  Differential Revision: https://reviews.llvm.org/D49412
  llvm-svn: 338969
* Revert "Add a warning if someone attempts to add extra section flags to sections" (Eric Christopher, 2018-08-05, 1 file, -36/+16)
  There are a bunch of edge cases and inconsistencies in how we're emitting sections that cause this warning to fire, and it needs more work.
  This reverts commit r335558.
  llvm-svn: 338968
* [TailCallElim] Preserve DT and PDT (Chijun Sima, 2018-08-04, 2 files, -27/+55)
  Summary: Previously, in the NewPM pipeline, TailCallElim recalculates the DomTree when it modifies any instruction in the Function. For example,
  ```
  CallInst *CI = dyn_cast<CallInst>(&I);
  ...
  CI->setTailCall();
  Modified = true;
  ...
  if (!Modified || ...)
    return PreservedAnalyses::all();
  ```
  After applying this patch, the DomTree only recalculates if needed (plus an extra insertEdge() + an extra deleteEdge() call).
  When optimizing SQLite with the `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculations decreases by 6.2% and the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree will decrease by approximately 1% to 2.5% after applying the patch.
  Statistics:
  ```
  Before the patch:
   23010 dom-tree-stats - Number of DomTree recalculations
  489264 dom-tree-stats - Number of nodes visited by DFS -- DomTree
  After the patch:
   21581 dom-tree-stats - Number of DomTree recalculations
  475088 dom-tree-stats - Number of nodes visited by DFS -- DomTree
  ```
  Reviewers: kuhar, dmgreen, brzycki, grosser, davide
  Reviewed By: kuhar, brzycki
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D49982
  llvm-svn: 338954
* [ADCE] Remove the need of DomTree (Chijun Sima, 2018-08-04, 1 file, -8/+10)
  Summary: ADCE doesn't need to query domtree.
  Reviewers: kuhar, brzycki, dmgreen, davide, grosser
  Reviewed By: kuhar
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D49988
  llvm-svn: 338950
* Reverted r338825 and all the following tries to fix issues introduced by that commit (r338826, r338827, r338829, r338880). (Galina Kistanova, 2018-08-04, 2 files, -380/+0)
  This commit has broken build bots and has been left unattended for too long.
  llvm-svn: 338948
* [GISel]: Add Opcodes for CTLZ/CTTZ/CTPOP (Aditya Nandakumar, 2018-08-04, 1 file, -0/+20)
  https://reviews.llvm.org/D48600
  Added IRTranslator support to translate these known intrinsics into GISel opcodes.
  llvm-svn: 338944
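For readers unfamiliar with the three operations, here is a reference sketch of what count-leading-zeros, count-trailing-zeros and population count compute on a 32-bit value. The function names are hypothetical, and this version returns 32 for a zero input, which is one possible convention and not necessarily what every lowering of the intrinsics guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Population count: number of set bits.
unsigned ctpop32(std::uint32_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1) // clears the lowest set bit each round
    ++n;
  return n;
}

// Count leading zeros: zero bits above the highest set bit.
unsigned ctlz32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Count trailing zeros: zero bits below the lowest set bit.
unsigned cttz32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t bit = 1u; bit != 0 && (x & bit) == 0; bit <<= 1)
    ++n;
  return n;
}
```

These loops favour clarity over speed; real targets typically map the opcodes to single instructions or short sequences.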
* Fix buildbot breakage. (Rui Ueyama, 2018-08-04, 1 file, -2/+1)
  llvm-svn: 338940
* Use the same constants as zlib to represent compression level. (Rui Ueyama, 2018-08-04, 1 file, -17/+4)
  This change allows users to pass a compression level that was not listed in the enum. Also, I think using different values than zlib's compression levels was just confusing.
  Differential Revision: https://reviews.llvm.org/D50196
  llvm-svn: 338939
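The benefit of plain integer levels over a closed enum can be sketched like this. The constant names and `isValidZlibLevel` are hypothetical; the values are the ones zlib documents for Z_NO_COMPRESSION, Z_BEST_SPEED, Z_BEST_COMPRESSION and Z_DEFAULT_COMPRESSION, and are not taken from the patch itself.

```cpp
#include <cassert>

// Hypothetical names carrying zlib's documented level values.
constexpr int NoCompression = 0;       // Z_NO_COMPRESSION
constexpr int BestSpeed = 1;           // Z_BEST_SPEED
constexpr int BestCompression = 9;     // Z_BEST_COMPRESSION
constexpr int DefaultCompression = -1; // Z_DEFAULT_COMPRESSION

// With an int, every level zlib accepts is expressible, including the
// intermediate ones that a four-entry enum could not name.
bool isValidZlibLevel(int level) {
  return level == DefaultCompression ||
         (level >= NoCompression && level <= BestCompression);
}
```

A caller can then request, say, level 5 directly instead of being limited to the named presets.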
* [X86] Add isel patterns for atomic_load+sub+atomic_sub. (Craig Topper, 2018-08-03, 1 file, -2/+1)
  Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register.
  llvm-svn: 338930
* [X86] Remove RELEASE_ and ACQUIRE_ pseudo instructions. Use isel patterns and the normal instructions instead (Craig Topper, 2018-08-03, 2 files, -170/+73)
  At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering.
  So from what I can tell we shouldn't need additional pseudos since they don't carry any flags that are different from the normal instructions. The only thing I can think of is that we may consider them for load folding candidates in the peephole pass now where we didn't before. If that's important hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions.
  Differential Revision: https://reviews.llvm.org/D50212
  llvm-svn: 338925
* [TRE][DebugInfo] Preserve Debug Location in new branch instruction (Anastasis Grammenos, 2018-08-03, 1 file, -1/+2)
  There are two branch instructions created so the new test covers them both.
  Differential Revision: https://reviews.llvm.org/D50263
  llvm-svn: 338917
* [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store. (Craig Topper, 2018-08-03, 1 file, -11/+28)
  The mask operand is visited before the data operand so we need to be able to widen it.
  Fixes PR38436.
  llvm-svn: 338915
* [Support] Don't initialize compressed buffer allocated by zlib::compress (Fangrui Song, 2018-08-03, 1 file, -2/+2)
  resize() (zeroing) makes every allocated page resident. The actual size of the compressed buffer is usually much smaller. Making every page resident is wasteful.
  When linking a test binary with ~1.9GiB uncompressed debug info with LLD, this optimization decreases max RSS by ~1.5GiB.
  Differential Revision: https://reviews.llvm.org/50223
  llvm-svn: 338913
* DAG: Enhance isKnownNeverNaN (Matt Arsenault, 2018-08-03, 5 files, -17/+192)
  Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits.
  llvm-svn: 338910
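The distinction between "never NaN" and "never signaling NaN" comes down to one bit of the encoding. The sketch below uses hypothetical helper names and assumes the common IEEE 754-2008 binary32 layout in which a set top mantissa bit marks a quiet NaN; some older architectures use the opposite convention. It works on raw bit patterns so that no floating-point operation can quiet the value along the way.

```cpp
#include <cassert>
#include <cstdint>

// NaN: exponent all ones and a non-zero mantissa.
bool isNaNBits(std::uint32_t bits) {
  return (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0;
}

// Signaling NaN: a NaN whose quiet bit (top mantissa bit) is clear.
bool isSignalingNaNBits(std::uint32_t bits) {
  return isNaNBits(bits) && (bits & 0x00400000u) == 0;
}
```

A value can therefore be a NaN yet pass a check that only rules out signaling NaNs, which is the weaker property the new parameter lets callers ask for.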