summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [LSR] Tweak setup cost depth threshold to 10.Amara Emerson2019-05-101-1/+1
| | | | | | | | | | The original change introduced a depth limit of 7 which caused a 22% regression in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold still ensures that the original test case doesn't hang. rdar://50359639 llvm-svn: 360444
* [MC][ELF] Copy top 3 bits of st_other to .symver aliasesFangrui Song2019-05-101-0/+1
| | | | | | | | | | | | | | | On PowerPC64 ELFv2 ABI, the top 3 bits of st_other encode the local entry offset. A versioned symbol alias created by .symver should copy the bits from the source symbol. This partly fixes PR41048. A full fix needs tracking of .set assignments and updating st_other fields when finish() is called, see D56586. Patch by Alfredo Dal'Ava Júnior Differential Revision: https://reviews.llvm.org/D59436 llvm-svn: 360442
* Adjust MachineScheduler to use ProcResource countsMomchil Velikov2019-05-101-17/+55
| | | | | | | | | | | | | | This fix allows the scheduler to take into account the number of instances of each ProcResource specified. Previously a declaration in a scheduler of ProcResource<1> would be treated identically to a declaration of ProcResource<2>. Now the hazard recognizer would report a hazard only after all of the resource instances are busy. Patch by Jackson Woodruff and Momchil Velikov. Differential Revision: https://reviews.llvm.org/D51160 llvm-svn: 360441
* [X86] Avoid SFB - Fix inconsistent codegen with/without debug info Robert Lougher2019-05-101-2/+4
| | | | | | | | | | | | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=40969 The functions findPotentiallyBlockedCopies and buildCopy are currently not accounting for the presence of debug instructions. In the former this results in the optimization not being trigerred, and in the latter results in inconsistent codegen. This patch enables the optimization to be performed in a debug build and ensures the codegen is consistent with non-debug builds. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61680 llvm-svn: 360436
* [X86][SSE] Add getHopForBuildVector vector splittingSimon Pilgrim2019-05-101-0/+16
| | | | | | | | If we only use the lower xmm of a ymm hop, then extract the xmm's (for free), perform the xmm hop and then insert back into a ymm (for free). Fixes some of the regressions noted in D61782 llvm-svn: 360435
* [InferAddressSpaces] Enhance the handling of cosntexpr.Michael Liao2019-05-101-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Constant expressions may not be added in strict postorder as the forward instruction scan order. Thus, for a constant express (CE0), if its operand (CE1) is used in an previous instruction, they are not in postorder. However, different from `cloneInstructionWithNewAddressSpace`, `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred instructions for later resolving. That results in failure of inferring constant address. - This patch adds the support to infer constant expression operand recursively, since there won't be loop, if that operand is another constant expression. Reviewers: arsenm Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61760 llvm-svn: 360431
* [PowerPC] custom lower `v2f64 fpext v2f32`Lei Huang2019-05-103-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32. eg. For the following IR %0 = load <2 x float>, <2 x float>* %Ptr, align 8 %1 = fpext <2 x float> %0 to <2 x double> ret <2 x double> %1 Pre custom lowering: ld r3, 0(r3) mtvsrd f0, r3 xxswapd vs34, vs0 xscvspdpn f0, vs0 xxsldwi vs1, vs34, vs34, 3 xscvspdpn f1, vs1 xxmrghd vs34, vs0, vs1 After custom lowering: lfd f0, 0(r3) xxmrghw vs0, vs0, vs0 xvcvspdp vs34, vs0 Differential Revision: https://reviews.llvm.org/D57857 llvm-svn: 360429
* SelectionDAG: accommodate atomic floating stores.Tim Northover2019-05-101-1/+4
| | | | | | | We were applying a pointer truncation to floating types, which crashed LLVM. That is Not A Good Thing(TM). llvm-svn: 360421
* [Object] Move ELF specific ObjectFile::getBuildAttributes to ELFObjectFileBaseFangrui Song2019-05-101-4/+2
| | | | | | | Change the return type from std::error_code to Error and make the function protected. llvm-svn: 360416
* [DebugInfo] Use zero linenos for debug intrinsics when promoting dbg.declareJeremy Morse2019-05-101-9/+26
| | | | | | | | | | | | | | | | | | | | In certain circumstances, optimizations pick line numbers from debug intrinsic instructions as the new location for altered instructions. This is problematic because the line number of a debugging intrinsic is meaningless (it doesn't produce any machine instruction), only the scope information is valid. The result can be the line number of a variable declaration "leaking" into real code from debugging intrinsics, making the line table un-necessarily jumpy, and potentially different with / without variable locations. Fix this by using zero line numbers when promoting dbg.declare intrinsics into dbg.values: this is safe for debug intrinsics as their line numbers are meaningless, and reduces the scope for damage / misleading stepping when optimizations pick locations from the wrong place. Differential Revision: https://reviews.llvm.org/D59272 llvm-svn: 360415
* [Object] Change SymbolicFile::printSymbolName to use ErrorFangrui Song2019-05-103-9/+7
| | | | llvm-svn: 360414
* [WebAssembly] Don't assume that strongly defined symbols are DSO-localSam Clegg2019-05-101-3/+3
| | | | | | | | | | | | | | | | The current PIC model for WebAssembly is more like ELF in that it allows symbol interposition. This means that more functions end up being addressed via the GOT and fewer directly added to the wasm table. One effect is a reduction in the number of wasm table entries similar to the previous attempt in https://reviews.llvm.org/D61539 which was reverted. Differential Revision: https://reviews.llvm.org/D61772 llvm-svn: 360402
* [WebAssembly] Remove friend18.C from list of known gcc torture test ↵Sam Clegg2019-05-101-1/+0
| | | | | | | | failures. NFC. Differential Revision: https://reviews.llvm.org/D61775 llvm-svn: 360401
* [llvm] X86DiscriminateMemOps: insert debug info when missingMircea Trofin2019-05-101-2/+3
| | | | | | | | | | | | | | Reviewers: davidxl Reviewed By: davidxl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61735 llvm-svn: 360396
* [AMDGPU] Pattern for v_xor3_b32Stanislav Mekhanoshin2019-05-101-1/+4
| | | | | | | | | This also allows three op patterns to use increased constant bus limit of GFX10. Differential Revision: https://reviews.llvm.org/D61763 llvm-svn: 360395
* [X86] Improve lowering of idemptotent RMW operationsPhilip Reames2019-05-091-1/+88
| | | | | | | | The current lowering uses an mfence. mfences are substaintially higher latency than the locked operations originally requested, but we do want to avoid contention on the original cache line. As such, use a locked instruction on a cache line assumed to be thread local. Differential Revision: https://reviews.llvm.org/D58632 llvm-svn: 360393
* [JITLink] Fixed a signedness bug when processing X86_64_RELOC_SUBTRACTOR.Lang Hames2019-05-091-2/+2
| | | | | | | | Subtractor relocation addends are signed, so we need to read them via signed int pointers. Accidentally treating 32-bit addends as unsigned leads to out-of-range errors when we try to add very large (>INT32_MAX) bogus addends. llvm-svn: 360392
* Compile time tweak for libcall lookupPhilip Reames2019-05-091-0/+5
| | | | | | | | If we have a large module which is mostly intrinsics, we hammer the lib call lookup path from CodeGenPrepare. Adding a fastpath reduces compile by 15% for one such example. The problem is really more general than intrinsics - a module with lots of non-intrinsics non-libcall calls has the same problem - but we might as well avoid an easy case quickly. llvm-svn: 360391
* [ORC] Simplify logic for updating edges when should-discard atoms are pruned.Lang Hames2019-05-091-16/+4
| | | | llvm-svn: 360384
* [JITLink] Improve/fix some JITLink debugging output.Lang Hames2019-05-092-9/+11
| | | | | | | | Adds full edge details (rather than just edge targets) when out-of-range errors are generated. Also fixes a bug where debugging output accessed an invalidated DenseMap iterator by moving the debugging output above the invalidation point. llvm-svn: 360383
* [ORC] Fix a formatting bug.Lang Hames2019-05-091-1/+1
| | | | llvm-svn: 360382
* Add ".dword" directiveBill Wendling2019-05-091-3/+5
| | | | | | | | | | | | | | | | | | Summary: The ".dword" directive is a synonym for ".xword" and is used used by klibc, a minimalistic libc subset for initramfs. Reviewers: t.p.northover, nickdesaulniers Reviewed By: nickdesaulniers Subscribers: nickdesaulniers, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61719 llvm-svn: 360381
* DebugInfo/DWARF: Minor expression simplificationDavid Blaikie2019-05-091-1/+1
| | | | llvm-svn: 360377
* [CodeGen] Add comment about FSUB <-> FNEG xformsCameron McInally2019-05-091-0/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D61741 llvm-svn: 360366
* [AMDGPU] gfx1010 v_interp_* instructionsStanislav Mekhanoshin2019-05-091-6/+11
| | | | | | Differential Revision: https://reviews.llvm.org/D61703 llvm-svn: 360364
* [X86][SSE] Fold add(shuffle(),shuffle()) to hadd on 'slow' targets (PR39920)Simon Pilgrim2019-05-091-2/+3
| | | | | | | | | | As reported on PR39920, "slow horizontal ops" targets tend to internally expand to 2*shuffle+add/sub - so if we can reduce 2*shuffle+add/sub to a hadd/sub then we should do it - similar port usage but reduced instruction count. This works out in most cases, although the "PR22377" regression in vector-shuffle-combining.ll is annoying - going from 2*shuffle+add+shuffle to hadd+2*shuffle - I've opened PR41813 to cover this. Differential Revision: https://reviews.llvm.org/D61308 llvm-svn: 360360
* [DAGCombiner] Limit number of nodes explored as store candidates.Florian Hahn2019-05-091-2/+5
| | | | | | | | | | | | | | To find the candidates to merge stores we iterate over all nodes in a chain for each store, which leads to quadratic compile times for large basic blocks with a large number of stores. Reviewers: niravd, spatel, craig.topper Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D61511 llvm-svn: 360357
* [AMDGPU] gfx1010 changes for PAL metadataStanislav Mekhanoshin2019-05-091-2/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D61704 llvm-svn: 360353
* MinidumpYAML: add support for the ThreadList streamPavel Labath2019-05-091-30/+96
| | | | | | | | | | | | | | | | | | | Summary: The implementation is a pretty straightforward extension of the pattern used for (de)serializing the ModuleList stream. Since there are other streams which use the same format (MemoryList and MemoryList64, at least). I tried to generalize the code a bit so that adding future streams of this type can be done with less code. Reviewers: amccarth, jhenderson, clayborg Subscribers: markmentovai, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61423 llvm-svn: 360350
* [CodeGenPrepare] Limit recursion depth for collectBitPartsDavid Stuttard2019-05-091-7/+17
| | | | | | | | | | | | | | | | Summary: Seeing some issues for windows debug pathological cases with collectBitParts recursion (1525 levels of recursion!) Setting the limit to 64 as this should be sufficient - passes all lit cases Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61728 Change-Id: I7f44cdc6c1badf1c2ccbf1b0c4b6afe27ecb39a1 llvm-svn: 360347
* [X86] AMD Piledriver (BdVer2): major cleanup (mainly inverse throughput)Roman Lebedev2019-05-091-209/+303
| | | | | | | | | | | | | | | | I've started this cleanup more several times now, but got sidetracked elsewhere, e.g. by llvm-exegesis problems. Not this time, finally! This is mainly cleaning up the inverse throughput values, and a few latencies/uops, based on the llvm-exegesis measured values. Though this is not complete by any means, there's certainly more cleanup to be done. The performance numbers (i've only checked by RawSpeed benchmark) aren't really surprising - overall this *slightly* (< -1%) improves perf. llvm-svn: 360341
* [ARM][CGP] Guard against signext args and sitofpSam Parker2019-05-091-10/+12
| | | | | | | | | | Add an Argument that has the SExtAttr attached, as well as SIToFP instructions, as values that generate sign bits. SIToFP doesn't strictly do this and could be treated as a sink to be sign-extended. Differential Revision: https://reviews.llvm.org/D61381 llvm-svn: 360331
* [CodeGenPrepare] Ensure we get a non-null result from getTrueOrFalseValue. NFCI.Simon Pilgrim2019-05-091-1/+3
| | | | llvm-svn: 360328
* Fix LLVM_USE_PERF build after getPageSize changeSven van Haastregt2019-05-091-3/+3
| | | | | | | | Commit r360221 ("[Support] Add error handling to sys::Process::getPageSize().", 2019-05-08) seems to have missed these uses of getPageSize(). Update them to getPageSizeEstimate(). llvm-svn: 360322
* [ARM GlobalISel] Map DBG_VALUE for types != s32Diana Picus2019-05-091-2/+8
| | | | | | | | ...and make sure we fail elegantly for unsupported values. s64 goes into DPR, anything <= 32 into GPR. llvm-svn: 360321
* X86WinAllocaExpander: Drop code looking through register copies (PR41786)Hans Wennborg2019-05-091-16/+4
| | | | | | | | | | | | | This code was never covered by tests, in PR41786 it was pointed out that the deletion part doesn't work, and in a full Chrome build I was never able to hit the code path that looks through copies. It seems the situation it's supposed to handle doesn't actually come up in practice. Delete it to simplify the code. Differential revision: https://reviews.llvm.org/D61671 llvm-svn: 360320
* Make sub-registers index names case sensitive in the MIRParserMarkus Lavin2019-05-091-1/+1
| | | | | | | | | | | | | | | | | | | Prior to this change sub-register index names are assumed to be lower case (but they are printed with original casing). This means that if a target has some upper case characters in its sub-register names then mir-export directly followed by mir-import is not possible. This also means that sub-register indices currently are (and will continue to be) slightly inconsistent with register names which are printed and assumed to be lower case. As the current textual representation of mir has a few inconsistencies in this area it is a bit arbitrary how to address the matter. This change is towards the direction that we feel is most correct (i.e. case sensitivity). Differential Revision: https://reviews.llvm.org/D61499 llvm-svn: 360318
* Bugfix for nullptr check by klocworkPengfei Wang2019-05-091-1/+2
| | | | | | | | | | Klocwork static check: Pointer from call to function `DebugLoc::operator DILocation *() const ` may be NULL and will be dereference in function `printExtendedName``` Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D61715 llvm-svn: 360317
* [CodeGen] Use "DL.getPointerSizeInBits" instead of "8 * DL.getPointerSize". NFCBjorn Pettersson2019-05-091-1/+1
| | | | llvm-svn: 360315
* [NewPM] Setup Passes for KASan and KMSanPetr Hosek2019-05-091-1/+4
| | | | | | | | | | While ASan and MSan passes were already ported to new PM, the kernel variants weren't setup in the pipeline which makes the KASan and KMSan tests in Clang fail. Differential Revision: https://reviews.llvm.org/D61664 llvm-svn: 360313
* [SelectionDAG] Expand ADD/SUBCARRYLeonard Chan2019-05-091-0/+42
| | | | | | | | This patch allows for expansion of ADDCARRY and SUBCARRY when the target does not support it. Differential Revision: https://reviews.llvm.org/D61411 llvm-svn: 360303
* Temporarily Revert "[DebugInfo] Terminate more location-list ranges at the ↵Eric Christopher2019-05-082-82/+20
| | | | | | | | | | end of blocks" as it was causing significant compile time regressions. This reverts commit r359426 while we come up with testcases and additional ideas. llvm-svn: 360301
* [SelectionDAG] fold 'fneg undef' to undefSanjay Patel2019-05-081-0/+4
| | | | | | | | | | | | | | | | | This is extracted from the original draft of D61419 with some additional tests. We don't currently get this in IR (it's conservatively turned into a NaN), but presumably that'll get updated as we add real IR support for 'fneg' rather than 'fsub -0.0, x'. The x86-32 run shows the following, and I haven't looked further to see why, but that seems to be independent: Legalizing: t1: f32 = undef Trying to expand node Creating fp constant: t4: f32 = ConstantFP<0.000000e+00> Differential Revision: https://reviews.llvm.org/D61516 llvm-svn: 360296
* AMDGPU: Mark scheduler classes as finalMatt Arsenault2019-05-081-2/+2
| | | | llvm-svn: 360294
* AMDGPU: Select VOP3 form of addMatt Arsenault2019-05-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The VOP3 form should always be the preferred selection, to be shrunk later. This should only be an optimization issue, but this partially works around a problem from clobbering VCC when SIFixSGPRCopies rewrites an SCC defining operation directly to VCC. 3 of the testcases are regressions from failing to fold the immediate in cases it should. These can be avoided by improving the VCC liveness handling in SIFoldOperands. Simply increasing the threshold to computeRegisterLiveness works, although this is common enough that VCC liveness should probably be tracked throughout the pass. The hack of leaving behind an implicit_def instruction to avoid breaking iterator wastes instruction count, which inhibits finding the VCC def in long chains of adds. Doing this however exposes different, worse looking regressions from poor scheduling behavior. This could probably be avoided around by forcing the shrink of the addc here, but the scheduler should probably be fixed. The r600 add test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. llvm-svn: 360293
* [FileCheck] Fix code style of method commentsThomas Preud'homme2019-05-081-47/+3
| | | | | | | | | | | | | | | | | | | | | Summary: Fix various issues in code style of method comments: 1) Move all heading comments to all non-static methods near their declaration in the FileCheck.h header file. 2) Harmonize the action verb in doxygen comments for methods to always be in third person 3) Use \returns instead of free text "return" and "returns". 4) Document a couple more parameters while at it. Reviewers: jhenderson, probinson, arichardson Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61445 llvm-svn: 360288
* [AMDGPU] gfx1010 exp modificationsStanislav Mekhanoshin2019-05-083-2/+17
| | | | | | Differential Revision: https://reviews.llvm.org/D61701 llvm-svn: 360287
* [InstCombine] When turning sext into zext due to known bits, return the new ↵Craig Topper2019-05-081-4/+2
| | | | | | | | | | | | ZExt instead of calling replaceinstuseswith The worklist loop that we're returning back to should be able to do the repacement itself. This is how we normally do replacements. My main motivation was that I observed that we weren't preserving the name of the result when we do this transform. The replacement code in the worklist loop will call takeName as part of the replacement. Differential Revision: https://reviews.llvm.org/D61695 llvm-svn: 360284
* AMDGPU: Fix a mis-placed bracketChangpeng Fang2019-05-081-1/+1
| | | | | | | Differential Revision: https://reviews.llvm.org/D61430 llvm-svn: 360283
* [SCEV] Suppress hoisting insertion point of binops when unsafeWarren Ristow2019-05-081-18/+27
| | | | | | | | | | | | | | InsertBinop tries to move insertion-points out of loops for expressions that are loop-invariant. This patch adds a new parameter, IsSafeToHost, to guard that hoisting. This allows callers to suppress that hoisting for unsafe situations, such as divisions that may have a zero denominator. This fixes PR38697. Differential Revision: https://reviews.llvm.org/D55232 llvm-svn: 360280
OpenPOWER on IntegriCloud