summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [Unroll] Add an option to control complete unrollingSerguei Katkov2019-09-193-9/+28
| | | | | | | | | | | | Add an ability to specify the max full unroll count for LoopUnrollPass pass in pass options. Reviewers: fhahn, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D67701 llvm-svn: 372305
* [X86] Prevent crash in LowerBUILD_VECTORvXi1 for v64i1 vectors on 32-bit ↵Craig Topper2019-09-191-6/+14
| | | | | | | | | targets when the vector is a mix of constants and non-constant. We need to materialize the constants as two 32-bit values that are casted to v32i1 and then concatenated. llvm-svn: 372304
* [X86] Change a SmallVector& argument to SmallVectorImpl&. NFCCraig Topper2019-09-191-1/+1
| | | | | | Avoids repeating the size. llvm-svn: 372302
* [X86] Remove unused argument from a helper function. NFCCraig Topper2019-09-191-4/+3
| | | | llvm-svn: 372301
* AMDGPU/SILoadStoreOptimizer: Add const to more functionsTom Stellard2019-09-191-12/+12
| | | | | | | | | | | | Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65901 llvm-svn: 372298
* AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzleMatt Arsenault2019-09-191-0/+1
| | | | llvm-svn: 372297
* AMDGPU/GlobalISel: RegBankSelect tbuffer load/storeMatt Arsenault2019-09-191-6/+14
| | | | | | These have the same operand structure as the non-t buffer operations. llvm-svn: 372296
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.formatMatt Arsenault2019-09-193-20/+122
| | | | | | | | | This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors Also reject some illegal types on other targets. llvm-svn: 372293
* AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.storeMatt Arsenault2019-09-197-11/+455
| | | | llvm-svn: 372292
* AMDGPU/GlobalISel: RegBankSelect struct buffer load/storeMatt Arsenault2019-09-191-0/+22
| | | | llvm-svn: 372291
* AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load|store}Matt Arsenault2019-09-192-1/+52
| | | | llvm-svn: 372290
* AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsicsMatt Arsenault2019-09-192-5/+104
| | | | | | Images should always have 2 consecutive, mandatory SGPR arguments. llvm-svn: 372289
* MachineScheduler: Fix assert from not checking subregsMatt Arsenault2019-09-191-2/+8
| | | | | | | The assert would fail if there was a dead def of a subregister if there was a previous use of a different subregister. llvm-svn: 372287
* AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9Matt Arsenault2019-09-192-3/+11
| | | | | | The scalar versions were only introduced in gfx9. llvm-svn: 372286
* GlobalISel: Don't materialize immarg arguments to intrinsicsMatt Arsenault2019-09-1945-1250/+1301
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encode them directly as an imm argument to G_INTRINSIC*. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285
* [Object] Extend MachOUniversalBinary::getObjectForArchAlexander Shaposhnikov2019-09-194-7/+22
| | | | | | | | | | | | Make the method MachOUniversalBinary::getObjectForArch return MachOUniversalBinary::ObjectForArch and add helper methods MachOUniversalBinary::getMachOObjectForArch, MachOUniversalBinary::getArchiveForArch for those who explicitly expect to get a MachOObjectFile or an Archive. Differential revision: https://reviews.llvm.org/D67700 Test plan: make check-all llvm-svn: 372278
* [WebAssembly] Restore defaults for stores per memopThomas Lively2019-09-181-10/+0
| | | | | | | | | | | | | | | | | | | | Summary: Large slowdowns were observed in Rust due to many small, constant sized copies in conjunction with poorly-optimized memory.copy implementations. Since memory.copy cannot be expected to be inlined efficiently by engines at this time, stop using it for the smallest copies. We continue to lower all memcpy intrinsics to memory.copy, though. Reviewers: aheejin, alexcrichton Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, JDevlieghere, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67639 llvm-svn: 372275
* [AArch64][GlobalISel] Support lowering musttail callsJessica Paquette2019-09-181-10/+24
| | | | | | | | | | | | | | | | | | | | | Since we now lower most tail calls, it makes sense to support musttail. Instead of always falling back to SelectionDAG, only fall back when a musttail call was not able to be emitted as a tail call. Once we can handle most incoming and outgoing arguments, we can change this to a `report_fatal_error` like in ISelLowering. Remove the assert that we don't have varargs and a musttail, and replace it with a return false. Implementing this requires that we implement `saveVarArgRegisters` from AArch64ISelLowering, which is an entirely different patch. Add GlobalISel lines to vararg-tallcall.ll to make sure that we produce correct code. Right now we only fall back, but eventually this will be relevant. Differential Revision: https://reviews.llvm.org/D67681 llvm-svn: 372273
* Remove the obsolete BlockByRefStruct flag from LLVM IRAdrian Prantl2019-09-185-70/+4
| | | | | | | | | | | | | | | | | | | | | | | DIFlagBlockByRefStruct is an unused DIFlag that originally was used by clang to express (Objective-)C block captures in debug info. For the last year Clang has been emitting complex DIExpressions to describe block captures instead, which makes all the code supporting this flag redundant. This patch removes the flag and all supporting "dead" code, so we can reuse the bit for something else in the future. Since this only affects debug info generated by Clang with the block extension this mostly affects Apple platforms and I don't have any bitcode compatibility concerns for removing this. The Verifier will reject debug info that uses the bit and thus degrade gracefully when LTO'ing older bitcode with a newer compiler. rdar://problem/44304813 Differential Revision: https://reviews.llvm.org/D67453 llvm-svn: 372272
* Add AutoUpgrade function to add new address space datalayout string to ↵Amy Huang2019-09-184-16/+28
| | | | | | | | | | | | | | | | | | | | | | existing datalayouts. Summary: Add function to AutoUpgrade to change the datalayout of old X86 datalayout strings. This adds "-p270:32:32-p271:32:32-p272:64:64" to X86 datalayouts that are otherwise valid and don't already contain it. This also removes the compatibility changes in https://reviews.llvm.org/D66843. Datalayout change in https://reviews.llvm.org/D64931. Reviewers: rnk, echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67631 llvm-svn: 372267
* [SimplifyCFG] mergeConditionalStoreToAddress(): try to pacify MSANRoman Lebedev2019-09-181-1/+1
| | | | | | | | MSAN bot complains that there is use-of-uninitialized-value of this FreeStores later in IsWorthwhile(). Perhaps FreeStores needs to be stored in a vector? llvm-svn: 372262
* On PowerPC, Secure-PLT by default for FreeBSD 13 and higherDimitry Andric2019-09-181-1/+2
| | | | | | | | | | | | | | | | | | | Summary: In https://svnweb.freebsd.org/changeset/base/349351, FreeBSD 13 and higher transitioned to Secure-PLT for PowerPC. This part contains the changes in llvm's PPC subtarget. Reviewers: emaste, jhibbits, hfinkel Reviewed By: jhibbits Subscribers: wuzish, nemanjai, krytarowski, kbarton, MaskRay, jsji, shchenz, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67118 llvm-svn: 372260
* [DAGCombine][ARM][X86] (sub Carry, X) -> (addcarry (sub 0, X), 0, Carry) foldRoman Lebedev2019-09-181-0/+12
| | | | | | | | | | | | | | | | | | | Summary: `DAGCombiner::visitADDLikeCommutative()` already has a sibling fold: `(add X, Carry) -> (addcarry X, 0, Carry)` This fold, as suggested by @efriedma, helps recover from //some// of the regressions of D62266 Reviewers: efriedma, deadalnix Subscribers: javed.absar, kristof.beyls, llvm-commits, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D62392 llvm-svn: 372259
* [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251)Roman Lebedev2019-09-181-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I don't have a direct motivational case for this, but it would be good to have this for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251 but with different predicate that requires that the offset is non-zero. The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it'd seem to be good to not overlook very similar patterns.. Proofs: https://rise4fun.com/Alive/21b Also, there is something odd with `isKnownNonZero()`, if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267) With this, i see no other missing folds for https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67412 llvm-svn: 372257
* [AArch64] Don't implicitly enable global isel on Darwin if code-model==large.Lang Hames2019-09-181-2/+4
| | | | | | | | | | | | | | | | Summary: AArch64 GlobalISel doesn't support MachO's large code model, so this patch adds a check for that combination before implicitly enabling it. Reviewers: paquette Subscribers: kristof.beyls, ributzka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67724 llvm-svn: 372256
* [SimplifyCFG] mergeConditionalStoreToAddress(): consider cost, not ↵Roman Lebedev2019-09-181-42/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | instruction count Summary: As it can be see in the changed test, while `div` is really costly, we were speculating it. This does not seem correct. Also, the old code would run for every single insturuction in BB, instead of eagerly bailing out as soon as there are too many instructions. This function still has a problem that `PHINodeFoldingThreshold` is per-basic-block, while it should be for all the basic blocks. Reviewers: efriedma, craig.topper, dmgreen, jmolloy Reviewed By: jmolloy Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67315 llvm-svn: 372255
* [MIPS] For vectors, select `add %x, C` as `sub %x, -C` if it results in ↵Roman Lebedev2019-09-182-0/+56
| | | | | | | | | | | | | | | | | | | | | | | inline immediate Summary: As discussed in https://reviews.llvm.org/D62341#1515637, for MIPS `add %x, -1` isn't optimal. Unlike X86 there are no fastpaths to matearialize such `-1`/`1` vector constants, and `sub %x, 1` results in better codegen, so undo canonicalization Reviewers: atanasyan, Petar.Avramovic, RKSimon Reviewed By: atanasyan Subscribers: sdardis, arichardson, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66805 llvm-svn: 372254
* [mips] Expand 'lw/sw' instructions for 32-bit GOTSimon Atanasyan2019-09-181-17/+64
| | | | | | | | | | In case of using 32-bit GOT access to the table requires two instructions with attached %got_hi and %got_lo relocations. This patch implements correct expansion of 'lw/sw' instructions in that case. Differential Revision: https://reviews.llvm.org/D67705 llvm-svn: 372251
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before ↵Roman Lebedev2019-09-181-5/+8
| | | | | | upcoming patch llvm-svn: 372245
* Fix compile-time regression caused by rL371928Daniel Sanders2019-09-181-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Also fixup rL371928 for cases that occur on our out-of-tree backend There were still quite a few intermediate APInts and this caused the compile time of MCCodeEmitter for our target to jump from 16s up to ~5m40s. This patch, brings it back down to ~17s by eliminating pretty much all of them using two new APInt functions (extractBitsAsZExtValue(), insertBits() but with a uint64_t). The exact conditions for eliminating them is that the field extracted/inserted must be <=64-bit which is almost always true. Note: The two new APInt API's assume that APInt::WordSize is at least 64-bit because that means they touch at most 2 APInt words. They statically assert that's true. It seems very unlikely that someone is patching it to be smaller so this should be fine. Reviewers: jmolloy Reviewed By: jmolloy Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67686 llvm-svn: 372243
* Data Dependence Graph BasicsBardia Mahjour2019-09-185-0/+386
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper: D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS. This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored. The algorithm for building the graph involves the following steps in order: 1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph. 2. For each node in the graph establish def-use edges to/from other nodes in the graph. 3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur, fhahn, myhsu Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto Tag: #llvm Differential Revision: https://reviews.llvm.org/D65350 llvm-svn: 372238
* [Alignment][NFC] Align(1) to Align::None() conversionsGuillaume Chatelet2019-09-182-2/+2
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67715 llvm-svn: 372234
* [SampleFDO] Minimize performance impact when profile-sample-accurateWei Mi2019-09-181-20/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | is enabled. We can save memory and reduce binary size significantly by enabling ProfileSampleAccurate. However when ProfileSampleAccurate is true, function without sample will be regarded as cold and this could potentially cause performance regression. To minimize the potential negative performance impact, we want to be a little conservative here saying if a function shows up in the profile, no matter as outline instance, inline instance or call targets, treat the function as not being cold. This will handle the cases such as most callsites of a function are inlined in sampled binary (thus outline copy don't get any sample) but not inlined in current build (because of source code drift, imprecise debug information, or the callsites are all cold individually but not cold accumulatively...), so that the outline function showing up as cold in sampled binary will actually not be cold after current build. After the change, such function will be treated as not cold even profile-sample-accurate is enabled. At the same time we lower the hot criteria of callsiteIsHot check when profile-sample-accurate is enabled. callsiteIsHot is used to determined whether a callsite is hot and qualified for early inlining. When profile-sample-accurate is enabled, functions without profile will be regarded as cold and much less inlining will happen in CGSCC inlining pass, so we can worry less about size increase and be aggressive to allow more early inlining to happen for warm callsites and it is helpful for performance overall. Differential Revision: https://reviews.llvm.org/D67561 llvm-svn: 372232
* [Alignment][NFC] Remove LogAlignment functionsGuillaume Chatelet2019-09-1815-129/+109
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67620 llvm-svn: 372231
* [Alignment][NFC] Use Align::None instead of 1Guillaume Chatelet2019-09-185-13/+13
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67704 llvm-svn: 372230
* Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"Krasimir Georgiev2019-09-186-29/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This reverts commit r372204. This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio: ``` FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579) ******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ******************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir -- Exit Code: 2 Command Output (stderr): -- ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10 #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536 #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21 #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11 #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079 #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361 #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18 #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27 #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863 #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8 #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22 #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369) MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const Exiting error: -: The file was not recognized as a valid object file FileCheck error: '-' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir ``` Reviewers: bkramer Reviewed By: bkramer Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67710 llvm-svn: 372228
* [SimplifyLibCalls] fix crash with empty function name (PR43347)Sanjay Patel2019-09-181-15/+12
| | | | | | | | ...and improve some variable names while here. https://bugs.llvm.org/show_bug.cgi?id=43347 llvm-svn: 372227
* [SDA] Don't stop divergence propagation at the IPD.Jay Foad2019-09-181-35/+26
| | | | | | | | | | | | | | | | | | | Summary: This fixes B42473 and B42706. This patch makes the SDA propagate branch divergence until the end of the RPO traversal. Before, the SyncDependenceAnalysis propagated divergence only until the IPD in rpo order. RPO is incompatible with post dominance in the presence of loops. This made the SDA crash because blocks were missed in the propagation. Reviewers: foad, nhaehnle Reviewed By: foad Subscribers: jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65274 llvm-svn: 372223
* [mips] Pass "xgot" flag as a subtarget featureSimon Atanasyan2019-09-183-9/+14
| | | | | | | | | We need "xgot" flag in the MipsAsmParser to implement correct expansion of some pseudo instructions in case of using 32-bit GOT (XGOT). MipsAsmParser does not have reference to MipsSubtarget but has a reference to "feature bit set". llvm-svn: 372220
* [mips] Reduce code duplication in the `loadAndAddSymbolAddress`. NFCSimon Atanasyan2019-09-181-106/+57
| | | | llvm-svn: 372218
* [AMDGPU] Allow FP inline constant in v_madak_f16 and v_fmaak_f16Tim Renouf2019-09-181-1/+3
| | | | | | | Differential Revision: https://reviews.llvm.org/D67680 Change-Id: Ic38f47cb2079c2c1070a441b5943854844d80a7c llvm-svn: 372208
* [AArch64][DebugInfo] Do not recompute CalleeSavedStackSizeSander de Smalen2019-09-186-5/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. Reviewers: omjavaid, eli.friedman, thegameg, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D66935 llvm-svn: 372204
* Revert "r372201: [Support] Replace function with function_ref in ↵Ilya Biryukov2019-09-181-1/+1
| | | | | | | | | | | writeFileAtomically. NFC" function_ref causes calls to the function to be ambiguous, breaking compilation. Reverting for now. llvm-svn: 372202
* [Support] Replace function with function_ref in writeFileAtomically. NFCIlya Biryukov2019-09-181-1/+1
| | | | | | | | | | | | | | | | | | | Summary: The latter is slightly more efficient and communicates the intent of the API: writeFileAtomically does not own or copy the callback, it merely calls it at some point. Reviewers: jkorous Reviewed By: jkorous Subscribers: hiraditya, dexonsmith, jfb, llvm-commits, cfe-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67584 llvm-svn: 372201
* [X86] Break non-power of 2 vXi1 vectors into scalars for argument passing ↵Craig Topper2019-09-181-10/+15
| | | | | | | | | | with avx512. This generates worse code, but matches what is done for avx2 and prevents crashes when more arguments are passed than we have registers for. llvm-svn: 372200
* [BPF] Permit all user instructed offset relocatiionsYonghong Song2019-09-181-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | Currently, not all user specified relocations (with clang intrinsic __builtin_preserve_access_index()) will turn into relocations. In the current implementation, a __builtin_preserve_access_index() chain is turned into relocation only if the result of the clang intrinsic is used in a function call or a nonzero offset computation of getelementptr. For all other cases, the relocatiion request is ignored and the __builtin_preserve_access_index() is turned into regular getelementptr instructions. The main reason is to mimic bpf_probe_read() requirement. But there are other use cases where relocatable offset is generated but not used for bpf_probe_read(). This patch relaxed previous constraints when to generate relocations. Now, all user __builtin_preserve_access_index() will have relocations generated. Differential Revision: https://reviews.llvm.org/D67688 llvm-svn: 372198
* [X86] Prevent assertion when calling a function that returns double with ↵Craig Topper2019-09-181-0/+4
| | | | | | | | -mno-sse2 on x86-64. As seen in the most recent updates to PR10498 llvm-svn: 372197
* [Remarks] Allow the RemarkStreamer to be used directly with a streamFrancis Visoiu Mistrih2019-09-182-13/+46
| | | | | | | | | | The filename in the RemarkStreamer should be optional to allow clients to stream remarks to memory or to existing streams. This introduces a new overload of `setupOptimizationRemarks`, and avoids enforcing the presence of a filename at different places. llvm-svn: 372195
* [PGO] Change hardcoded thresholds for cold/inlinehint to use summaryTeresa Johnson2019-09-172-24/+27
| | | | | | | | | | | | | | | | | | | | | | | Summary: The PGO counter reading will add cold and inlinehint (hot) attributes to functions that are very cold or hot. This was using hardcoded thresholds, instead of the profile summary cutoffs which are used in other hot/cold detection and are more dynamic and adaptable. Switch to using the summary-based cold/hot detection. The hardcoded limits were causing some code that had a medium level of hotness (per the summary) to be incorrectly marked with a cold attribute, blocking inlining. Reviewers: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67673 llvm-svn: 372189
* [ARM] VFPv2 only supports 16 D registers.Eli Friedman2019-09-177-21/+27
| | | | | | | | | | | | | | | | | | | | r361845 changed the way we handle "D16" vs. "D32" targets; there used to be a negative "d16" which removed instructions from the instruction set, and now there's a "d32" feature which adds instructions to the instruction set. This is good, but there was an oversight in the implementation: the behavior of VFPv2 was changed. In particular, the "vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only supports 16 D registers. In practice, this means if you specify -mfpu=vfpv2, the compiler will generate illegal instructions. This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and "vfp2sp" so they don't imply "d32". Differential Revision: https://reviews.llvm.org/D67375 llvm-svn: 372186
OpenPOWER on IntegriCloud