bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251)	Roman Lebedev	2019-09-18	1	-0/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I don't have a direct motivational case for this, but it would be good to have this for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251 but with different predicate that requires that the offset is non-zero. The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it'd seem to be good to not overlook very similar patterns.. Proofs: https://rise4fun.com/Alive/21b Also, there is something odd with `isKnownNonZero()`, if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267) With this, i see no other missing folds for https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67412 llvm-svn: 372257
*	[AArch64] Don't implicitly enable global isel on Darwin if code-model==large.	Lang Hames	2019-09-18	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: AArch64 GlobalISel doesn't support MachO's large code model, so this patch adds a check for that combination before implicitly enabling it. Reviewers: paquette Subscribers: kristof.beyls, ributzka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67724 llvm-svn: 372256
*	[SimplifyCFG] mergeConditionalStoreToAddress(): consider cost, not ↵	Roman Lebedev	2019-09-18	1	-42/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instruction count Summary: As it can be see in the changed test, while `div` is really costly, we were speculating it. This does not seem correct. Also, the old code would run for every single insturuction in BB, instead of eagerly bailing out as soon as there are too many instructions. This function still has a problem that `PHINodeFoldingThreshold` is per-basic-block, while it should be for all the basic blocks. Reviewers: efriedma, craig.topper, dmgreen, jmolloy Reviewed By: jmolloy Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67315 llvm-svn: 372255
*	[MIPS] For vectors, select `add %x, C` as `sub %x, -C` if it results in ↵	Roman Lebedev	2019-09-18	2	-0/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	inline immediate Summary: As discussed in https://reviews.llvm.org/D62341#1515637, for MIPS `add %x, -1` isn't optimal. Unlike X86 there are no fastpaths to matearialize such `-1`/`1` vector constants, and `sub %x, 1` results in better codegen, so undo canonicalization Reviewers: atanasyan, Petar.Avramovic, RKSimon Reviewed By: atanasyan Subscribers: sdardis, arichardson, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66805 llvm-svn: 372254
*	[mips] Expand 'lw/sw' instructions for 32-bit GOT	Simon Atanasyan	2019-09-18	1	-17/+64
\| \| \| \| \| \| \| \| \| \|	In case of using 32-bit GOT access to the table requires two instructions with attached %got_hi and %got_lo relocations. This patch implements correct expansion of 'lw/sw' instructions in that case. Differential Revision: https://reviews.llvm.org/D67705 llvm-svn: 372251
*	[InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before ↵	Roman Lebedev	2019-09-18	1	-5/+8
\| \| \| \| \| \|	upcoming patch llvm-svn: 372245
*	Fix compile-time regression caused by rL371928	Daniel Sanders	2019-09-18	1	-0/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Also fixup rL371928 for cases that occur on our out-of-tree backend There were still quite a few intermediate APInts and this caused the compile time of MCCodeEmitter for our target to jump from 16s up to ~5m40s. This patch, brings it back down to ~17s by eliminating pretty much all of them using two new APInt functions (extractBitsAsZExtValue(), insertBits() but with a uint64_t). The exact conditions for eliminating them is that the field extracted/inserted must be <=64-bit which is almost always true. Note: The two new APInt API's assume that APInt::WordSize is at least 64-bit because that means they touch at most 2 APInt words. They statically assert that's true. It seems very unlikely that someone is patching it to be smaller so this should be fine. Reviewers: jmolloy Reviewed By: jmolloy Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67686 llvm-svn: 372243
*	Data Dependence Graph Basics	Bardia Mahjour	2019-09-18	5	-0/+386
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper: D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS. This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored. The algorithm for building the graph involves the following steps in order: 1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph. 2. For each node in the graph establish def-use edges to/from other nodes in the graph. 3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur, fhahn, myhsu Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto Tag: #llvm Differential Revision: https://reviews.llvm.org/D65350 llvm-svn: 372238
*	[Alignment][NFC] Align(1) to Align::None() conversions	Guillaume Chatelet	2019-09-18	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67715 llvm-svn: 372234
*	[SampleFDO] Minimize performance impact when profile-sample-accurate	Wei Mi	2019-09-18	1	-20/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	is enabled. We can save memory and reduce binary size significantly by enabling ProfileSampleAccurate. However when ProfileSampleAccurate is true, function without sample will be regarded as cold and this could potentially cause performance regression. To minimize the potential negative performance impact, we want to be a little conservative here saying if a function shows up in the profile, no matter as outline instance, inline instance or call targets, treat the function as not being cold. This will handle the cases such as most callsites of a function are inlined in sampled binary (thus outline copy don't get any sample) but not inlined in current build (because of source code drift, imprecise debug information, or the callsites are all cold individually but not cold accumulatively...), so that the outline function showing up as cold in sampled binary will actually not be cold after current build. After the change, such function will be treated as not cold even profile-sample-accurate is enabled. At the same time we lower the hot criteria of callsiteIsHot check when profile-sample-accurate is enabled. callsiteIsHot is used to determined whether a callsite is hot and qualified for early inlining. When profile-sample-accurate is enabled, functions without profile will be regarded as cold and much less inlining will happen in CGSCC inlining pass, so we can worry less about size increase and be aggressive to allow more early inlining to happen for warm callsites and it is helpful for performance overall. Differential Revision: https://reviews.llvm.org/D67561 llvm-svn: 372232
*	[Alignment][NFC] Remove LogAlignment functions	Guillaume Chatelet	2019-09-18	15	-129/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67620 llvm-svn: 372231
*	[Alignment][NFC] Use Align::None instead of 1	Guillaume Chatelet	2019-09-18	5	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67704 llvm-svn: 372230
*	Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"	Krasimir Georgiev	2019-09-18	6	-29/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This reverts commit r372204. This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio: ``` FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579) ****************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED **************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir \| /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - \| /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir -- Exit Code: 2 Command Output (stderr): -- ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10 #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536 #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21 #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11 #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079 #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361 #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18 #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27 #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863 #13 0x905f21 in compileModule(char, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8 #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22 #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369) MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const Exiting error: -: The file was not recognized as a valid object file FileCheck error: '-' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir ``` Reviewers: bkramer Reviewed By: bkramer Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67710 llvm-svn: 372228
*	[SimplifyLibCalls] fix crash with empty function name (PR43347)	Sanjay Patel	2019-09-18	1	-15/+12
\| \| \| \| \| \| \| \|	...and improve some variable names while here. https://bugs.llvm.org/show_bug.cgi?id=43347 llvm-svn: 372227
*	[SDA] Don't stop divergence propagation at the IPD.	Jay Foad	2019-09-18	1	-35/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes B42473 and B42706. This patch makes the SDA propagate branch divergence until the end of the RPO traversal. Before, the SyncDependenceAnalysis propagated divergence only until the IPD in rpo order. RPO is incompatible with post dominance in the presence of loops. This made the SDA crash because blocks were missed in the propagation. Reviewers: foad, nhaehnle Reviewed By: foad Subscribers: jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65274 llvm-svn: 372223
*	[mips] Pass "xgot" flag as a subtarget feature	Simon Atanasyan	2019-09-18	3	-9/+14
\| \| \| \| \| \| \| \| \|	We need "xgot" flag in the MipsAsmParser to implement correct expansion of some pseudo instructions in case of using 32-bit GOT (XGOT). MipsAsmParser does not have reference to MipsSubtarget but has a reference to "feature bit set". llvm-svn: 372220
*	[mips] Reduce code duplication in the `loadAndAddSymbolAddress`. NFC	Simon Atanasyan	2019-09-18	1	-106/+57
\| \| \| \|	llvm-svn: 372218
*	[AMDGPU] Allow FP inline constant in v_madak_f16 and v_fmaak_f16	Tim Renouf	2019-09-18	1	-1/+3
\| \| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D67680 Change-Id: Ic38f47cb2079c2c1070a441b5943854844d80a7c llvm-svn: 372208
*	[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize	Sander de Smalen	2019-09-18	6	-5/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. Reviewers: omjavaid, eli.friedman, thegameg, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D66935 llvm-svn: 372204
*	Revert "r372201: [Support] Replace function with function_ref in ↵	Ilya Biryukov	2019-09-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	writeFileAtomically. NFC" function_ref causes calls to the function to be ambiguous, breaking compilation. Reverting for now. llvm-svn: 372202
*	[Support] Replace function with function_ref in writeFileAtomically. NFC	Ilya Biryukov	2019-09-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The latter is slightly more efficient and communicates the intent of the API: writeFileAtomically does not own or copy the callback, it merely calls it at some point. Reviewers: jkorous Reviewed By: jkorous Subscribers: hiraditya, dexonsmith, jfb, llvm-commits, cfe-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67584 llvm-svn: 372201
*	[X86] Break non-power of 2 vXi1 vectors into scalars for argument passing ↵	Craig Topper	2019-09-18	1	-10/+15
\| \| \| \| \| \| \| \| \| \|	with avx512. This generates worse code, but matches what is done for avx2 and prevents crashes when more arguments are passed than we have registers for. llvm-svn: 372200
*	[BPF] Permit all user instructed offset relocatiions	Yonghong Song	2019-09-18	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, not all user specified relocations (with clang intrinsic __builtin_preserve_access_index()) will turn into relocations. In the current implementation, a __builtin_preserve_access_index() chain is turned into relocation only if the result of the clang intrinsic is used in a function call or a nonzero offset computation of getelementptr. For all other cases, the relocatiion request is ignored and the __builtin_preserve_access_index() is turned into regular getelementptr instructions. The main reason is to mimic bpf_probe_read() requirement. But there are other use cases where relocatable offset is generated but not used for bpf_probe_read(). This patch relaxed previous constraints when to generate relocations. Now, all user __builtin_preserve_access_index() will have relocations generated. Differential Revision: https://reviews.llvm.org/D67688 llvm-svn: 372198
*	[X86] Prevent assertion when calling a function that returns double with ↵	Craig Topper	2019-09-18	1	-0/+4
\| \| \| \| \| \| \| \|	-mno-sse2 on x86-64. As seen in the most recent updates to PR10498 llvm-svn: 372197
*	[Remarks] Allow the RemarkStreamer to be used directly with a stream	Francis Visoiu Mistrih	2019-09-18	2	-13/+46
\| \| \| \| \| \| \| \| \| \|	The filename in the RemarkStreamer should be optional to allow clients to stream remarks to memory or to existing streams. This introduces a new overload of `setupOptimizationRemarks`, and avoids enforcing the presence of a filename at different places. llvm-svn: 372195
*	[PGO] Change hardcoded thresholds for cold/inlinehint to use summary	Teresa Johnson	2019-09-17	2	-24/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The PGO counter reading will add cold and inlinehint (hot) attributes to functions that are very cold or hot. This was using hardcoded thresholds, instead of the profile summary cutoffs which are used in other hot/cold detection and are more dynamic and adaptable. Switch to using the summary-based cold/hot detection. The hardcoded limits were causing some code that had a medium level of hotness (per the summary) to be incorrectly marked with a cold attribute, blocking inlining. Reviewers: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67673 llvm-svn: 372189
*	[ARM] VFPv2 only supports 16 D registers.	Eli Friedman	2019-09-17	7	-21/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	r361845 changed the way we handle "D16" vs. "D32" targets; there used to be a negative "d16" which removed instructions from the instruction set, and now there's a "d32" feature which adds instructions to the instruction set. This is good, but there was an oversight in the implementation: the behavior of VFPv2 was changed. In particular, the "vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only supports 16 D registers. In practice, this means if you specify -mfpu=vfpv2, the compiler will generate illegal instructions. This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and "vfp2sp" so they don't imply "d32". Differential Revision: https://reviews.llvm.org/D67375 llvm-svn: 372186
*	[PGO] Don't use comdat groups for counters & data on COFF	Reid Kleckner	2019-09-17	1	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For COFF, a comdat group is really a symbol marked IMAGE_COMDAT_SELECT_ANY and zero or more other symbols marked IMAGE_COMDAT_SELECT_ASSOCIATIVE. Typically the associative symbols in the group are not external and are not referenced by other TUs, they are things like debug info, C++ dynamic initializers, or other section registration schemes. The Visual C++ linker reports a duplicate symbol error for symbols marked IMAGE_COMDAT_SELECT_ASSOCIATIVE even if they would be discarded after handling the leader symbol. Fixes coverage-inline.cpp in check-profile after r372020. llvm-svn: 372182
*	[AArch64][GlobalISel] Support -tailcallopt	Jessica Paquette	2019-09-17	1	-23/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for `-tailcallopt` tail calls to CallLowering. This piggy-backs off the changes from D67577, since doing it without a bit of refactoring gets extremely ugly. Support is basically ported from AArch64ISelLowering. The main difference here is that tail calls in `-tailcallopt` change the ABI, so there's some extra bookkeeping for the stack. Show that we are correctly lowering these by updating tail-call.ll. Also show that we don't do anything strange in general by updating fastcc-reserved.ll, which passes `-tailcallopt`, but doesn't emit any tail calls. Differential Revision: https://reviews.llvm.org/D67580 llvm-svn: 372177
*	AArch64CallLowering::lowerCall(): fix build by not passing InArgs into ↵	Roman Lebedev	2019-09-17	1	-1/+1
\| \| \| \| \| \|	lowerTailCall() llvm-svn: 372172
*	[NFC][InstCombine] dropRedundantMaskingOfLeftShiftInput(): some NFC diff shaving	Roman Lebedev	2019-09-17	1	-8/+11
\| \| \| \|	llvm-svn: 372171
*	Revert "Data Dependence Graph Basics"	Bardia Mahjour	2019-09-17	5	-386/+0
\| \| \| \| \| \|	This reverts commit c98ec60993a7aa65073692b62f6d728b36e68ccd, which broke the sphinx-docs build. llvm-svn: 372168
*	NVPTXAsmPrinter - Don't dereference a dyn_cast result. NFCI.	Simon Pilgrim	2019-09-17	1	-1/+1
\| \| \| \|	llvm-svn: 372166
*	WasmEmitter - Don't dereference a dyn_cast result. NFCI.	Simon Pilgrim	2019-09-17	1	-1/+1
\| \| \| \|	llvm-svn: 372165
*	[AArch64][GlobalISel][NFC] Refactor tail call lowering code	Jessica Paquette	2019-09-17	2	-31/+74
\| \| \| \| \| \| \| \| \| \|	When you begin implementing -tailcallopt, this gets somewhat hairy. Refactor the call lowering code so that the tail call lowering stuff gets its own function. Differential Revision: https://reviews.llvm.org/D67577 llvm-svn: 372164
*	Data Dependence Graph Basics	Bardia Mahjour	2019-09-17	5	-0/+386
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper: D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS. This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored. The algorithm for building the graph involves the following steps in order: 1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph. 2. For each node in the graph establish def-use edges to/from other nodes in the graph. 3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur, fhahn, myhsu Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto Tag: #llvm Differential Revision: https://reviews.llvm.org/D65350 llvm-svn: 372162
*	[X86] Use APInt::operator<<= and APInt::lshrInPlace. NFC	Craig Topper	2019-09-17	1	-4/+4
\| \| \| \|	llvm-svn: 372159
*	[SimplifyDemandedBits] Use APInt::intersects to instead of ANDing and ↵	Craig Topper	2019-09-17	1	-2/+3
\| \| \| \| \| \|	comparing to 0 separately. NFC llvm-svn: 372158
*	[X86] Simplify b2b KSHIFTL+KSHIFTR using demanded elts.	Craig Topper	2019-09-17	1	-13/+66
\| \| \| \|	llvm-svn: 372155
*	[X86] Call SimplifyDemandedVectorElts on KSHIFTL/KSHIFTR nodes during DAG ↵	Craig Topper	2019-09-17	1	-0/+16
\| \| \| \| \| \|	combine. llvm-svn: 372154
*	[X86] Simplify some code in LowerBUILD_VECTORvXi1. NFCI	Craig Topper	2019-09-17	1	-15/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	The case were Immediate is 0 and HasConstElts is true should never happen since that would mean the constant elts were all zero. But we check for all zero build vector earlier. So just use HasConstElts and blindly take Immediate without checking if its 0. Move the code that bitcasts and extract the immediate into the the HasConstElts case since the other code just creates an undef with the right type. No casting needed. llvm-svn: 372153
*	[AMDGPU] Added MI bit IsDOT	Stanislav Mekhanoshin	2019-09-17	6	-2/+22
\| \| \| \| \| \| \| \|	NFC, needed for future commit. Differential Revision: https://reviews.llvm.org/D67669 llvm-svn: 372151
*	GSYM: Add the llvm::gsym::Header header class with tests	Greg Clayton	2019-09-17	2	-0/+112
\| \| \| \| \| \| \| \|	This patch adds the llvm::gsym::Header class which appears at the start of a stand alone GSYM file, or in the first bytes of the GSYM data in a GSYM section within a file. Added encode and decode methods with full error handling and full tests. Differential Revision: https://reviews.llvm.org/D67666 llvm-svn: 372149
*	[ARM][AsmParser] Don't dereference a dyn_cast result. NFCI.	Simon Pilgrim	2019-09-17	1	-50/+41
\| \| \| \| \| \|	The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't. llvm-svn: 372145
*	Reland "[SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, ↵	David Bolvansky	2019-09-17	1	-1/+4
\| \| \| \| \| \|	'\0', y)" llvm-svn: 372142
*	[PowerPC] Exploit single instruction load-and-splat for word and doubleword	Nemanja Ivanovic	2019-09-17	4	-14/+115
\| \| \| \| \| \| \| \| \| \| \|	We currently produce a load, followed by (possibly a move for integers and) a splat as separate instructions. VSX has always had a splatting load for doublewords, but as of Power9, we have it for words as well. This patch just exploits these instructions. Differential revision: https://reviews.llvm.org/D63624 llvm-svn: 372139
*	[MemorySSA] Fix phi insertion when inserting a def.	Alina Sbirlea	2019-09-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When inserting a Def, the current algorithm is walking edges backward and inserting new Phis where needed. There may be additional Phis needed in the IDF of the newly inserted Def and Phis. Adding Phis in the IDF of the Def was added ina previous patch, but we may also need other Phis in the IDF of the newly added Phis. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67637 llvm-svn: 372138
*	[MemorySSA] Update MSSA for non-conventional AA.	Alina Sbirlea	2019-09-17	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Regularly when moving an instruction that may not read or write memory, the instruction is not modelled in MSSA, so not action is necessary. For a non-conventional AA pipeline, MSSA needs to explicitly check when creating accesses, so as to not model instructions that may not read and write memory. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67562 llvm-svn: 372137
*	GSYM: add encoding and decoding to FunctionInfo	Greg Clayton	2019-09-17	1	-4/+132
\| \| \| \| \| \| \| \|	This patch adds encoding and decoding of the FunctionInfo objects along with full error handling and tests. Full details of the FunctionInfo encoding format appear in the FunctionInfo.h header file. Differential Revision: https://reviews.llvm.org/D67506 llvm-svn: 372135
*	[ARM] Add a SelectTAddrModeImm7 for MVE narrow loads and stores	David Green	2019-09-17	2	-10/+35
\| \| \| \| \| \| \| \| \| \| \| \|	We were previously using the SelectT2AddrModeImm7 for both normal and narrowing MVE loads/stores. As the narrowing instructions do not accept sp as a register, it makes little sense to optimise a FrameIndex into the load, only to have to recover that later on. This adds a SelectTAddrModeImm7 which does not do that folding, and uses it for narrowing load/store patterns. Differential Revision: https://reviews.llvm.org/D67489 llvm-svn: 372134