path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
...
* Revert "[CodeView] Add option to disable inline line tables." (Amy Huang, 2019-10-30, 1 file, -30/+11)
| | | | | | because it breaks compiler-rt tests. This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.
* [CodeView] Add option to disable inline line tables. (Amy Huang, 2019-10-30, 1 file, -11/+30)
| | | | | | | | | | | | | | | | | Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. See https://bugs.llvm.org/show_bug.cgi?id=42344 Reviewers: rnk Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67723
* [InstCombine] keep assumption before sinking calls (tyker, 2019-10-31, 1 file, -2/+21)
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In the following C code the branch is not removed by clang at O3.
```
int f1(char* p) {
  int i1 = __builtin_strlen(p);
  if (!p) return -1;
  return i1;
}
```
The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen no longer dominates the use in the icmp, so value tracking can't see that p cannot be null. This patch resolves the issue by inserting an assumption at the place of the call before sinking a call, when that call can be used to prove an argument to be nonnull. This resolves the issue at O3. Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69477
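A rough source-level analogue of the fix described in this entry. The patch itself inserts an assumption in LLVM IR, not in C or C++ source, so this is only a sketch. `ASSUME_NONNULL` and `f1_with_assumption` are hypothetical names; the macro uses clang's `__builtin_assume` when available and expands to nothing elsewhere.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: tell the optimizer that a pointer is non-null.
#if defined(__clang__)
#define ASSUME_NONNULL(p) __builtin_assume((p) != nullptr)
#else
#define ASSUME_NONNULL(p) ((void)0)
#endif

// Same shape as f1 in the commit message. strlen(p) is only defined for a
// non-null p, so the fact "p != nullptr" is recorded where the call
// originally stood. The null check below can then be folded away even if
// the strlen call is later sunk past it.
int f1_with_assumption(const char *p) {
  ASSUME_NONNULL(p);
  int i1 = static_cast<int>(std::strlen(p));
  if (!p)
    return -1;
  return i1;
}
```

The assumption carries the non-null fact independently of where the call ends up, which is why it matters that it is placed before the sinking happens.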
* Fix -Wsign-compare warning with clang-cl (Reid Kleckner, 2019-10-30, 1 file, -2/+2)
| | | | | | off_t apparently is just "long" on Win64, which is 32 bits, and therefore not long enough to compare with UINT32_MAX. Use auto to follow the surrounding code. uint64_t would also be fine.
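A small sketch of why such a comparison is suspicious, using fixed-width types so it behaves the same on every platform. `Offset32`, `equalsMaxAfterConversion` and `exceedsUint32Max` are hypothetical names, and the widening shown here is an illustration rather than the change the patch made (the patch used `auto`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 32-bit signed off_t ("long" on Win64).
using Offset32 = std::int32_t;

// Comparing a 32-bit signed value with UINT32_MAX converts the signed
// operand to uint32_t, so -1 becomes 0xFFFFFFFF. The explicit cast below
// reproduces that implicit conversion, which -Wsign-compare warns about.
bool equalsMaxAfterConversion(Offset32 off) {
  return static_cast<std::uint32_t>(off) == UINT32_MAX;
}

// Widening to a 64-bit signed type first compares the actual values.
bool exceedsUint32Max(std::int64_t size) {
  return size > static_cast<std::int64_t>(UINT32_MAX);
}
```

With a 64-bit type on both sides, negative offsets and offsets above 4 GiB are both handled without a sign conversion.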
* [X86] Model MXCSR for all SSE instructions (Craig Topper, 2019-10-30, 4 files, -51/+97)
| | | | | | | | | | | | | | | This patch adds MXCSR as a reserved physical register and models its use by X86 SSE instructions. It also adds the flag "mayRaiseFPException" for the instructions that can raise an FP exception according to the architecture definition. Following what SystemZ and other targets do, only the current rounding modes and the IEEE exception masks are modeled. *Changes* of the MXCSR due to exceptions are not modeled. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D68121
* AMDGPU: Disallow spill folding with m0 copies (Matt Arsenault, 2019-10-30, 2 files, -0/+42)
| | | | | | | | | | readlane and writelane instructions are not allowed to use m0 as the data operand, so spilling them is tricky and would require an intermediate SGPR to spill it. Constrain the virtual register class in this case to disallow the inline spiller from folding the m0 operand directly into the spill instruction. I copied this hack from AArch64 which has the same problem for $sp.
* AMDGPU: Don't fold S_NOPs with implicit operands (Matt Arsenault, 2019-10-30, 1 file, -1/+3)
|
* RegAllocFast: Use Register (Matt Arsenault, 2019-10-30, 1 file, -69/+69)
|
* [X86] Rewrite hasReassociableOperands and setSpecialOperandAttr to not hardcode number of operands or position of the EFLAGS operand. (Craig Topper, 2019-10-30, 1 file, -28/+19)
| | | | | | This makes the code immune to the MXCSR addition in D68121.
* [clang][llvm] Obsolete Exynos M1 and M2 (Evandro Menezes, 2019-10-30, 6 files, -896/+5)
|
* [JITLink] Add a utility for splitting blocks at a given index. (Lang Hames, 2019-10-30, 1 file, -0/+79)
| | | | | | | | LinkGraph::splitBlock will split a block at a given index, returning a new block covering the range [ 0, index ) and modifying the original block to cover the range [ index, original-block-size ). Block addresses, content, edges and symbols will be updated as necessary. This utility will be used in upcoming improvements to JITLink's eh-frame support.
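The range bookkeeping described in this entry can be sketched on a toy block type. `ToyBlock` and `splitToyBlock` are hypothetical names and not JITLink's API; the real `LinkGraph::splitBlock` also updates edges and symbols, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical toy block: a start address plus its content bytes.
struct ToyBlock {
  std::uint64_t address;
  std::string content;
};

// Splits `block` at `index`. Returns a new block covering [0, index) and
// shrinks `block` in place so that it covers [index, original size).
ToyBlock splitToyBlock(ToyBlock &block, std::size_t index) {
  assert(index > 0 && index < block.content.size() &&
         "split point must lie strictly inside the block");
  ToyBlock head{block.address, block.content.substr(0, index)};
  block.address += index;         // the original block now starts later
  block.content.erase(0, index);  // and keeps only the tail bytes
  return head;
}
```

Splitting a six-byte block at index 2 yields a two-byte head at the original address and leaves a four-byte tail whose address has advanced by 2.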
* [AArch64] Remove overlapping scheduling definitions (NFC) (Evandro Menezes, 2019-10-30, 1 file, -19/+0)
| | | | | | | | | | The scheduling definitions for ASIMD transpose and zipping overlapped with others a few lines below. Somehow, they didn't raise errors before. There seem to be other overlapping definitions. Somehow, they still don't raise errors. Differential revision: https://reviews.llvm.org/D68353
* [PowerPC][AIX] Adds support for writing the data section in object files (jasonliu, 2019-10-30, 1 file, -1/+8)
| | | | | | | | | | | | | | | | | | | | Adds support for generating the XCOFF data section in object files for global variables with initialization. Merged aix-xcoff-common.ll into aix-xcoff-data.ll. Changed variable name charr to chrarray in the test case to test if readobj works with 8-character names. Authored by: xingxue Reviewers: hubert.reinterpretcast, sfertile, jasonliu, daltenty, Xiangling_L. Reviewed by: hubert.reinterpretcast, sfertile, daltenty. Subscribers: DiggerLin, Wuzish, nemanjai, hiraditya, MaskRay, jsji, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67125
* [DebugInfo][DAG] Distinguish different kinds of location indirection (Jeremy Morse, 2019-10-30, 1 file, -5/+19)
| | | | | | | | | | | | | | | | | | From SelectionDAG's point of view, debug variable locations specified with dbg.declare and dbg.addr are indirect -- they specify the address of something. But calling conventions might mean that a Value is placed on the stack somewhere, and this too is indirection. Previously this was mixed up in the "IsIndirect" field of DBG_VALUE insts; this patch separates them by encoding the indirection in a DIExpression. If we have a dbg.declare or dbg.addr, then the expression produces an address that then becomes a DWARF memory location. We can represent this by putting a DW_OP_deref on the _end_ of the expression. If a Value has been placed on the stack, then we need to put a DW_OP_deref on the _start_ of the expression, to load the Value from the stack and have the rest of the expression operate on it. Differential Revision: https://reviews.llvm.org/D69028
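The two placements of DW_OP_deref can be sketched on a plain vector of opcodes. This is a toy model and not LLVM's DIExpression API; `Expr`, `markAsMemoryLocation` and `loadFromStackFirst` are hypothetical names. The opcode values are the ones assigned by the DWARF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Opcode values as assigned by the DWARF specification.
constexpr std::uint64_t DW_OP_deref = 0x06;
constexpr std::uint64_t DW_OP_plus_uconst = 0x23;

// Hypothetical flat expression: opcodes and their operands in order.
using Expr = std::vector<std::uint64_t>;

// dbg.declare / dbg.addr case: the expression yields an address, so the
// dereference is appended at the end.
Expr markAsMemoryLocation(Expr e) {
  e.push_back(DW_OP_deref);
  return e;
}

// Value placed on the stack: load it first, then let the rest of the
// expression operate on the loaded value.
Expr loadFromStackFirst(Expr e) {
  e.insert(e.begin(), DW_OP_deref);
  return e;
}
```

Starting from the expression `DW_OP_plus_uconst 8`, the first helper produces "add 8, then dereference" and the second produces "dereference, then add 8", which are different locations.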
* [LegacyPassManager] Delete BasicBlockPass/Manager. (Alina Sbirlea, 2019-10-30, 4 files, -307/+2)
| | | | | | | | | | | | | | | | Summary: Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation. The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254. In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager. Reviewers: chandlerc, echristo Subscribers: mehdi_amini, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69121
* [X86] Add FIXME comment to merge more of computeZeroableShuffleElements and getTargetShuffleAndZeroables (Simon Pilgrim, 2019-10-30, 1 file, -0/+1)
|
* [X86][SSE] combineX86ShuffleChain - use resolveZeroablesFromTargetShuffle helper. NFCI. (Simon Pilgrim, 2019-10-30, 1 file, -4/+3)
|
* [SLP] Vectorize jumbled stores. (Alexey Bataev, 2019-10-30, 1 file, -16/+91)
| | | | | | | | | | | | | Summary: Patch adds support for vectorization of the jumbled stores. The value operands are vectorized and then shuffled in the right order before store. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43339
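One way to picture "shuffled in the right order before store" is as a permutation computed from the store offsets. The sketch below is an illustration under that assumption and not the SLP vectorizer's implementation; `shuffleMaskForJumbledStores` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// `offsets[i]` is the element offset written by the i-th scalar store in
// program order. The returned mask has one entry per vector lane: mask[lane]
// is the index of the scalar store whose value must end up in that lane.
std::vector<int> shuffleMaskForJumbledStores(const std::vector<int> &offsets) {
  std::vector<int> mask(offsets.size());
  std::iota(mask.begin(), mask.end(), 0);  // 0, 1, 2, ...
  std::sort(mask.begin(), mask.end(),
            [&](int a, int b) { return offsets[a] < offsets[b]; });
  return mask;
}
```

For stores that write offsets 2, 0, 3, 1 in program order, the mask is 1, 3, 0, 2: lane 0 takes the value of the second store, lane 1 the value of the fourth, and so on.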
* [NFC] Move this set of STRICT_* cases to be next to the non-strict cases. (Kevin P. Neal, 2019-10-30, 1 file, -10/+10)
| | | | Requested by Cameron McInally in D69275.
* [AMDGPU] Simplify VCCZ bug handling (Jay Foad, 2019-10-30, 1 file, -5/+1)
| | | | | | | | | | | | | | | | | | | | Summary: VCCZBugHandledSet was used to make sure we don't apply the same workaround more than once to a single cbranch instruction, but it's not necessary because the workaround involves inserting an s_waitcnt instruction, which is enough for subsequent iterations to detect that no further workaround is necessary. Also beef up the test case to check that the workaround was only applied once. I have also manually verified that the test still passes even if I hack the big do-while loop in runOnMachineFunction to run a minimum of five iterations. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69621
* [NFC][MachineOutliner] Fix typo in comment (David Tellenbach, 2019-10-30, 1 file, -1/+1)
|
* Fix pattern error for S2_tstbit_i instruction (Ikhlas Ajbar, 2019-10-30, 1 file, -2/+2)
| | | | | It used to generate S2_tstbit_i with constant -33, which resulted in an assert. The reason is that log2_32 was called with the 64-bit value 0.
* [SLPVectorizer] Use getAPInt() for comparison. NFCI. (Simon Pilgrim, 2019-10-30, 1 file, -1/+1)
| | | | Technically integers can assert on getZExtValue() if beyond i64 range, and a fuzzer usually finds this.
* [AIX] Lowering CPI/JTI/BA to MIR (Xiangling Liao, 2019-10-30, 1 file, -6/+6)
| | | | | | Enable lowering of constant pool index, jump table index, and block address to MIR on AIX. Differential Revision: https://reviews.llvm.org/D69264
* [AArch64][MachineOutliner] Return address signing for outlined functions (David Tellenbach, 2019-10-30, 1 file, -7/+241)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: During AArch64 frame lowering, instructions to enable return address signing are inserted into functions if needed. Functions generated during machine outlining don't run through target frame lowering and hence are missing such instructions. This patch introduces the following changes: 1. If not all functions that potentially participate in function outlining agree on their return address signing scope and their return address signing key, outlining is disabled for these functions. 2. If not all functions that potentially participate in function outlining agree on their support for v8.3A features, outlining is disabled for these functions. 3. If all candidate functions agree on the signing scope, signing key and their support for v8.3A features, the outlined function behaves as if it had the same scope and key attributes and as if it would provide the same v8.3A support as the original functions. Reviewers: olista01, paquette, t.p.northover, ostannard Reviewed By: ostannard Subscribers: ostannard, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69097
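The three rules in this entry reduce to an "all candidates agree" check, sketched below on a toy record. `SigningInfo` and `canOutline` are hypothetical names, the string values in the comments are only examples, and the real MachineOutliner works on machine functions and their attributes rather than on such a struct.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of the per-function properties named in the
// commit message.
struct SigningInfo {
  std::string scope;  // return address signing scope, e.g. "none" or "all"
  std::string key;    // return address signing key, e.g. "a_key"
  bool hasV8_3a;      // whether v8.3A features are supported

  bool operator==(const SigningInfo &other) const {
    return scope == other.scope && key == other.key &&
           hasV8_3a == other.hasV8_3a;
  }
};

// Outlining is allowed only when every candidate agrees with the first
// one; the outlined function would then inherit that shared setting.
bool canOutline(const std::vector<SigningInfo> &candidates) {
  for (const SigningInfo &c : candidates)
    if (!(c == candidates.front()))
      return false;
  return true;
}
```

A single disagreeing candidate, whether on scope, key or v8.3A support, is enough to disable outlining for the whole group.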
* [SelectionDAG] Add support for FP_ROUND in WidenVectorOperand. (Jay Foad, 2019-10-30, 1 file, -4/+14)
| | | | | | | | | | | | Summary: This is used on AMDGPU for rounding from v3f64 (which is illegal) to v3f32 (which is legal). Subscribers: jvesely, nhaehnle, tpr, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69339
* [IR] Allow fast math flags on calls with floating point array type. (Jay Foad, 2019-10-30, 2 files, -10/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This extends the rules for when a call instruction is deemed to be an FPMathOperator, which is based on the type of the call (i.e. the return type of the function being called). Previously we only allowed floating-point and vector-of-floating-point types. Now we also allow arrays (nested to any depth) of floating-point and vector-of-floating-point types. This was motivated by llpc, the pipeline compiler for AMD GPUs (https://github.com/GPUOpen-Drivers/llpc). llpc has many math library functions that operate on vectors, typically represented as <4 x float>, and some that operate on matrices, typically represented as [4 x <4 x float>], and it's useful to be able to decorate calls to all of them with fast math flags. Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy Subscribers: wdng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69161
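The extended rule can be sketched as a small type walk. `ToyType` and `isFPOrFPVectorOrArrayThereof` are hypothetical names on a toy type model, not LLVM's `Type` hierarchy or the actual FPMathOperator check.

```cpp
#include <cassert>
#include <memory>

// Hypothetical toy type model; LLVM's own Type hierarchy is richer.
struct ToyType {
  enum Kind { Integer, Float, Vector, Array } kind;
  std::shared_ptr<ToyType> element;  // used by Vector and Array only
};

// Mirrors the rule in the commit message: strip any number of array
// layers, then at most one vector layer, and require a floating-point
// scalar at the bottom.
bool isFPOrFPVectorOrArrayThereof(const ToyType &type) {
  const ToyType *cur = &type;
  while (cur->kind == ToyType::Array)
    cur = cur->element.get();
  if (cur->kind == ToyType::Vector)
    cur = cur->element.get();
  return cur->kind == ToyType::Float;
}
```

Under this model a matrix type such as `[4 x <4 x float>]` qualifies, while an array of integers does not.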
* LiveIntervals: Split live intervals on multiple dead defs (Krzysztof Parzyszek, 2019-10-30, 1 file, -3/+14)
| | | | | | | | | | This is a follow-up to D67448. Split live intervals with multiple dead defs during the initial execution of the live interval analysis, but do it outside of the function createAndComputeVirtRegInterval. Differential Revision: https://reviews.llvm.org/D68666
* minidump: Rename some architecture constants (Pavel Labath, 2019-10-30, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | The architecture enum contains two kinds of constants: the "official" ones defined by Microsoft, and unofficial constants added by breakpad to cover the architectures not described by the first ones. Up until now, there was no big need to differentiate between the two. However, now that Microsoft has defined https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/ns-sysinfoapi-system_info a constant for ARM64, we have a name clash. This patch renames all breakpad-defined constants to include the prefix "BP_". This frees up the name "ARM64", which I'll re-introduce with the new "official" value in a follow-up patch. Reviewers: amccarth, clayborg Subscribers: lldb-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D69285
* [ARM][AArch64][DebugInfo] Improve call site instruction interpretation (Djordje Todorovic, 2019-10-30, 6 files, -9/+105)
| | | | | | | | | | | | | Extend the describeLoadedValue() with support for target specific ARM and AArch64 instructions interpretation. The patch provides specialization for ADD and SUB operations that include a register and an immediate/offset operand. Some of the instructions can operate with global string addresses or constant pool indexes but such cases are omitted since we currently lack flexible support for processing such operands at DWARF production stage. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67556
* [AArch64][SVE] Implement masked store intrinsics (Kerry McLaughlin, 2019-10-30, 3 files, -1/+67)
| | | | | | | | | | | | | | | | Summary: Adds support for codegen of masked stores, with non-truncating and truncating variants. Reviewers: huntergr, greened, dmgreen, rovka, sdesmalen Reviewed By: dmgreen, sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69378
* [X86] combineOrShiftToFunnelShift - use isOperationLegalOrCustom to check FSHL/FSHR support (Simon Pilgrim, 2019-10-30, 1 file, -1/+2)
| | | | | | Remove hard wired legality check.
* [X86] combineOrShiftToFunnelShift - use getShiftAmountTy instead of hardwiring to MVT::i8 (Simon Pilgrim, 2019-10-30, 1 file, -5/+8)
|
* [AArch64][SVE] Implement additional integer arithmetic intrinsics (Kerry McLaughlin, 2019-10-30, 2 files, -16/+28)
| | | | | | | | | | | | | | | | | | Summary: Add intrinsics for the following: - sxt[b|h|w] & uxt[b|h|w] - cls & clz - not & cnot Reviewers: huntergr, sdesmalen, dancgr Reviewed By: sdesmalen Subscribers: cameron.mcinally, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69567
* [AMDGPU] Consolidate one more getGeneration check (Jay Foad, 2019-10-30, 1 file, -1/+1)
| | | | | This one should have been done in r363902 when hasReadVCCZBug was introduced.
* [Alignment] Use Align for TFI.getStackAlignment() in X86ISelLowering (Guillaume Chatelet, 2019-10-30, 1 file, -26/+18)
| | | | | | | | | | | | | | | | | Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, craig.topper, rnk Reviewed By: rnk Subscribers: rnk, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69034
* [AddressSanitizer] Only instrument globals of default address space (Karl-Johan Karlsson, 2019-10-30, 1 file, -0/+2)
| | | | | | | | | | | | | The address sanitizer ignores memory accesses from different address spaces; however, when instrumenting globals the check for different address spaces is missing. This results in an assertion failure. The fault was found in an out-of-tree target. The patch skips all globals of non-default address space. Reviewed By: leonardchan, vitalybuka Differential Revision: https://reviews.llvm.org/D68790
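The skip rule in this entry amounts to a filter on the address space, sketched here on a toy record. `ToyGlobal` and `selectGlobalsToInstrument` are hypothetical names; the only assumption carried over from LLVM is that address space 0 is the default one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy stand-in for a global variable.
struct ToyGlobal {
  std::string name;
  unsigned addressSpace;  // 0 is the default address space
};

// Keeps only globals in the default address space. Everything else is
// left uninstrumented, matching the policy the commit message describes.
std::vector<ToyGlobal>
selectGlobalsToInstrument(const std::vector<ToyGlobal> &globals) {
  std::vector<ToyGlobal> selected;
  for (const ToyGlobal &g : globals)
    if (g.addressSpace == 0)
      selected.push_back(g);
  return selected;
}
```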
* [PowerPC] Clear the sideeffect bit for those instructions that didn't have the match pattern (QingShan Zhang, 2019-10-30, 3 files, -6/+8)
| | | | | | | | | | If an instruction has a match pattern, llvm-tblgen infers the sideeffect bit from the pattern, which works well. If not, tblgen sets the bit to true, which hurts scheduling. PowerPC has some instructions that don't specify a match pattern (e.g. LXSD), which are manually selected post-RA according to register pressure. We need to clear the sideeffect flag for these instructions. Differential Revision: https://reviews.llvm.org/D69232
* [X86] Make memcmp vector lowering handle arbitrary expansions (David Zarzycki, 2019-10-30, 2 files, -25/+45)
| | | | | | | | | | Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp expansions but do not change any default policy for now. This also fixes a bug in the memcmp expansion itself when large displacements are needed. https://reviews.llvm.org/D69507
* Break out OrcError and RPC (Chris Bieneman, 2019-10-29, 8 files, -8/+34)
| | | | | | | | | | | | | | | | | | | Summary: When creating an ORC remote JIT target, the current library split forces the target process to link large portions of LLVM (Core, Execution Engine, JITLink, Object, MC, Passes, RuntimeDyld, Support, Target, and TransformUtils). This occurs because the ORC RPC interfaces rely on the static globals the ORC Error types require, which starts a cycle of pulling in more and more. This patch breaks the ORC RPC Error implementations out into an "OrcError" library which only depends on LLVM Support. It also pulls the ORC RPC headers into their own subdirectory. With this patch code can include the Orc/RPC/*.h headers and will only incur link dependencies on LLVMOrcError and LLVMSupport. Reviewers: lhames Reviewed By: lhames Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68732
* AMDGPU/GlobalISel: Legalize FDIV32 (Austin Kerbow, 2019-10-29, 2 files, -0/+101)
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69581
* [LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap (Nick Terrell, 2019-10-29, 1 file, -1/+4)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a flag `F_no_mmap` to `FileOutputBuffer` to support `--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores this flag for compatibility with GNU ld and gold. We need this flag to speed up link time for large binaries in certain scenarios. When we link some of our larger binaries we find that LLD takes 50+ GB of memory, which causes memory pressure. The memory pressure causes the VM to flush dirty pages of the output file to disk. This is normally okay, since we should be flushing cold pages. However, when using BtrFS with compression we need to write 128KB at a time when we flush a page. If any page in that 128KB block is written again, then it must be flushed a second time, and so on. Since LLD doesn't write sequentially this causes write amplification. The same 128KB block will end up being flushed multiple times, causing the linker to many times more IO than necessary. We've observed 3-5x faster builds with -no-mmap-output-file when we hit this scenario. The bad scenario only applies to compressed filesystems, which group together multiple pages into a single compressed block. I've tested BtrFS, but the problem will be present for any compressed filesystem on Linux, since it is caused by the VM. Silently ignoring --no-mmap-output-file caused a silent regression when we switched from gold to lld. We pass --no-mmap-output-file to fix this edge case, but since lld silently ignored the flag we didn't realize it wasn't being respected. Benchmark building a 9 GB binary that exposes this edge case. I linked 3 times with --mmap-output-file and 3 times with --no-mmap-output-file and took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM, BtrFS mounted with -compress-force=zstd, and an 80% full disk. 
| Mode    | Time  |
|---------|-------|
| mmap    | 894 s |
| no mmap | 126 s |

When compression is disabled, BtrFS performs just as well with and without mmap on this benchmark. I was unable to reproduce the regression with any binaries in lld-speed-test. Reviewed By: ruiu, MaskRay Differential Revision: https://reviews.llvm.org/D69294
* [DWARF5] Added support for deleted C++ special member functions. (Adrian Prantl, 2019-10-29, 2 files, -0/+18)
| | | | | | | | | | This patch adds support for deleted C++ special member functions in clang and llvm. Also added Defaulted member encodings for future support for defaulted member functions. Patch by Sourabh Singh Tomar! Differential Revision: https://reviews.llvm.org/D69215
* [SelectionDAG] Enable lowering unordered atomics loads w/LoadSDNode (and stores w/StoreSDNode) by default (Philip Reames, 2019-10-29, 3 files, -1/+28)
| | | | | | | | | | Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode w/an atomic marker on the MMO. The latter parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other. This has been fairly heavily fuzzed, and I examined diffs in a reasonably large corpus of assembly by hand, so I'm reasonably sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization. Differential Revision: https://reviews.llvm.org/D69219
* [X86] Narrow i64 compares with constant to i32 when the upper 32-bits are known zero. (Craig Topper, 2019-10-29, 1 file, -5/+17)
| | | | | | | | | | | | | | | | This catches some cases. There are probably ways to improve this. I tried doing it as a combine on the setcc, but that broke some cases involving flag reuse in place of test. I renamed the isX86CCUnsigned to isX86CCSigned and flipped its polarity to make it consistent with the similar functions for ISD::SETCC. This avoids calling EQ/NE as being signed or unsigned. Fixes PR43823. Differential Revision: https://reviews.llvm.org/D69499
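A sketch of why the narrowing in this entry is sound for unsigned compares and needs care for signed ones. `ult64`, `ult32` and `slt32` are hypothetical helper names; how the patch itself treats signed predicates is not stated in the message beyond the rename, so the signed case here is only an illustration of the pitfall.

```cpp
#include <cassert>
#include <cstdint>

// Reference 64-bit unsigned compare.
bool ult64(std::uint64_t x, std::uint64_t c) { return x < c; }

// Narrowed unsigned compare. Valid only when the upper 32 bits of both
// operands are zero, which the precondition checks.
bool ult32(std::uint64_t x, std::uint64_t c) {
  assert((x >> 32) == 0 && (c >> 32) == 0 && "upper halves must be zero");
  return static_cast<std::uint32_t>(x) < static_cast<std::uint32_t>(c);
}

// Narrowed signed compare, shown only to illustrate the pitfall: bit 31
// becomes the sign bit after narrowing.
bool slt32(std::uint64_t x, std::uint64_t c) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(x)) <
         static_cast<std::int32_t>(static_cast<std::uint32_t>(c));
}
```

For x = 0x80000000 and c = 1 the unsigned results agree at both widths, but the signed 32-bit compare treats x as negative while the signed 64-bit compare does not, so a signed predicate cannot be narrowed on the strength of "upper 32 bits are zero" alone.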
* [SVE][AArch64] Adding pattern matching for some SVE instructions. (Ehsan Amiri, 2019-10-29, 1 file, -4/+4)
| | | | | | | | | Adding pattern matching for two SVE intrinsics: frecps and frsqrts. Also added patterns for fsub and fmul - these SDNodes directly correspond to machine instructions. Review: https://reviews.llvm.org/D68476 Patch authored by mgudim (Mikhail Gudim).
* [SLP] Fix -Wunused-variable. NFC (Fangrui Song, 2019-10-29, 1 file, -2/+1)
|
* Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) (Sander de Smalen, 2019-10-29, 7 files, -8/+81)
| | | | | | | | | | | | | | | | llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with EXPENSIVE_CHECKS enabled, causing the patch to be reverted in rG2c496bb5309c972d59b11f05aee4782ddc087e71. This patch relands the patch with a proper fix to the live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the callee-saves correctly so that the CalleeSaved info is taken from MIR directly, rather than letting it be recalculated by the PEI pass. I've done this by running `llc -stop-before=prologepilog` on the LLVM IR as captured in the test files, adding the extra MOV instructions that were manually added in the original test file, then running `llc -run-pass=prologepilog` and finally re-added the comments for the MOV instructions.
* [SLP] Generalization of stores vectorization. (Alexey Bataev, 2019-10-29, 1 file, -76/+72)
| | | | | | | | | Stores are vectorized with maximum vectorization factor of 16. Patch tries to improve the situation and use maximal vectorization factor. Reviewers: spatel, RKSimon, mkuper, hfinkel Differential Revision: https://reviews.llvm.org/D43582
* [X86] Pull out combineOrShiftToFunnelShift helper. NFCI. (Simon Pilgrim, 2019-10-29, 1 file, -51/+64)
|